Page MenuHomeFreeBSD

witness(4): Print stack of prior observed lock order on reversal
ClosedPublic

Authored by cem on Aug 14 2020, 9:24 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Jan 20, 5:29 AM
Unknown Object (File)
Wed, Jan 15, 8:47 AM
Unknown Object (File)
Dec 9 2024, 6:36 PM
Unknown Object (File)
Nov 26 2024, 11:48 PM
Unknown Object (File)
Nov 10 2024, 1:26 PM
Unknown Object (File)
Nov 7 2024, 5:19 AM
Unknown Object (File)
Nov 5 2024, 9:29 AM
Unknown Object (File)
Oct 16 2024, 11:51 PM
Subscribers

Details

Summary

The first time Witness observes a lock order between two locks, it records
the caller's stack. On detected reversal, print out that previous observed
stack. It is quite possible that the reported "LOR" is the correct
ordering, and the violation was the observed earlier ordering.

The debug.witness.badstacks sysctl already does something similar, but it would
be nice if the normal report included it.

Test Plan

As discussed on IRC earlier today.

$ reboot
...
lock order reversal:
 1st 0xfffff80003999830 ufs (ufs, lockmgr) @ /usr/home/conrad/src/freebsd/sys/kern/vfs_mount.c:1696
 2nd 0xfffff8000399c640 devfs (devfs, lockmgr) @ /usr/home/conrad/src/freebsd/sys/kern/vfs_subr.c:2892
lock order devfs -> ufs established at:
#0 0xffffffff80c2cbc0 at witness_checkorder+0x460
#1 0xffffffff80b925b6 at lockmgr_lock_flags+0x186
#2 0xffffffff80e6e681 at ffs_lock+0x71
#3 0xffffffff80cb3174 at _vn_lock+0x54
#4 0xffffffff80c934f4 at vfs_domount+0xdc4
#5 0xffffffff80c91cb5 at vfs_donmount+0x855
#6 0xffffffff80c95c27 at kernel_mount+0x57
#7 0xffffffff80c98231 at parse_mount+0x491
#8 0xffffffff80c96775 at vfs_mountroot+0x575
#9 0xffffffff80b54558 at start_init+0x28
#10 0xffffffff80b7db20 at fork_exit+0x80
#11 0xffffffff80fc868e at fork_trampoline+0xe
lock order ufs -> devfs attempted at:
#0 0xffffffff80c2d4f9 at witness_checkorder+0xd99
#1 0xffffffff80b93ff5 at lockmgr_xlock+0x55
#2 0xffffffff80cb3174 at _vn_lock+0x54
#3 0xffffffff80c9c46f at vget_finish+0x4f
#4 0xffffffff80a6842d at devfs_allocv+0xbd
#5 0xffffffff80a67bf3 at devfs_root+0x43
#6 0xffffffff80ca1918 at vfs_cache_root_fallback+0x118
#7 0xffffffff80c9dae8 at vflush+0x48
#8 0xffffffff80a67afe at devfs_unmount+0x3e
#9 0xffffffff80c945cd at dounmount+0x43d
#10 0xffffffff80c9f241 at vfs_unmountall+0xc1
#11 0xffffffff80c75f6a at bufshutdown+0x2ba
#12 0xffffffff80bc0c22 at kern_reboot+0x212
#13 0xffffffff80bc09b9 at sys_reboot+0x399
#14 0xffffffff80ff1c70 at amd64_syscall+0x140
#15 0xffffffff80fc7f5e at fast_syscall_common+0xf8

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 33016
Build 30404: arc lint + arc unit

Event Timeline

cem requested review of this revision.Aug 14 2020, 9:24 PM
cem created this revision.

I think the format it a little off, is there a problem to even it out for both reports? e.g, in the lines of

lock order reversal detected (ufs (ufs) vs devfs (devfs))
ufs -> devfs order established at:
....
[backtrace]
devfs -> ufs attempted at:
...
[backtrace]

sys/kern/subr_witness.c
1376

flipping the flag at the wrong time can cause pstackv to be unititalized. best would be to atomic_load int the flag before any of this

In D26070#578823, @mjg wrote:

I think the format it a little off, is there a problem to even it out for both reports?

It is possible, sure. Currently the second stack part is printed by witness_debugger(), which is shared with other callers that don't have a lock->lock relationship (e.g., duplicate lock without DUPOK, and sleep with non-sleepable lock held). I think to do this I'd extract the witness_kdb portion of witness_debugger out to a witness_enter_kdb() and invoke it directly from this location after printing the 2nd stack.

lock order reversal detected (ufs (ufs) vs devfs (devfs))
ufs -> devfs order established at:
....
[backtrace]
devfs -> ufs attempted at:

I like the "established at" / "attempted at" terminology better than what I have now. Thanks!

sys/kern/subr_witness.c
1376

Sure, that's a good suggestion. I'm not sure we need atomic_load, though. I thought FreeBSD atomic model widely assumed word-sized loads don't tear, and non-atomic access to the mostly-RO sysctl values are extremely common in subr_witness.

cem marked an inline comment as done.
  • Improve formatting
  • Prevent witness_trace TOCTOU race

I think the first line needs to be capitalized, or all lines need to be lower case, other than that lgtm

sys/kern/subr_witness.c
1376

the point is to prevent the compiler from reloading the var. it may be there are other factors here which de facto accomplish this, but there is no need to rely on them.

cem marked an inline comment as done.
  • lower case Lock order for consistency; witness output is entirely lowercase. we could revisit that, but outside the scope of this change, as it should change all the other strings to match.
  • use atomic_load
This revision is now accepted and ready to land.Aug 15 2020, 7:38 PM