Page MenuHomeFreeBSD

witness: mark relationships defined in order_lists[]
ClosedPublic

Authored by jtl on Tue, Jan 27, 2:21 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Jan 31, 4:51 AM
Unknown Object (File)
Thu, Jan 29, 8:56 AM
Unknown Object (File)
Wed, Jan 28, 7:28 AM
Unknown Object (File)
Wed, Jan 28, 1:44 AM
Unknown Object (File)
Wed, Jan 28, 1:11 AM
Unknown Object (File)
Wed, Jan 28, 12:36 AM
Unknown Object (File)
Tue, Jan 27, 11:21 PM
Unknown Object (File)
Tue, Jan 27, 4:52 PM
Subscribers

Details

Summary
Mark the lock relationships defined in order_lists[] and use that
information to print better messages when a LOR violates one of these
relationships.

Suggested by:   markj
Reviewed by:    <If someone else reviewed your modification.>
MFC after:      1 month
Sponsored by:   Netflix
Differential Revision:  <https://reviews.freebsd.org/D###>
Test Plan

I generated a synthetic LOR that hits a hard-coded relationship. The code now prints this:

lock order reversal:
 1st 0xffff0000c27da5b0 lle (lle, rw) @ /home/jtl/src/sys/kern/subr_witness.c:3520
 2nd 0xffff0000c27da590 so_rcv (so_rcv, sleep mutex) @ /home/jtl/src/sys/kern/subr_witness.c:3521
All lock orders from so_rcv -> lle:
"so_rcv" -> "tcphash" -> "in6_ifaddr_lock" -> "lle"
"so_rcv" -> "udphash" -> "in6_ifaddr_lock" -> "lle"
"so_rcv" -> "udphash" -> "lle"

lock order "so_rcv"(sleep mutex) -> "tcphash"(sleep mutex) first seen at:
#0 0xffff00000054ce50 at witness_checkorder+0x32c
#1 0xffff0000004ad144 at __mtx_lock_flags+0x9c
#2 0xffff0000006b6d24 at tcp6_usr_listen+0x188
#3 0xffff00000058cf34 at solisten+0x48
#4 0xffff0000005956fc at kern_listen+0x74
#5 0xffff0000008be31c at do_el0_sync+0x68c
#6 0xffff0000008911ac at handle_el0_sync+0x4c

hardcoded lock order "tcphash"(sleep mutex) -> "in6_ifaddr_lock"(rm) first seen at:
(No stack trace)

lock order "so_rcv"(sleep mutex) -> "udphash"(sleep mutex) first seen at:
#0 0xffff00000054ce50 at witness_checkorder+0x32c
#1 0xffff0000004ad144 at __mtx_lock_flags+0x9c
#2 0xffff000000551858 at sysctl_debug_witness_generate_lor+0xc4
#3 0xffff0000004e7158 at sysctl_root_handler_locked+0xe4
#4 0xffff0000004e64a4 at sysctl_root+0x1e4
#5 0xffff0000004e6b3c at userland_sysctl+0x140
#6 0xffff0000004e69c0 at sys___sysctl+0x84
#7 0xffff0000008be31c at do_el0_sync+0x68c
#8 0xffff0000008911ac at handle_el0_sync+0x4c

lock order "udphash"(sleep mutex) -> "in6_ifaddr_lock"(rm) first seen at:
#0 0xffff00000054ce50 at witness_checkorder+0x32c
#1 0xffff0000004ce06c at _rm_wlock_debug+0x78
#2 0xffff0000005518f4 at sysctl_debug_witness_generate_lor+0x160
#3 0xffff0000004e7158 at sysctl_root_handler_locked+0xe4
#4 0xffff0000004e64a4 at sysctl_root+0x1e4
#5 0xffff0000004e6b3c at userland_sysctl+0x140
#6 0xffff0000004e69c0 at sys___sysctl+0x84
#7 0xffff0000008be31c at do_el0_sync+0x68c
#8 0xffff0000008911ac at handle_el0_sync+0x4c

lock order "in6_ifaddr_lock"(rm) -> "lle"(rw) first seen at:
#0 0xffff00000054ce50 at witness_checkorder+0x32c
#1 0xffff0000004d0980 at __rw_rlock_int+0x8c
#2 0xffff0000006cb804 at in6_lltable_lookup+0xe0
#3 0xffff0000006ead44 at nd6_lookup+0x74
#4 0xffff0000006f569c at find_pfxlist_reachable_router+0x88
#5 0xffff0000006f30fc at pfxlist_onlink_check+0x4b4
#6 0xffff0000006f29dc at nd6_ra_input+0x1198
#7 0xffff0000006c30c8 at icmp6_input+0x8c8
#8 0xffff0000006dd660 at ip6_input+0xec0
#9 0xffff00000063c104 at netisr_dispatch_src+0xd8
#10 0xffff00000061d44c at ether_demux+0x174
#11 0xffff00000061e9a0 at ether_nh_input+0x374
#12 0xffff00000063c104 at netisr_dispatch_src+0xd8
#13 0xffff00000061d894 at ether_input+0xdc
#14 0xffff0000006981e8 at tcp_lro_flush_all+0xec
#15 0xffff000000637700 at iflib_rxeof+0xbc4
#16 0xffff000000631990 at _task_fn_rx+0xa4
#17 0xffff0000005237f4 at gtaskqueue_run_locked+0x164

lock order "udphash"(sleep mutex) -> "lle"(rw) first seen at:
#0 0xffff00000054ce50 at witness_checkorder+0x32c
#1 0xffff0000004d0980 at __rw_rlock_int+0x8c
#2 0xffff0000005518a4 at sysctl_debug_witness_generate_lor+0x110
#3 0xffff0000004e7158 at sysctl_root_handler_locked+0xe4
#4 0xffff0000004e64a4 at sysctl_root+0x1e4
#5 0xffff0000004e6b3c at userland_sysctl+0x140
#6 0xffff0000004e69c0 at sys___sysctl+0x84
#7 0xffff0000008be31c at do_el0_sync+0x68c
#8 0xffff0000008911ac at handle_el0_sync+0x4c

lock order lle -> so_rcv attempted at:
#0 0xffff00000054d694 at witness_checkorder+0xb70
#1 0xffff0000004ad144 at __mtx_lock_flags+0x9c
#2 0xffff00000055193c at sysctl_debug_witness_generate_lor+0x1a8
#3 0xffff0000004e7158 at sysctl_root_handler_locked+0xe4
#4 0xffff0000004e64a4 at sysctl_root+0x1e4
#5 0xffff0000004e6b3c at userland_sysctl+0x140
#6 0xffff0000004e69c0 at sys___sysctl+0x84
#7 0xffff0000008be31c at do_el0_sync+0x68c
#8 0xffff0000008911ac at handle_el0_sync+0x4c

and this:

lock order reversal:
 1st 0xffff0000c29155b0 lle (lle, rw) @ /home/jtl/src/sys/kern/subr_witness.c:3520
 2nd 0xffff0000c2915590 so_rcv (so_rcv, sleep mutex) @ /home/jtl/src/sys/kern/subr_witness.c:3521
All lock orders from so_rcv -> lle:
"so_rcv" -> "tcphash" -> "lle"
"so_rcv" -> "tcphash" -> "in6_ifaddr_lock" -> "lle"

lock order "so_rcv"(sleep mutex) -> "tcphash"(sleep mutex) first seen at:
#0 0xffff00000054ce50 at witness_checkorder+0x32c
#1 0xffff0000004ad144 at __mtx_lock_flags+0x9c
#2 0xffff0000006b6d24 at tcp6_usr_listen+0x188
#3 0xffff00000058cf34 at solisten+0x48
#4 0xffff0000005956fc at kern_listen+0x74
#5 0xffff0000008be31c at do_el0_sync+0x68c
#6 0xffff0000008911ac at handle_el0_sync+0x4c

lock order "tcphash"(sleep mutex) -> "lle"(rw) first seen at:
#0 0xffff00000054ce50 at witness_checkorder+0x32c
#1 0xffff0000004d0980 at __rw_rlock_int+0x8c
#2 0xffff0000005518a4 at sysctl_debug_witness_generate_lor+0x110
#3 0xffff0000004e7158 at sysctl_root_handler_locked+0xe4
#4 0xffff0000004e64a4 at sysctl_root+0x1e4
#5 0xffff0000004e6b3c at userland_sysctl+0x140
#6 0xffff0000004e69c0 at sys___sysctl+0x84
#7 0xffff0000008be31c at do_el0_sync+0x68c
#8 0xffff0000008911ac at handle_el0_sync+0x4c

hardcoded lock order "tcphash"(sleep mutex) -> "in6_ifaddr_lock"(rm) first seen at:
#0 0xffff00000054ce50 at witness_checkorder+0x32c
#1 0xffff0000004ce06c at _rm_wlock_debug+0x78
#2 0xffff0000005518f4 at sysctl_debug_witness_generate_lor+0x160
#3 0xffff0000004e7158 at sysctl_root_handler_locked+0xe4
#4 0xffff0000004e64a4 at sysctl_root+0x1e4
#5 0xffff0000004e6b3c at userland_sysctl+0x140
#6 0xffff0000004e69c0 at sys___sysctl+0x84
#7 0xffff0000008be31c at do_el0_sync+0x68c
#8 0xffff0000008911ac at handle_el0_sync+0x4c

lock order "in6_ifaddr_lock"(rm) -> "lle"(rw) first seen at:
#0 0xffff00000054ce50 at witness_checkorder+0x32c
#1 0xffff0000004d0980 at __rw_rlock_int+0x8c
#2 0xffff0000006cb804 at in6_lltable_lookup+0xe0
#3 0xffff0000006ead44 at nd6_lookup+0x74
#4 0xffff0000006f569c at find_pfxlist_reachable_router+0x88
#5 0xffff0000006f30fc at pfxlist_onlink_check+0x4b4
#6 0xffff0000006f29dc at nd6_ra_input+0x1198
#7 0xffff0000006c30c8 at icmp6_input+0x8c8
#8 0xffff0000006dd660 at ip6_input+0xec0
#9 0xffff00000063c104 at netisr_dispatch_src+0xd8
#10 0xffff00000061d44c at ether_demux+0x174
#11 0xffff00000061e9a0 at ether_nh_input+0x374
#12 0xffff00000063c104 at netisr_dispatch_src+0xd8
#13 0xffff00000061d894 at ether_input+0xdc
#14 0xffff0000006981e8 at tcp_lro_flush_all+0xec
#15 0xffff000000637700 at iflib_rxeof+0xbc4
#16 0xffff000000631990 at _task_fn_rx+0xa4
#17 0xffff0000005237f4 at gtaskqueue_run_locked+0x164

lock order lle -> so_rcv attempted at:
#0 0xffff00000054d694 at witness_checkorder+0xb70
#1 0xffff0000004ad144 at __mtx_lock_flags+0x9c
#2 0xffff00000055193c at sysctl_debug_witness_generate_lor+0x1a8
#3 0xffff0000004e7158 at sysctl_root_handler_locked+0xe4
#4 0xffff0000004e64a4 at sysctl_root+0x1e4
#5 0xffff0000004e6b3c at userland_sysctl+0x140
#6 0xffff0000004e69c0 at sys___sysctl+0x84
#7 0xffff0000008be31c at do_el0_sync+0x68c
#8 0xffff0000008911ac at handle_el0_sync+0x4c

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

jtl requested review of this revision.Tue, Jan 27, 2:21 AM
sys/kern/subr_witness.c
1314

Should the explicit ordering take precedence over the recorded order?
Or we do not record stack if the pair is listed?

If we do record stack for listed pairs, it might make sense to print both, i.e. the backtrace and the note.

sys/kern/subr_witness.c
1314

We do record both. In the patch you reviewed, the stack trace always takes precedence. It is trivial to add both sets of information. Let me do that quickly.

jtl edited the test plan for this revision. (Show Details)

Updated to display both whether a lock was dynamically discovered or hardcoded, and also the stack trace of the first recorded usage of that lock ordering.

markj added inline comments.
sys/kern/subr_witness.c
1308

It feels a bit weird to refer to non-hardcoded lock orders as "dynamic," I'd just print "hardcoded" or nothing at all.

1315

This can be one line now.

This revision is now accepted and ready to land.Tue, Jan 27, 2:09 PM
jtl edited the test plan for this revision. (Show Details)

Addressed review feedback from @markj.

This revision now requires review to proceed.Tue, Jan 27, 5:02 PM
jtl marked 3 inline comments as done.Tue, Jan 27, 5:04 PM
jtl added inline comments.
sys/kern/subr_witness.c
1308

I thought hardcoded felt odd, as well. :-) But, I needed to put some words in here. I made the change you suggested. In order to make sure the word "lock" wasn't obfuscated in the code or in log messages, I changed the messages to start with a lower-case letter. I can, of course, change that.

markj added inline comments.
sys/kern/subr_witness.c
1308

I think this is fine, thanks for humouring me.

This revision is now accepted and ready to land.Tue, Jan 27, 5:07 PM
This revision was automatically updated to reflect the committed changes.
jtl marked an inline comment as done.