pie_callout_cleanup() and fqpie_callout_cleanup() both panic
while grabbing locks that were previously added there out of
caution, even though these functions never access V_dn_cfg,
which is where the lock lives. The refcounts inside these modules are static.
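For context, a rough sketch of what the change amounts to (not the literal diff; the function body follows the in-tree dn_aqm_pie.c shape from memory, so treat the field names as approximate):

static void
pie_callout_cleanup(void *x)
{
	struct pie_status *pst = (struct pie_status *) x;

	/* DN_BH_WLOCK();  -- dropped: it dereferences V_dn_cfg, which we never touch here */
	mtx_lock(&pst->lock_mtx);
	callout_deactivate(&pst->aqm_pie_callout);
	mtx_unlock(&pst->lock_mtx);
	mtx_destroy(&pst->lock_mtx);
	pie_desc.ref_count--;	/* module-local (static) counter */
	/* DN_BH_WUNLOCK(); -- dropped */
}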
Details
- Reviewers: kp
Diff Detail
- Repository: rS FreeBSD src repository - subversion
- Lint: Lint Passed
- Unit: No Test Coverage
- Build Status: Buildable 43335, Build 40223: arc lint + arc unit
Event Timeline
The issue was first reported here: https://forum.opnsense.org/index.php?topic=25541.msg122856#msg122856
I could reproduce it on a hardware device with the OPNsense 22.1-BETA kernels, which are based on a recent FreeBSD stable/13 state.
The panic happens quite quickly after configuring the "pie" option on a pipe. It also happens when I boot with all the network cables unplugged. Relevant config:
# cat /usr/local/etc/ipfw.rules
#======================================================================================
# flush ruleset
#======================================================================================
flush
#======================================================================================
# define dummynet pipes
#======================================================================================
pipe 10000 config bw 541Mbit/s type wf2q+ pie
#======================================================================================
# define dummynet queues
#======================================================================================
#======================================================================================
# general purpose rules 1...1000
#======================================================================================
add 100 allow pfsync from any to any
add 110 allow carp from any to any
# layer 2: pass ARP
add 120 pass layer2 mac-type arp,rarp
# OPNsense requires for WPA
add 130 pass layer2 mac-type 0x888e,0x88c7
# PPP Over Ethernet Session Stage/Discovery Stage
add 140 pass layer2 mac-type 0x8863,0x8864
# layer 2: block anything else non-IP(v4/v6)
add 150 deny layer2 not mac-type ip,ipv6
# allow traffic send from localhost
add 200 skipto 60000 ipv6 from ::1 to any
add 201 skipto 60000 ipv4 from 127.0.0.0/8 to any
add 202 skipto 60000 ipv6 from any to ::1
add 203 skipto 60000 ipv4 from any to 127.0.0.0/8
#======================================================================================
# Allow traffic to this host
#======================================================================================
#======================================================================================
# redirect non-authenticated clients to captive portal @ local port 8000 + zoneid
#======================================================================================
#======================================================================================
# accept traffic from all interfaces not used by captive portal
#======================================================================================
# let the responses from the captive portal web server back out
add 6000 skipto 60000 tcp from any to any out
# forward unauthorized traffic from captiveportal interfaces to block rule
# send all the rest to the traffic shaper rules
add 6199 skipto 60000 all from any to any
#======================================================================================
# 30000 .... 49999 reserved for captive portal accounting rules
#======================================================================================
#======================================================================================
# traffic shaping section, authorized traffic
#======================================================================================
add 60000 return via any
add 60001 pipe 10000 ip from any to any src-port any dst-port any via igb0 // a2b84152-dbd8-4448-861c-225643a23503 lan: Bufferbloat
# pass authorized
add 65533 pass ip from any to any
# block all unmatched
add 65534 deny all from any to any
The backtrace listed in that forum thread is incomplete (that is, it's only a backtrace and doesn't include the rest of the panic information), so I'm speculating here, but isn't the problem that we're entering those callouts without a vnet context set? We would then panic when we try to access the vnet variable V_dn_cfg (through the DN_HW_WLOCK macro).
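To make the suspected failure mode concrete, here is a sketch of the macro chain involved, written with the generic VNET() pattern from sys/net/vnet.h (the exact dummynet definitions may differ slightly):

VNET_DEFINE(struct dn_parms, dn_cfg);
#define V_dn_cfg	VNET(dn_cfg)			/* resolved through curvnet */
#define DN_BH_WLOCK()	mtx_lock(&V_dn_cfg.uh_mtx)	/* so this needs curvnet set */

/*
 * The callouts run from softclock, where no vnet context is set, so
 * expanding the lock macro there dereferences a NULL curvnet and the
 * kernel panics before the mutex is ever reached.
 */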
Arguably, a better fix would be to ensure that we carry the vnet information somewhere so that we can set the correct vnet; a sketch of that option follows below. In this case it looks like we might be able to get away with not taking these locks at all (mostly because the locking around e.g. the refcounting already looks wrong anyway), but it's hardly ideal.
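A rough sketch of the "carry the vnet" option (pst_vnet is a hypothetical field that would have to be added to struct pie_status and filled in when the callout is armed):

static void
pie_callout_cleanup(void *x)
{
	struct pie_status *pst = (struct pie_status *) x;

	CURVNET_SET(pst->pst_vnet);	/* hypothetical: vnet saved when the callout was armed */
	DN_BH_WLOCK();			/* now safe: V_dn_cfg resolves via the restored curvnet */
	/* ... existing cleanup body ... */
	DN_BH_WUNLOCK();
	CURVNET_RESTORE();
}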
I was thinking the same at first, but the locking introduced in https://cgit.freebsd.org/src/commit/sys/netpfil/ipfw/dn_aqm_pie.c?id=12be18c7d594 looks arbitrary and isn't used anywhere else in those two files. It was added to "protect" the ref_count manipulation, but if you look at the other ref_count modifications in that file, they are also done without (obvious) locks.
Maybe these ref_count modifications should be converted to lock-free atomic updates to avoid the locking overhead completely?
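For illustration, something along these lines (a sketch only; pie_ref()/pie_unref() are hypothetical names, and ref_count would need to be an unsigned int to match the machine/atomic.h primitives):

static void
pie_ref(void)
{
	atomic_add_int(&pie_desc.ref_count, 1);
}

static void
pie_unref(void)
{
	atomic_subtract_int(&pie_desc.ref_count, 1);
}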
Perhaps, yes.
Although it looks like the ref_count is only read in unload_dn_aqm(), under the sched_mtx lock. That lock lives only in ip_dummynet.c, so I wonder if we shouldn't just move the updating of the reference count into dn_aqm_ref()/dn_aqm_unref() and protect it with the sched_mtx lock. That doesn't need vnet, so we don't have to worry about setting the context (it's a global setting, so using a vnet-ed lock is wrong anyway), and we'd actually clean up the locking a little.
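Roughly like this (a sketch of the suggestion, not existing code; dn_aqm_ref()/dn_aqm_unref() would be new helpers in ip_dummynet.c, and the struct/field names are assumed):

void
dn_aqm_ref(struct dn_aqm *aqm)
{
	mtx_lock(&sched_mtx);
	aqm->ref_count++;
	mtx_unlock(&sched_mtx);
}

void
dn_aqm_unref(struct dn_aqm *aqm)
{
	mtx_lock(&sched_mtx);
	aqm->ref_count--;
	mtx_unlock(&sched_mtx);
}

The AQM modules would then call these instead of touching ref_count (or any dummynet lock) directly.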
Happy new year. I tried to follow your reasoning, but I only see NET_EPOCH use around these refcounts. For me it would also be OK to introduce the correct vnet for the locking, as you originally suggested.