Page MenuHomeFreeBSD

iflib: call ether_ifdetach and netmap_detach before stop
ClosedPublic

Authored by jacob.e.keller_intel.com on Oct 17 2019, 11:47 PM.

Details

Summary

Calling ether_ifdetach after iflib_stop leads to a potential race where
a stale ifp pointer can remain in the route entry list for IPv6 traffic.
This will potentially cause a page fault or other system instability if
the ifp pointer is accessed.

Fatal trap 12: page fault while in kernel mode
cpuid = 36; apic id = 52
fault virtual address   = 0x10
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80c84707
stack pointer           = 0x28:0xfffffe2f434b7220
frame pointer           = 0x28:0xfffffe2f434b7270
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (if_io_tqg_36)
trap number             = 12
panic: page fault
cpuid = 36
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe2f434b6ee0
vpanic() at vpanic+0x17e/frame 0xfffffe2f434b6f40
panic() at panic+0x43/frame 0xfffffe2f434b6fa0
trap_fatal() at trap_fatal+0x369/frame 0xfffffe2f434b6ff0
trap_pfault() at trap_pfault+0x62/frame 0xfffffe2f434b7040
trap() at trap+0x2b3/frame 0xfffffe2f434b7150
calltrap() at calltrap+0x8/frame 0xfffffe2f434b7150
--- trap 0xc, rip = 0xffffffff80c84707, rsp = 0xfffffe2f434b7220, rbp = 0xfffffe2f434b7270 ---
in6_selecthlim() at in6_selecthlim+0x27/frame 0xfffffe2f434b7270
sctp_lowlevel_chunk_output() at sctp_lowlevel_chunk_output+0xeb2/frame 0xfffffe2f434b73b0
sctp_chunk_output() at sctp_chunk_output+0x68c/frame 0xfffffe2f434b7d30
sctp_common_input_processing() at sctp_common_input_processing+0xa2a/frame 0xfffffe2f434b7e40
sctp6_input_with_port() at sctp6_input_with_port+0x1e5/frame 0xfffffe2f434b7f20
sctp6_input() at sctp6_input+0xb/frame 0xfffffe2f434b7f30
ip6_input() at ip6_input+0xc23/frame 0xfffffe2f434b8020
netisr_dispatch_src() at netisr_dispatch_src+0x82/frame 0xfffffe2f434b8080
ether_demux() at ether_demux+0x14f/frame 0xfffffe2f434b80b0
ether_nh_input() at ether_nh_input+0x330/frame 0xfffffe2f434b80f0
netisr_dispatch_src() at netisr_dispatch_src+0x82/frame 0xfffffe2f434b8150
ether_input() at ether_input+0x62/frame 0xfffffe2f434b8180
_task_fn_rx() at _task_fn_rx+0xd33/frame 0xfffffe2f434b8270
gtaskqueue_run_locked() at gtaskqueue_run_locked+0x139/frame 0xfffffe2f434b82c0
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0x88/frame 0xfffffe2f434b82f0
fork_exit() at fork_exit+0x84/frame 0xfffffe2f434b8330
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe2f434b8330
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic

Move both iflib_netmap_detach and ether_ifdetach to be called prior to
iflib_stop. This avoids the race above, and helps ensure that other ifp
references are removed before stopping the interface.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

I did manage to eventually trigger another stale ifp pointer, though it was not reliable:

Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex ip6qlock (ip6qlock) r = 0 (0xfffffe00007aa848) locked @ /usr/src/sys/netinet6/frag6.c:849
shared rw vnet_rwlock (vnet_rwlock) r = 0 (0xffffffff820be700) locked @ /usr/src/sys/netinet6/frag6.c:845
stack backtrace:
#0 0xffffffff80bb6f83 at witness_debugger+0x73
#1 0xffffffff80bb7fa2 at witness_warn+0x442
#2 0xffffffff8108a0f3 at trap_pfault+0x53
#3 0xffffffff810896e4 at trap+0x2b4
#4 0xffffffff8106201c at calltrap+0x8
#5 0xffffffff80d8c07a at icmp6_error+0x4aa
#6 0xffffffff80d8b30e at frag6_freef+0x10e
#7 0xffffffff80d8b551 at frag6_slowtimo+0x111
#8 0xffffffff80bdcda4 at pfslowtimo+0x54
#9 0xffffffff80b65bdf at softclock_call_cc+0x13f
#10 0xffffffff80b65f9c at softclock+0x7c
#11 0xffffffff80b0f857 at ithread_loop+0x187
#12 0xffffffff80b0c4a4 at fork_exit+0x84
#13 0xffffffff8106305e at fork_trampoline+0xe


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xfffffe0000825dd8
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80d8c5b2
stack pointer           = 0x28:0xfffffe1fc28c6ff0
frame pointer           = 0x28:0xfffffe1fc28c7090
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (swi4: clock (0))
trap number             = 12
panic: page fault
cpuid = 0
time = 1571354026
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe1fc28c6cb0
vpanic() at vpanic+0x19d/frame 0xfffffe1fc28c6d00
panic() at panic+0x43/frame 0xfffffe1fc28c6d60
trap_fatal() at trap_fatal+0x39c/frame 0xfffffe1fc28c6dc0
trap_pfault() at trap_pfault+0x62/frame 0xfffffe1fc28c6e10
trap() at trap+0x2b4/frame 0xfffffe1fc28c6f20
calltrap() at calltrap+0x8/frame 0xfffffe1fc28c6f20
--- trap 0xc, rip = 0xffffffff80d8c5b2, rsp = 0xfffffe1fc28c6ff0, rbp = 0xfffffe1fc28c7090 ---
icmp6_reflect() at icmp6_reflect+0x242/frame 0xfffffe1fc28c7090
icmp6_error() at icmp6_error+0x4aa/frame 0xfffffe1fc28c70e0
frag6_freef() at frag6_freef+0x10e/frame 0xfffffe1fc28c7130
frag6_slowtimo() at frag6_slowtimo+0x111/frame 0xfffffe1fc28c7180
pfslowtimo() at pfslowtimo+0x54/frame 0xfffffe1fc28c71b0
softclock_call_cc() at softclock_call_cc+0x13f/frame 0xfffffe1fc28c7260
softclock() at softclock+0x7c/frame 0xfffffe1fc28c7290
ithread_loop() at ithread_loop+0x187/frame 0xfffffe1fc28c72f0
fork_exit() at fork_exit+0x84/frame 0xfffffe1fc28c7330
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe1fc28c7330
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic

Assuming this fix makes sense, this should be MFC'd to 11-STABLE and 12-STABLE too, IMHO.

sys/net/iflib.c
5386–5394 ↗(On Diff #63424)

These also supposedly need to be called prior to iflib_stop as well?

I did manage to eventually trigger another stale ifp pointer, though it was not reliable:

Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex ip6qlock (ip6qlock) r = 0 (0xfffffe00007aa848) locked @ /usr/src/sys/netinet6/frag6.c:849
shared rw vnet_rwlock (vnet_rwlock) r = 0 (0xffffffff820be700) locked @ /usr/src/sys/netinet6/frag6.c:845
stack backtrace:
#0 0xffffffff80bb6f83 at witness_debugger+0x73
#1 0xffffffff80bb7fa2 at witness_warn+0x442
#2 0xffffffff8108a0f3 at trap_pfault+0x53
#3 0xffffffff810896e4 at trap+0x2b4
#4 0xffffffff8106201c at calltrap+0x8
#5 0xffffffff80d8c07a at icmp6_error+0x4aa
#6 0xffffffff80d8b30e at frag6_freef+0x10e
#7 0xffffffff80d8b551 at frag6_slowtimo+0x111
#8 0xffffffff80bdcda4 at pfslowtimo+0x54
#9 0xffffffff80b65bdf at softclock_call_cc+0x13f
#10 0xffffffff80b65f9c at softclock+0x7c
#11 0xffffffff80b0f857 at ithread_loop+0x187
#12 0xffffffff80b0c4a4 at fork_exit+0x84
#13 0xffffffff8106305e at fork_trampoline+0xe


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xfffffe0000825dd8
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80d8c5b2
stack pointer           = 0x28:0xfffffe1fc28c6ff0
frame pointer           = 0x28:0xfffffe1fc28c7090
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 12 (swi4: clock (0))
trap number             = 12
panic: page fault
cpuid = 0
time = 1571354026
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe1fc28c6cb0
vpanic() at vpanic+0x19d/frame 0xfffffe1fc28c6d00
panic() at panic+0x43/frame 0xfffffe1fc28c6d60
trap_fatal() at trap_fatal+0x39c/frame 0xfffffe1fc28c6dc0
trap_pfault() at trap_pfault+0x62/frame 0xfffffe1fc28c6e10
trap() at trap+0x2b4/frame 0xfffffe1fc28c6f20
calltrap() at calltrap+0x8/frame 0xfffffe1fc28c6f20
--- trap 0xc, rip = 0xffffffff80d8c5b2, rsp = 0xfffffe1fc28c6ff0, rbp = 0xfffffe1fc28c7090 ---
icmp6_reflect() at icmp6_reflect+0x242/frame 0xfffffe1fc28c7090
icmp6_error() at icmp6_error+0x4aa/frame 0xfffffe1fc28c70e0
frag6_freef() at frag6_freef+0x10e/frame 0xfffffe1fc28c7130
frag6_slowtimo() at frag6_slowtimo+0x111/frame 0xfffffe1fc28c7180
pfslowtimo() at pfslowtimo+0x54/frame 0xfffffe1fc28c71b0
softclock_call_cc() at softclock_call_cc+0x13f/frame 0xfffffe1fc28c7260
softclock() at softclock+0x7c/frame 0xfffffe1fc28c7290
ithread_loop() at ithread_loop+0x187/frame 0xfffffe1fc28c72f0
fork_exit() at fork_exit+0x84/frame 0xfffffe1fc28c7330
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe1fc28c7330
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic

I'm fairly convinced this is a separate panic caused by the rcvif pointer in a receive mbuf.

This looks good to me from my understanding.

sys/net/iflib.c
5386–5394 ↗(On Diff #63424)

Ideally, yes.

sys/net/iflib.c
5386–5394 ↗(On Diff #63424)

I'm going to update the review for that as well, then.

Updated the review to add vlan event handler unregister calls before iflib_stop as well. I kept the call in iflib_deregister as well so that they get cleaned up properly during error flows. Now it uses the "unregister -> assign NULL" pattern so that only one of the flows will actually perform an unregister check.

I considered switching the check in iflib_deregister to a KASSERT, but honestly think just checking and freeing there is easier and safer.

This revision is now accepted and ready to land.Oct 23 2019, 6:10 PM