
epair: Fix shutdown race
Abandoned · Public

Authored by kp on Jul 6 2019, 5:35 PM.
Tags
None

Details

Reviewers
bz
mmacy
Group Reviewers
network
Summary

When a vnet jail with an epair interface in it gets shut down and the
epair interface is destroyed at the same time, there is a window for a
race where the epair interface is destroyed while it is being moved back
into its original vnet.
That can result in us ending up in epair_qflush() with a NULL sc.
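
For context, here is a minimal sketch of the kind of NULL check this points at
in epair_qflush(); the surrounding logic and field names (the draining
backpointers, the saved if_qflush hook) are only paraphrased from the
if_epair.c of that era, and the actual diff may differ:

static void
epair_qflush(struct ifnet *ifp)
{
        struct epair_softc *sc;

        sc = ifp->if_softc;
        /*
         * During the shutdown race described above the softc can
         * already be gone; bail out instead of dereferencing NULL.
         */
        if (sc == NULL)
                return;

        /* Drop the drain-list backpointers and flush as before. */
        epair_remove_ifp_from_draining(ifp);
        if (sc->if_qflush)
                sc->if_qflush(ifp);
}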

This is frequently triggered by the automated pf tests, and results in a
panic like:

Fatal trap 12: page fault while in kernel mode
cpuid = 7; apic id = 07
fault virtual address = 0x40
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff8364f13f
stack pointer = 0x28:0xfffffe0074be6710
frame pointer = 0x28:0xfffffe0074be6750
code segment = base 0x0, limit 0xfffff, type 0x1b
             = DPL 0, pres 1, long 1, def32 0, gran 1

processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (thread taskq)
trap number = 12
panic: page fault
cpuid = 7
time = 1562430646
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0074be63d0
vpanic() at vpanic+0x19d/frame 0xfffffe0074be6420
panic() at panic+0x43/frame 0xfffffe0074be6480
trap_fatal() at trap_fatal+0x39c/frame 0xfffffe0074be64e0
trap_pfault() at trap_pfault+0x62/frame 0xfffffe0074be6530
trap() at trap+0x2b4/frame 0xfffffe0074be6640
calltrap() at calltrap+0x8/frame 0xfffffe0074be6640

--- trap 0xc, rip = 0xffffffff8364f13f, rsp = 0xfffffe0074be6710, rbp = 0xfffffe0074be6750 ---

epair_qflush() at epair_qflush+0x13f/frame 0xfffffe0074be6750
if_down() at if_down+0x11d/frame 0xfffffe0074be6780
if_detach_internal() at if_detach_internal+0x744/frame 0xfffffe0074be6800
if_vmove() at if_vmove+0x59/frame 0xfffffe0074be6850
vnet_if_return() at vnet_if_return+0x6d/frame 0xfffffe0074be6870
vnet_destroy() at vnet_destroy+0x124/frame 0xfffffe0074be68a0
prison_deref() at prison_deref+0x29d/frame 0xfffffe0074be68e0
taskqueue_run_locked() at taskqueue_run_locked+0x10c/frame 0xfffffe0074be6940
taskqueue_thread_loop() at taskqueue_thread_loop+0x88/frame 0xfffffe0074be6970
fork_exit() at fork_exit+0x84/frame 0xfffffe0074be69b0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0074be69b0

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 25223
Build 23898: arc lint + arc unit

Event Timeline

What happens if ifp->if_softc becomes NULL just after this 'if (sc == NULL)' check? It looks like epair_remove_ifp_from_draining(ifp) also uses the if_softc field.
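
To make the window concrete, here is the relevant part of the epair_qflush()
sketch above, annotated with where the remaining race sits (a sketch, not the
actual diff):

        sc = ifp->if_softc;
        if (sc == NULL)
                return;
        /*
         * ifp->if_softc can still become NULL right here, after the
         * check has passed, so anything below that reads it again,
         * e.g. epair_remove_ifp_from_draining(ifp), can race with the
         * teardown exactly as before.
         */
        epair_remove_ifp_from_draining(ifp);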

Good point. While this does make things a lot better (from a nearly perfectly reliable panic when running the pf tests to being able to loop the tests all night, when combined with D20869), I think you're right that the race is still there.

D20869 is similar, in that things are getting cleaned up while we're still using the ifp elsewhere.
I think both are a consequence of the epoch-ification of the stack, but it's not at all clear to me how this should be fixed. I'm going to cc mmacy on this one too, because he might have a better idea.

I think I now understand better how the problem happens. It's specific to epair.
It is indeed a race between if_vmove() moving the interface back into its original vnet and the epair interface getting destroyed. Destruction of epairs is special, because we remove two interfaces at once.
See epair_clone_destroy() (sketched below). There we if_detach() (through ether_ifdetach()) the other interface as well. If that interface has already been moved out of the vnet by then, the if_detach() will fail, but if_detach() does not return an error (if_detach_internal() does). So we end up freeing the ifp while if_vmove() is reinserting it into another vnet, leading to the panic described above.
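
For reference, a heavily simplified sketch of that path; epair_clone_destroy()
in the if_epair.c of the time does more than this (unit and softc cleanup are
omitted), and the oifp backpointer name is as I recall it:

static int
epair_clone_destroy(struct if_clone *ifc, struct ifnet *ifp)
{
        struct ifnet *oifp;
        struct epair_softc *sca;

        sca = ifp->if_softc;
        oifp = sca->oifp;       /* the other half of the pair */

        /*
         * This is where the race opens up: if vnet teardown is already
         * if_vmove()ing oifp back to its home vnet, the if_detach()
         * buried inside ether_ifdetach() fails, but neither call
         * returns an error, so the failure is invisible here ...
         */
        ether_ifdetach(oifp);
        ether_ifdetach(ifp);

        /*
         * ... and both ifps are torn down and freed anyway, while
         * if_vmove() may still be reinserting oifp elsewhere.
         */
        if_free(oifp);
        if_free(ifp);

        /* Remaining softc/unit cleanup omitted. */
        return (0);
}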

I have no idea how to reliably fix this right now.