Page MenuHomeFreeBSD

iflib: initialize LRO unconditionally
ClosedPublic

Authored by gallatin on Apr 22 2021, 3:46 PM.

Details

Summary

Changes to the LRO code have exposed a bug in iflib where devices which are not capable of doing LRO are still calling tcp_lro_flush_all(), even when they have not initialized the LRO context. This used to be mostly harmless, but the LRO code now sets the VNET based on the ifp in the lro context and will try to access it through a NULL ifp resulting in a panic at boot.

This patch unconditionally initializes LRO so that we have a valid LRO context when calling tcp_lro_flush_all(). One alternative is to check the device capabilities before calling tcp_lro_flush_all() or adding a new state flag in the ctx. However, it seems unwise to add an extra, mostly useless test for higher performance devices when we can just initialize LRO for all devices.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

This fixed my following panic:

Starting dhclient.
DHCPDISCOVER on em0 to 255.255.255.255 port 67 interval 8


Fatal trap 12: page fault while in kernel mode
cpuid = 7; apic id = 07
fault virtual address   = 0x378
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80e2475b
stack pointer           = 0x28:0xfffffe000ed97220
frame pointer           = 0x28:0xfffffe000ed97270
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (if_io_tqg_7)
trap number             = 12
panic: page fault
cpuid = 7
time = 1619099162
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe000ed96ec0
vpanic() at vpanic+0x181/frame 0xfffffe000ed96f10
panic() at panic+0x43/frame 0xfffffe000ed96f70
trap_fatal() at trap_fatal+0x387/frame 0xfffffe000ed96fd0
trap_pfault() at trap_pfault+0x4f/frame 0xfffffe000ed97030
trap() at trap+0x27d/frame 0xfffffe000ed97150
calltrap() at calltrap+0x8/frame 0xfffffe000ed97150
--- trap 0xc, rip = 0xffffffff80e2475b, rsp = 0xfffffe000ed97220, rbp = 0xfffffe000ed97270 ---
tcp_lro_flush_all() at tcp_lro_flush_all+0x2b/frame 0xfffffe000ed97270
iflib_rxeof() at iflib_rxeof+0xcb3/frame 0xfffffe000ed97380
_task_fn_rx() at _task_fn_rx+0x72/frame 0xfffffe000ed973c0
gtaskqueue_run_locked() at gtaskqueue_run_locked+0x15d/frame 0xfffffe000ed97440

It is also possible to add a check before calling into LRO functions like:

https://reviews.freebsd.org/D29900

--HPS

It is also possible to add a check before calling into LRO functions like:

https://reviews.freebsd.org/D29900

--HPS

Yes, I mentioned this. But I'd rather avoid doing a test for 10/25/100g devices on every EOI..

We could also set the VNET after the mbuf_count == 0 check, but I agree, we should not be calling on uninitialized objects.

void
tcp_lro_flush_all(struct lro_ctrl *lc)
{
        uint64_t seq;
        uint64_t nseq;
        unsigned x;

        CURVNET_SET(lc->ifp->if_vnet);

        /* check if no mbufs to flush */
        if (lc->lro_mbuf_count == 0)
                goto done;

In your patch you also need to remove tcp_lro_free() from under the IFCAP_LRO!

if (if_getcapabilities(ctx->ifc_ifp) & IFCAP_LRO)
        tcp_lro_free(&rxq->ifr_lc);

Now that we unconditionally allocate LRO resources, we must also unconditionally free them, as pointed out by Hans

In your patch you also need to remove tcp_lro_free() from under the IFCAP_LRO!

if (if_getcapabilities(ctx->ifc_ifp) & IFCAP_LRO)
        tcp_lro_free(&rxq->ifr_lc);

Done, thanks!

markj added inline comments.
sys/net/iflib.c
5938

It's possible, I think, that while handling an initialization error of some sort we'll call tcp_lro_free() on an uninitialized ifr_lc. That seems ok, tcp_lro_free() is idempotent and rxqs are zeroed at allocation time. It feels a bit fragile to rely on this, but I don't have any suggestions other than to add a is-lro-initialized flag to each rxq.

This revision was not accepted when it landed; it landed in state Needs Review.Apr 23 2021, 9:58 AM
This revision was automatically updated to reflect the committed changes.