Page MenuHomeFreeBSD

iflib: don't pullup UDP payloads to the TCP header size
ClosedPublic

Authored by gallatin on Aug 5 2025, 6:05 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Nov 10, 4:34 AM
Unknown Object (File)
Sun, Nov 2, 9:19 AM
Unknown Object (File)
Mon, Oct 27, 10:19 PM
Unknown Object (File)
Mon, Oct 27, 6:48 AM
Unknown Object (File)
Mon, Oct 20, 7:06 AM
Unknown Object (File)
Sat, Oct 18, 9:32 AM
Unknown Object (File)
Oct 12 2025, 4:31 PM
Unknown Object (File)
Oct 12 2025, 6:10 AM

Details

Summary

The IPv4 packet parsing logic in iflib is incredibly complex, prematurely optimized, and believes all the world is TCP. This causes it to pullup part of the UDP payload into the packet header, causing unneeded memory copies. This impacts a project I'm working on, and it also impacts NFS. Eg, NFS over UDP will result in pullups for every datagram sent over an iflib NIC in this path:

m_pullup
            kernel`iflib_encap+0x980
            kernel`iflib_txq_drain+0x246
            kernel`drain_ring_lockless+0x5e
            kernel`ifmp_ring_enqueue+0x274
            kernel`iflib_if_transmit+0x247
            kernel`ether_output_frame+0xab
            kernel`ether_output+0x6b0
            kernel`ip_output_send+0xd6
            kernel`ip_output+0x124d
            kernel`udp_send+0x98c
            kernel`sosend_dgram+0x2fc
            kernel`svc_dg_reply+0xa2
            kernel`svc_sendreply_common+0x97
            kernel`svc_sendreply_mbuf+0x51
            kernel`nfssvc_program+0x93f
            kernel`svc_run_internal+0xbd0
            kernel`svc_thread_start+0xb
            kernel`fork_exit+0x7b
            kernel`0xffffffff8104152e

This patch:

  • adds parsing for UDP to iflib
  • attempts to pull up the correct header size, based on UDP or TCP protocol type.
  • simplifies packet parsing in iflib by
    • no longer special casing having an ethernet header in a packet by itself
    • no longer checking that we're trying to pullup something beyond the end of the packet. Since we're no longer trying to pull up a larger TCP header, attempting to pullup something larger than the packet should no longer happen. If it does, the packet is malformed and m_pullup will return an error when it runs out of data in the mbuf chain
Test Plan

Apply patch to NFS server running NFS over an iflib'ized NIC driver
Mount an NFS filesystem via UDP from server running patched
Verify using this dtrace script that m_pullup is no longer called in the NFS datagram send path
dtrace -n 'fbt::m_pullup:entry{@[probefunc, stack()] = count();}'

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

FreeBSD/sys/net/iflib.c
3379–3380 ↗(On Diff #159797)

Both ip and th are initialized before use, no need to initialize them as NULL.

3383 ↗(On Diff #159797)

Let's make hlen uint8_t to match signedness and size of ipi_ehdrlen.

3397 ↗(On Diff #159797)

Let's use char * instead of caddr_t in any new code.

It also make this block of code more readable, thanks!

Update diff to address Gleb's feedback

gallatin added inline comments.
FreeBSD/sys/net/iflib.c
3397 ↗(On Diff #159797)

Well, it was copied from the old code. But I agree and changed it.

This all seems sensible to me.

I wonder if they had originally thought about the extra pullups that could've been performed on UDP packets -- did they really just assume UDP performance wasn't that much of a concern? But since it took so long for someone to fix this, maybe they were right for the most part. :p

FreeBSD/sys/net/iflib.c
3396 ↗(On Diff #159800)

style(9) here for the space between sizeof and parentheses, as well as on line 3399 below

gallatin marked an inline comment as done.

Address Eric's style(9) nits and remove space in front of parans in sizeof

kbowling added a subscriber: kbowling.
kbowling added inline comments.
FreeBSD/sys/net/iflib.c
3386 ↗(On Diff #159809)

this comment can probably be retired.

This revision is now accepted and ready to land.Aug 5 2025, 8:06 PM