Page MenuHomeFreeBSD

add pfil hooks to iflib
ClosedPublic

Authored by gallatin on Mar 19 2019, 2:56 PM.

Details

Reviewers
shurd
marius
Group Reviewers
iflib
Commits
rS346632: iflib: Add pfil hooks
Summary

This adds pfil hooks to iflib.

The idea is to examine and potentially drop unwanted traffic as early in the receive as possible, before mbufs are allocated, and before anything is passed up the stack. This saves considerable CPU. On a Xeon X3430 4-core box from ~10 years ago, we see an increase in packet processing from 7.8Mpps to 11.4Mpps under an extreme UDP flood attack:

<10:38am>slug/gallatin:~>nstat -I ix0

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree
7.84   0.00   4.02   0.00  0      1   98.23     38  10601   5260  3.59
7.86   0.00   4.03   0.00  0      1   98.62     34  10354   5140  3.59
7.89   0.00   4.04   0.00  0      1   98.03     34  10431   5145  3.59

^C
<10:38am>slug/gallatin:~>sudo pfilctl link -i ipfw:default-link ix0
<10:38am>slug/gallatin:~>nstat -I ix0

InMpps OMpps  InGbs  OGbs err TCP Est %CPU syscalls csw     irq GBfree

11.40 0.00 5.84 0.00 0 1 94.88 47 21910 10913 3.59
11.39 0.00 5.83 0.00 0 1 93.50 34 20915 10412 3.59

The major changes here are to remove the unneeded abstraction where callers of rxd_frag_to_sd() get back a pointer to the mbuf ring, and are responsible for NULL'ing that mbuf themselves. Now this happens directly in rxd_frag_to_sd(), and it returns an mbuf. This allows us to use the decision (and potentially mbuf) returned by the pfil hooks. The driver can now recycle mbufs to avoid re-allocation when packets are dropped.

There is some unfortunate cleverness in assemble_segments() around handling the remainder of jumbo frames. I could not see a cleaner way to deal with this.

I've tested this with em (via qemu), and ix & ixl on real hardware. It has been a major performance win everywhere from a DOS perspective.

Test Plan

Point a UDP flood attack at a box, and block it using something like:

kldload ipfw
ipfw flush
ipfw add 1000 pass all from any to any
ipfw add 999 deny ip from 192.168.8.3 to any in
sysctl net.link.ether.ipfw=1
pfilctl unl -io ipfw:default inet
pfilctl unl -io ipfw:default6 inet6
pfilctl unl -io ipfw:default-link ethernet
pfilctl link -i ipfw:default-link ix0

Observe that as you link/unlink the ix0 and ethernet hooks, the performance changes and the box does not crash. Also ensure the box accepts traffic from a non-blacklisted IP.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

gallatin created this revision.Mar 19 2019, 2:56 PM
erj added a subscriber: erj.Mar 19 2019, 5:03 PM
erj added inline comments.
sys/net/iflib.c
2552 ↗(On Diff #55236)

You dropped an 'f' here

2599 ↗(On Diff #55236)

You should update this comment since you added new checks to the if statement below.

gallatin updated this revision to Diff 55244.Mar 19 2019, 6:27 PM

Fixed comments, as pointed out by ejg

gallatin marked 2 inline comments as done.Mar 19 2019, 6:28 PM
shurd accepted this revision.Mar 27 2019, 5:45 PM
shurd added inline comments.
sys/net/iflib.c
2557 ↗(On Diff #55244)

I'm not really crazy about "default is OK" here.

If pf_rv was a pfil_return_t, and this was "case PFIL_PASS:", the warning causing the build to fail would be better than relying on the MPASS() wouldn't it?

This revision is now accepted and ready to land.Mar 27 2019, 5:45 PM
marius added inline comments.Mar 27 2019, 6:54 PM
sys/net/iflib.c
62 ↗(On Diff #55244)

Is there a particular reason to not sort this header alphabetically between <net/mp_ring.h> and <net/vnet.h>?

2510 ↗(On Diff #55244)

The "cidx" and "flid" from above should be merged with into this line and the resulting line sorted alphabetically, getting the variable declaration of rxd_frag_to_sd() close to style(9).

2552 ↗(On Diff #55244)

There should be two spaces after the dot in this comment, too.

2585 ↗(On Diff #55244)

Similarly, "i", "flags" and "padlen" should be merged here.

2661 ↗(On Diff #55244)

This should be "false" now.

2819 ↗(On Diff #55244)

The "rx_mbuf_null" counter should just go now.

4473 ↗(On Diff #55244)

Superfluous empty line

4480 ↗(On Diff #55244)

Ditto

marius requested changes to this revision.Mar 27 2019, 6:56 PM
This revision now requires changes to proceed.Mar 27 2019, 6:56 PM
gallatin updated this revision to Diff 55536.Mar 28 2019, 2:26 PM
gallatin marked 8 inline comments as done.

Address review feedback from marius & shurd

gallatin added inline comments.Mar 28 2019, 2:26 PM
sys/net/iflib.c
2557 ↗(On Diff #55244)

I've moved it out of the default. However, I'd prefer to not touch pf_rv, as there are other reviews outstanding for other drivers.

2661 ↗(On Diff #55244)

nice catch

2819 ↗(On Diff #55244)

Thanks for saying that.

marius accepted this revision.Apr 10 2019, 1:32 PM

Looks good to me now

This revision is now accepted and ready to land.Apr 10 2019, 1:32 PM

shurd: I have been waiting for you to test with bnxt & jumbo frames, but I think this is approaching a reasonable timeout (almost 2 weeks since the last feedback). I plan to commit in the next day or so unless I hear from you.

This revision was automatically updated to reflect the committed changes.