RFC6864 and per-CPU IP ID counters
ClosedPublic
Actions

Authored by glebius on Mar 30 2015, 10:34 PM.

Details

Reviewers

jmg
rpaulo
cy
gnn
adrian

Group Reviewers

network

Commits

rS280971: o Use new function ip_fillid() in all places throughout the kernel,

Summary

The change generalizes all IP ID generation places throughout the
kernel to new ip_fillid() function. The function takes IP header
as argument. The function looks if RFC6864 support is turned on,
and if yes, and IP packet quialifies as atomic, then it assigns ID
to zero. If no, the functions either uses random ID, if the random
ID feature is enabled, or takes next ID from a per-CPU counter.

Apart from this functional change, some gratuity provided:

All IP ID stuff now lives in ip_id.c, including all data and all sysctl knobs. The only exported feature is ip_fillid().
Header cleanup of ip_id.c.

Test Plan

The basic functionality has already been tested. I also want to hear from
Emeric POUPON <emeric.poupon@stormshield.eu>, who have observed ID collisions
in the wild with previous ip_id++ code.

The change to ipfilter is a little bit not trivial, so response from
Cy Shubert <cy@FreeBSD.org> is appreciated.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Skipped

Unit

Tests Skipped

Event Timeline

glebius updated this revision to Diff 4530.Mar 30 2015, 10:34 PM

glebius retitled this revision from to RFC6864 and per-CPU IP ID counters.

glebius updated this object.

glebius edited the test plan for this revision. (Show Details)

glebius added reviewers: network, jmg, gnn.

Herald added a subscriber: imp. · View Herald TranscriptMar 30 2015, 10:34 PM

glebius added a reviewer: cy.Mar 30 2015, 10:36 PM

glebius removed a subscriber: imp.

I have not reviewed the ipf code.

sys/netinet/ip_id.c
168 ↗	(On Diff #4530)	This is probably a XXX comment.
170 ↗	(On Diff #4530)	Missing newline before return.
244 ↗	(On Diff #4530)	How about making a const variable to store htons(IP_DF) ?

This revision is now accepted and ready to land.Mar 31 2015, 1:33 AM

adrian accepted this revision.Mar 31 2015, 1:37 AM

adrian added a reviewer: adrian.

ipftest will need to include ip_id.c to satisfy ip_fillid reference. I haven't tested but maybe something like this,

Index: Makefile

Makefile (revision 280869)

+++ Makefile (working copy)
@@ -7,6 +7,7 @@

		ip_pool.c ip_scan.c ip_sync.c ip_rules.c \
		ip_fil.c ip_log.c ippool_y.c ippool_l.c ipf_y.c \
		ipf_l.c ipnat_y.c ipnat_l.c md5.c radix_ipf.c ip_dstlist.c

+SRCS+= ip_id.c
MAN= ipftest.1

WARNS?= 0
@@ -19,7 +20,8 @@

XXX CFLAGS+= -DIPFILTER_SCAN

-.PATH: ${.CURDIR}/../../../sys/contrib/ipfilter/netinet
+.PATH: ${.CURDIR}/../../../sys/contrib/ipfilter/netinet \
+ ${.CURDIR}/../../../sys/netinet

GENHDRS= ipnat_l.h ipnat_y.h ippool_l.h ippool_y.h ipf_l.h ipf_y.h
DPSRCS+= ${GENHDRS}

sys/contrib/ipfilter/netinet/fil.c
83 ↗	(On Diff #4530)	add, #include <netinet/ip_var.h>

Fix ipftest(1) build.

This revision now requires review to proceed.Mar 31 2015, 6:06 AM

Simple and effective.

This revision is now accepted and ready to land.Mar 31 2015, 6:17 AM

We've had a long-standing problem with IP ID support, but I wonder if this is the best solution. My recollection is hazy, but presumably the best model would be to have IP ID sequences between pairs of local and remote IP addresses, rather than random or per-CPU incrementing IP IDs. The latter maximise the chances of a collision (thanks, birthday paradox -- now with a #CPU multiplier) whereas the former would maximise the interval between reuse. Which is to say: this might appear to make the problem better, but I'm not convinced it actually does.

Robert, tracking it based on IP pair tuples can be achieved only by turning FLOWTABLE (or anything alike) on by default. This doesn't sound like what we want to do now.

Emeric, who observed collisions with old ip_id++ code, reports that with per-CPU code he doesn't see them:

https://lists.freebsd.org/pipermail/svn-src-head/2015-March/070091.html

I would dare to speculate, that the per-CPU code is more safe than a single atomic counter for all CPUs,
despite the birthday paradox. My point is that in the network stack we are trying to make network flows
affine to certain RX/TX queues in NICs, and at the same we are trying to make networking threads affine
to CPUs. So, what happens (at least we try to achieve that!) that a certain CPU serves a certain set of
IP flows. And making the ID counter not global, we make it overflow less often, making collisions less
probable. Of course, it is all very speculative. :)

Closed by commit rS280971 (authored by @glebius).

In D2177#23, @glebius wrote:

Emeric, who observed collisions with old ip_id++ code, reports that with per-CPU code he doesn't see them:

https://lists.freebsd.org/pipermail/svn-src-head/2015-March/070091.html

I would dare to speculate, that the per-CPU code is more safe than a single atomic counter for all CPUs,
despite the birthday paradox. My point is that in the network stack we are trying to make network flows
affine to certain RX/TX queues in NICs, and at the same we are trying to make networking threads affine
to CPUs. So, what happens (at least we try to achieve that!) that a certain CPU serves a certain set of
IP flows. And making the ID counter not global, we make it overflow less often, making collisions less
probable. Of course, it is all very speculative. :)

It's only safe as long as the counters remain out of sync.. :) If they get back into sync for some reason, it could be very bad.

In D2177#22, @glebius wrote:

Robert, tracking it based on IP pair tuples can be achieved only by turning FLOWTABLE (or anything alike) on by default. This doesn't sound like what we want to do now.

We shouldn't need to turn on FLOWTABLE to enable this feature. We should be able to do some two tuple hash to pick which of the n buckets to use for the packet.