Page MenuHomeFreeBSD

Rewrite of TCP Reassembly code
ClosedPublic

Authored by rrs on Aug 8 2018, 2:25 PM.
Tags
None
Referenced Files
F103009157: D16626.diff
Tue, Nov 19, 6:52 PM
Unknown Object (File)
Mon, Nov 4, 8:04 AM
Unknown Object (File)
Thu, Oct 24, 7:46 PM
Unknown Object (File)
Thu, Oct 24, 7:45 PM
Unknown Object (File)
Thu, Oct 24, 7:45 PM
Unknown Object (File)
Thu, Oct 24, 7:45 PM
Unknown Object (File)
Thu, Oct 24, 7:45 PM
Unknown Object (File)
Thu, Oct 24, 7:24 PM
Subscribers

Details

Summary

The existing TCP reassembly code has issues in that:

  1. It is non-performant
  2. It can use large numbers of segment queues causing worse CPU performance if its limits are raised
  3. It does not coalesce properly
  4. It will at times not send back the right SACK information to the caller so that incorrect sack's will be generated.

This review is a rewrite of this to fix these issues and make the
reassembly code both more efficient and also correct the SACK issue.

Test Plan
  1. Carefully test the reassembly code with Michael Tuexen's 215 test pkt-drill script.
  2. In a lab setting setup testing with various loss rates using both BB tracing and the coverage counters to be sure the code is being tested. Run the client sessions in those tests as https. Validate that even with the loss the clients do not abort SSL due to bad records (from mis-assembly).
  3. Begin widespread testing on NF caches watching for QoE impacts.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

Testing with INVAR found a problem with my late morning last optimization. If the
seq is behind us but overlaps we can't do the optimization.

Fixes an accounting problem shown by testing

@tuexen latest version of this review as of this comment with your updated packetdrill https://reviews.freebsd.org/P204

This version updates from the last to include all of Lawrence Stewarts comments
as well as those from Peter Lei. A significant change is altering the calculate
overhead function to also glean the mlast so that a second walk of the mbuf
chain is no longer needed.

Testing of the latest rcv-data test suite (now 216 tests) passes, will next move
into the lab to validate no data corruption with SSL.

@tuexen latest version of this review as of this comment with your updated packetdrill https://reviews.freebsd.org/P204

@kbowling Can you post (some of) the .out files? Can you also run the scrips by executing ./run-all-rcv-data-segments-tests from the tcp-testsuite/rcv-data-segments directory? It is likely that the failures are due to timing issues and the run-all-rcv-data-segments-tests passes -T 25000 to the run-tests.sh script allowing for a deviation of 25ms.

@tuexen latest version of this review as of this comment with your updated packetdrill https://reviews.freebsd.org/P204

@kbowling I just tested the rcv-data-segments tests on ppc (real hardware) and amd64 (real hardware and VM) and they all pass. So either there is a problem on ppc64 or it is a timing issue (should be fixed by using -T 25000).

@tuexen the latest version of this patch as of this comment and updating tcp-testsuite with higher -T looks good on ppc64

root@eclipz-nfsroot:~/tcp-testsuite/rcv-data-segments # ./run-all-rcv-data-segments-tests

....

Summary: Number of tests run:       216
         Number of tests passed:    216
         Number of tests failed:      0
         Number of tests timed out:   0
         Number of tests skipped:     0

This version incorporates all of Jonathans comments and suggested improvements. Thanks
Jonathan!!!

In D16626#356917, @rrs wrote:

This version incorporates all of Jonathans comments and suggested improvements. Thanks
Jonathan!!!

And, I want to thank @rrs for all his hard work on this!

I think there are still a few opportunities for optimization; however, this code has one big advantage: it works (at least as far as we can tell through code inspection and testing). :-)

The further optimizations (which we should pursue) can come after stable/12 branches and we are prepared to incur a little more risk to achieve greater optimization.

LGTM (with minor change noted in-line).

sys/netinet/tcp_reass.c
336 โ†—(On Diff #46870)

As noted elsewhere, I don't think you should increment t_segqlen here, since you aren't actually allocating a new queue entry. This same comment applies to tcp_reass_prepend() and tcp_reass_replace(). If I led you astray with previous comments, I apologize.

This revision is now accepted and ready to land.Aug 18 2018, 10:24 AM

Fix the silly cut and paste bug.

This revision now requires review to proceed.Aug 18 2018, 7:15 PM
sys/netinet/tcp_reass.c
336 โ†—(On Diff #46870)

Yeah stupid cut and paste bug this am.. its fixed with the last update

This revision is now accepted and ready to land.Aug 18 2018, 10:32 PM

I ran the packetdrill tests from rcv-data-segments and they all pass.

This revision was automatically updated to reflect the committed changes.