
TCP Stacks, Improve rack to better handle reordering
Needs Review (Public)

Authored by rrs on Wed, Nov 19, 7:41 PM.

Details

Reviewers
tuexen
Group Reviewers
transport
Summary

A recent bug in the igb driver (and a few others) caused LRO to mis-queue packets. Rack handled this
acceptably, and better than the base stack thanks to its reordering protections, but there was still room
for improvement.
When a series of packets is completely misordered, the acks often arrive shortly after you have
entered recovery and retransmitted the first of the packets indicated in the SACK stream. Then the cum-ack
arrives, acking essentially all of those packets. Comparing the time the retransmission was sent with the
time the ack came back quickly shows that the ack was not for what you just transmitted but for the
original transmission, so the recovery entry was completely false. You can then drop out of recovery,
restore the congestion state, and continue on your way. The dup-acks that also arrive help increase the
reordering window, which makes a repeat of the scenario less likely.
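
As a rough illustration of the check described above, here is a minimal C sketch. It is not the code in this diff: all structure and function names (`struct conn`, `enter_recovery`, `cum_ack_covers_rxt`) are hypothetical, the Reno-style halving is only a stand-in for whatever the congestion control module does, and using the minimum RTT as the threshold is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* All names here are hypothetical; they are not taken from the rack source. */
struct cc_state {
	uint32_t cwnd;		/* congestion window, bytes */
	uint32_t ssthresh;	/* slow-start threshold, bytes */
};

struct conn {
	struct cc_state cur;	/* live congestion state */
	struct cc_state saved;	/* snapshot taken on recovery entry */
	uint64_t min_rtt_us;	/* lowest RTT seen so far (assumed threshold) */
	uint64_t rxt_sent_us;	/* send time of the first retransmission */
	bool in_recovery;
};

/* Enter recovery: snapshot the state, then apply a Reno-style reduction. */
static void
enter_recovery(struct conn *c, uint64_t now_us)
{
	c->saved = c->cur;
	c->cur.ssthresh = c->cur.cwnd / 2;
	c->cur.cwnd = c->cur.ssthresh;
	c->rxt_sent_us = now_us;
	c->in_recovery = true;
}

/*
 * Handle a cumulative ack that covers the retransmitted data.
 * Returns true if the recovery was judged false and the state was restored.
 */
static bool
cum_ack_covers_rxt(struct conn *c, uint64_t now_us)
{
	bool spurious;

	if (!c->in_recovery)
		return (false);
	/* An ack sooner than one min RTT cannot be for the retransmission. */
	spurious = now_us >= c->rxt_sent_us &&
	    (now_us - c->rxt_sent_us) < c->min_rtt_us;
	if (spurious)
		c->cur = c->saved;
	c->in_recovery = false;
	return (spurious);
}
```

For example, with a 50 ms minimum RTT, a retransmission sent at t = 62.5 ms and a cum-ack at t = 70 ms gives 7.5 ms elapsed, so the sketch treats the recovery as false and restores the saved cwnd and ssthresh.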

Test Plan

A first test can be done with a packetdrill script, attached below. A far better test is to set up the
igb bug and run traffic between a lab in Germany and one in the US. I did this and have a nice clean
BBlog of the fixes in action (with a very low retransmission rate as a result). It is available on
request if you are interested.

Here is the packetdrill script that Michael Tuexen created, with maybe a tweak from me :)

--ip_version=ipv4

0.0000 `kldload -n tcp_rack`
+0.0000 `kldload -n cc_newreno`
+0.0000 `sysctl kern.timecounter.alloweddeviation=0`

+0.0000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0.0000 setsockopt(3, IPPROTO_TCP, TCP_LOG, [4], 4) = 0
+0.0000 setsockopt(3, IPPROTO_TCP, TCP_FUNCTION_BLK, {function_set_name="rack", pcbcnt=0}, 36) = 0

+0.0000 setsockopt(3, IPPROTO_TCP, TCP_CONGESTION, "newreno", 8) = 0
+0.0000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0.0000 bind(3, ..., ...) = 0
+0.0000 listen(3, 1) = 0
+0.0000 < S 0:0(0) win 65535 <mss 1000,sackOK,eol,eol>
+0.0000 > S. 0:0(0) ack 1 win 65535 <mss 1460,sackOK,eol,eol>
+0.0500 < . 1:1(0) ack 1 win 65535
+0.0000 accept(3, ..., ...) = 4
+0.00 setsockopt(4, IPPROTO_TCP, TCP_LOG, [4], 4) = 0
+0.0000 close(3) = 0
// Trigger an initial RTT measurement of 50ms.
+0.0000 send(4, ..., 1000, 0) = 1000
+0.0000 > P. 1:1001(1000) ack 1 win 65535
+0.0500 < . 1:1(0) ack 1001 win 65535
// Send 4 full-sized frames.
+0.5000 send(4, ..., 4000, 0) = 4000
+0.0000 > . 1001:2001(1000) ack 1 win 65535
+0.0000 > . 2001:3001(1000) ack 1 win 65535
+0.0000 > . 3001:4001(1000) ack 1 win 65535
+0.0000 > P. 4001:5001(1000) ack 1 win 65535
// After an RTT, get acks that SACK the fourth, third, and second segments.
+0.0500 < . 1:1(0) ack 1001 win 65535 <nop,nop,sack 4001:5001>
+0.0000 < . 1:1(0) ack 1001 win 65535 <nop,nop,sack 3001:5001>
+0.0000 < . 1:1(0) ack 1001 win 65535 <nop,nop,sack 2001:5001>
// Retransmit the missing segment after the reordering window has passed.
+0.0125 > . 1001:2001(1000) ack 1 win 65535
+0.0500 < . 1:1(0) ack 5001 win 65535
+1.0000 < F. 1:1(0) ack 5001 win 65535
+0.0000 > . 5001:5001(0) ack 2 win 65535
+0.0000 close(4) = 0
+0.0000 > F. 5001:5001(0) ack 2 win 65535
+0.0500 < . 2:2(0) ack 5002 win 65535
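
The 12.5 ms wait before the retransmission in the script is a quarter of the 50 ms RTT. That matches the base reordering window in RFC 8985 (min RTT / 4, scaled by a multiplier and capped at the smoothed RTT). A simplified sketch of that formula follows. It leaves out the RFC's special cases, the function and parameter names are illustrative, and the rack stack's actual computation may differ.

```c
#include <stdint.h>

/*
 * Simplified reordering window after RFC 8985: a quarter of the minimum
 * RTT per multiplier step, capped at the smoothed RTT.
 * Names are illustrative; this is not the FreeBSD rack code.
 */
static uint64_t
reo_wnd_us(uint64_t min_rtt_us, uint64_t srtt_us, uint32_t reo_wnd_mult)
{
	uint64_t wnd = (min_rtt_us / 4) * reo_wnd_mult;

	return (wnd < srtt_us ? wnd : srtt_us);
}
```

With a 50 ms minimum RTT and a multiplier of 1 this gives 12.5 ms. A larger multiplier widens the window until it reaches the smoothed RTT, which is the sense in which a grown reordering window makes a false recovery entry less likely.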

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped