Page MenuHomeFreeBSD

Deal with DupACKs of SACK session with missing SACK Blocks
ClosedPublic

Authored by rscheff on Feb 12 2021, 11:25 PM.

Details

Summary

Once the SACK loss recovery mechanism is negotiated for
during the 3WHS, it is implicitly assumed, that every
DupACK will also always return with SACK Blocks.

Based on the changed information contained in the SACK
blocks, only the lost segments are retransmitted by
SACK loss recovery.

When these DupACKs contain no SACK blocks, the
expectation would be to fall back to NewReno loss
recovery.

However, this is not actually checked, and in this
rare case (middlebox scrubbing SACK blocks, or
an issue with the opposite SACK implementation)
TCP would revert to RTO based, one-by-one loss
recovery.

Test Plan
// Testing SACK loss recovery without SACK Blocks
// Active Client

--tolerance_usecs=50000

// Flush Hostcache
0.0 `sysctl net.inet.tcp.cc.algorithm=newreno`
+0.01 `sysctl net.inet.tcp.ecn.enable=2`
+0.01 `sysctl net.inet.tcp.do_prr=1`
+0.01 `sysctl net.inet.tcp.rfc6675_pipe=0`
+0.01 `sysctl net.inet.tcp.initcwnd_segments=10`
+0.01 `sysctl net.inet.tcp.hostcache.purgenow=1`

/*
// Client side
// Create a TCP socket.
+0.50 socket(..., SOCK_STREAM, IPPROTO_TCP) = 4
+0.01 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
+0.01 setsockopt(4, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0.01 setsockopt(4, SOL_SOCKET, SO_DEBUG, [1], 4) = 0
+0.01 setsockopt(4, SOL_SOCKET, SO_SNDBUF, [1048576], 4) = 0

// Establish a TCP connection with ECN

+0.04 connect(4, ..., ...) = -1 // EINPROGRESS

+0 >[noecn] S 0:0(0) win 65535 <mss 1460, nop,wscale 6,sackOK,TS val ... ecr 0>
+0 <[noecn] S. 0:0(0) ack 1 win 65535 <mss 1000, nop, wscale 10, sackOK, eol, nop>
+0 >[noecn] .   1:1(0) ack 1 <...>

+0.01 getsockopt(4, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
+0.01 fcntl(4, F_SETFL, O_RDWR) = 0   // set back to blocking

*/

// Server side
// Create a listening TCP socket.
+0.50 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0.01 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0.01 setsockopt(3, SOL_SOCKET, SO_SNDBUF, [1048576], 4) = 0
+0.01 setsockopt(3, SOL_SOCKET, SO_DEBUG, [1], 4) = 0
+0.01 bind(3, ..., ...) = 0
+0.01 listen(3, 1) = 0

// Establish a TCP connection with ECN
+0.04 <[noecn] S  0:0(0) win 65535 <mss 1000, wscale 10, sackOK, eol, nop, nop>
+0.00 >[noecn] S. 0:0(0) ack 1 win 65535 <mss 1460, nop, wscale 6, sackOK, eol, eol>
+0.00 <[noecn]  . 1:1(0) ack 1 win 65535
+0.00 accept(3, ..., ...) = 4


// Send IW plus 1 segment, check ECN bits
1.00 write(4, ..., 11000) = 11000
+0    >[noecn]  .      1:1001(1000) ack 1
+0    <[noecn] .     1:1(0) ack 1001 win 65535
+0    >[noecn]  .   1001:2001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 2001 win 65535
+0    >[noecn]  .   2001:3001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 3001 win 65535
+0    >[noecn]  .   3001:4001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 4001 win 65535
+0    >[noecn]  .   4001:5001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 5001 win 65535
+0    >[noecn]  .   5001:6001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 6001 win 65535
+0    >[noecn]  .   6001:7001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 7001 win 65535
+0    >[noecn]  .   7001:8001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 8001 win 65535
+0    >[noecn]  .   8001:9001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 9001 win 65535
+0    >[noecn]  .   9001:10001(1000) ack 1
+0    <[noecn] .      1:1(0) ack 10001 win 65535
+0    >[noecn]  P. 10001:11001(1000) ack 1

// Send few more "loss" packets
+0.0 setsockopt(4, SOL_SOCKET, SO_DEBUG, [1], 4) = 0

+0.01 write(4, ..., 10000) = 10000

+0    >[noecn]  .  11001:12001(1000) ack 1
+0    >[noecn]  .  12001:13001(1000) ack 1
+0    >[noecn]  .  13001:14001(1000) ack 1
+0    >[noecn]  .  14001:15001(1000) ack 1
+0    >[noecn]  .  15001:16001(1000) ack 1
+0    >[noecn]  .  16001:17001(1000) ack 1
+0    >[noecn]  .  17001:18001(1000) ack 1
+0    >[noecn]  .  18001:19001(1000) ack 1
+0    >[noecn]  .  19001:20001(1000) ack 1
+0    >[noecn]  P.  20001:21001(1000) ack 1

+0.01    <[noecn] .      1:1(0) ack 10001 win 65535
+0.01    <[noecn] .      1:1(0) ack 10001 win 65535
+0.01    <[noecn] .      1:1(0) ack 10001 win 65535
// Retransmission without ECT
+0    >[noecn] .  10001:11001(1000) ack 1
+0.01    <[noecn] .      1:1(0) ack 21001 win 65535

//More data should be ECT marked again, and
// have the CWR flag set now.
+0.01 write(4, ..., 1000) = 1000
+0    >[noecn] P. 21001:22001(1000) ack 1
+0    <[noecn] .       1:1(0) ack 22001 win 65535

+0.00 close(4) = 0
+0.01 >[noecn] F.  22001:22001(0) ack 1
+0.03 <[noecn] F.      1:1(0) ack 22002 win 65535
+0.01 >[noecn] .   22002:22002(0) ack 2

MFC after: 2 weeks
Sponsored by: NetApp, Inc.

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

from transport: check if tcp_doopt is tracking incoming sack blocks only, when SACK_PERMIT is enabled, and reduce this to a single check (TOF_SACK presence).

As discussed in the transport call, checked if TOF_SACK would only be set by tcp_dooption() when TF_SACK_PERMIT was received/send. But this is not the case.

In tcp_dooptions(), the TOF_SACK is set, irrespective of the state of TF_SACK_PERMIT. (tcpcb is not readily available in that function, and I wouldn’t really want to change all the calls to it to include the current state of TF_SACK_PERMIT – unless this would be your preferred option

Since these conditionals are all in the slow path already, I believe having the additional check on the presence of an actual SACK block on a duplicate ACK, is not too much of a problem here.

This revision is now accepted and ready to land.Thu, Mar 25, 2:42 PM
This revision now requires review to proceed.Thu, Mar 25, 10:15 PM
This revision was not accepted when it landed; it landed in state Needs Review.Thu, Mar 25, 10:57 PM
This revision was automatically updated to reflect the committed changes.