
Send CWR whenever an ECN-enabled session runs into RTO
Closed, Public

Authored by rscheff_gmx.at on Jan 10 2020, 12:35 AM.

Details

Summary

Until now, an RTO congestion signal reduced cwnd to a single MSS, as per the RFC, but did not arrange for CWR to be sent. DCTCP had already addressed this problem for itself, but it is generic to all ECN-enabled flows, so this change sets the CWR flag in the generic cc congestion-signal wrapper.
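The behaviour described above can be modelled in a few lines. This is only a minimal sketch, not the FreeBSD code: the `Conn` class, its fields, and `on_rto` are hypothetical names chosen for illustration, and the real wrapper operates on kernel connection state.

```python
from dataclasses import dataclass


@dataclass
class Conn:
    """Hypothetical, simplified sender state (not the FreeBSD tcpcb)."""
    mss: int                # maximum segment size in bytes
    cwnd: int               # congestion window in bytes
    ecn_permit: bool        # ECN was negotiated for this connection
    send_cwr: bool = False  # set CWR on the next outgoing data segment


def on_rto(conn: Conn) -> None:
    """Model of the RTO branch of a generic congestion-signal handler."""
    conn.cwnd = conn.mss      # collapse cwnd to a single segment
    if conn.ecn_permit:
        conn.send_cwr = True  # tell the receiver the window was reduced
```

In this model a flow without ECN still collapses cwnd on RTO but never requests CWR, which mirrors the point that the fix applies to all ECN-enabled flows and only to those.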

This can help improve performance after an RTO, as some stacks use the CWR flag as a hint to send an immediate ACK. Without it, there is a 50/50 chance of running into a delayed ACK timeout before the receiver sends the ACK that lets cwnd grow to 2 MSS again.

Test Plan

packetdrill script: Set up an ECN-enabled session, then simulate the loss of one data segment and of the retransmission of that segment. This results in an RTO for the segment. The RTO-retransmitted segment should have the CWR flag set.

Diff Detail

Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 28586
Build 26628: arc lint + arc unit

Event Timeline

rgrimes accepted this revision. Jan 10 2020, 5:14 AM

In addition, since the RTO retransmission is not CWR-marked and cwnd is reduced to 1 MSS, ECE remains set by the receiver under RFC3168.

Thus cwnd will not grow in the first RTT, because the still-active ECE prevents it. So this always runs into the receiver's delayed ACK timeout once or twice:

Sender                                  Receiver
RTO
segment (without CWR) -->
                                        [delayed ACK timeout]
                                    <-- ACK with ECE
(cwnd not increased since the ECE is a
new congestion event to the sender)

segment (with CWR) -->
                                        [delayed ACK timeout]
                                    <-- ACK without ECE
(cwnd may grow now)
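The exchange above can be replayed as a toy simulation. This is a sketch under stated assumptions, not a model of any particular stack: the receiver is assumed to have seen CE earlier and to keep echoing ECE until a segment with CWR arrives (RFC 3168), a lone segment is assumed to be ACKed only when the delayed-ACK timer fires, and `ack_now_on_cwr` models the optional behaviour of stacks that treat CWR as an immediate-ACK hint. The function name and parameters are hypothetical.

```python
def delayed_ack_timeouts(cwr_on_rto: bool, ack_now_on_cwr: bool) -> int:
    """Count delayed-ACK timeouts until the sender sees an ACK without ECE."""
    ece_latched = True     # assumption: receiver is still echoing an earlier CE
    send_cwr = cwr_on_rto  # flag carried by the RTO retransmission
    timeouts = 0
    while True:
        # Sender transmits one segment (cwnd is 1 MSS after the RTO).
        if send_cwr:
            ece_latched = False  # RFC 3168: CWR stops the ECE echo
        # A lone segment waits for the delayed-ACK timer, unless the
        # receiver treats CWR as a hint to ACK immediately.
        if not (send_cwr and ack_now_on_cwr):
            timeouts += 1
        if not ece_latched:
            return timeouts      # ACK without ECE: cwnd may grow now
        send_cwr = True          # ACK with ECE: answer with CWR next time
```

Under these assumptions the model gives two or one timeouts when the RTO retransmission lacks CWR, and one or zero when it carries CWR, depending on whether the receiver ACKs immediately on CWR.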

  • RTO CWR is now set in RACK too

A packetdrill script to test both D23119 and D23118

// Testing ECT on SACK rexmit and CWR on RTO

--tolerance_usecs=200000

+0   `sysctl net.inet.tcp.cc.algorithm=newreno`
+0.02 `sysctl net.inet.tcp.initcwnd_segments=10`
+0.02 `sysctl net.inet.tcp.ecn.enable=1`
+0.02 `sysctl net.inet.tcp.hostcache.purgenow=1`

// Create a listening TCP socket.
+0.10 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0.01 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0.01 setsockopt(3, SOL_SOCKET, SO_SNDBUF, [1048576], 4) = 0
+0.01 bind(3, ..., ...) = 0
+0.01 listen(3, 1) = 0

// Establish a TCP connection.
+0.04 <[noecn] SEW     0:0(0) win 65535 <mss 1000, sackOK, wscale 10, eol, nop, nop >
+0.00 >[noecn] SE.     0:0(0) ack 1 win 65535 <...>
+0.00 <[noecn]   .     1:1(0) ack 1 win 65535
+0.00 accept(3, ..., ...) = 4

// Send IW plus 1 segment, check ECN bits
0.4 write(4, ..., 15000) = 15000
+0    >[ect0]    .     1:1001(1000)  ack 1 <...>
+0    >[ect0]    .  1001:2001(1000)  ack 1 <...>
+0    >[ect0]    .  2001:3001(1000)  ack 1 <...>
+0    >[ect0]    .  3001:4001(1000)  ack 1 <...>
+0    >[ect0]    .  4001:5001(1000)  ack 1 <...>
+0    >[ect0]    .  5001:6001(1000)  ack 1 <...>
+0    >[ect0]    .  6001:7001(1000)  ack 1 <...>
+0    >[ect0]    .  7001:8001(1000)  ack 1 <...>
+0    >[ect0]    .  8001:9001(1000)  ack 1 <...>
+0    >[ect0]    .  9001:10001(1000) ack 1 <...>
+0.01 <[noecn]   .     1:1(0)        ack 1 win 65535 <sack 1001:2001, eol, nop>
+0    >[ect0]    . 10001:11001(1000) ack 1 <...>
+0.01 <[noecn]   .     1:1(0)        ack 1 win 65535 <sack 1001:3001, eol, nop>
+0    >[ect0]    . 11001:12001(1000) ack 1 <...>
+0.01 <[noecn]   .     1:1(0)        ack 1 win 65535 <sack 1001:4001, eol, nop>

// the SACK retransmission should not be ECT
+0    >[noecn]  W.     1:1001(1000)  ack 1 <...>
+0.01 <[noecn]   .     1:1(0)        ack 1 win 65535 <sack 1001:5001, eol, nop>
+0.01 <[noecn]   .     1:1(0)        ack 1 win 65535 <sack 1001:6001, eol, nop>
+0.01 <[ce]     W.     1:1(0)        ack 1 win 65535 <sack 1001:7001, eol, nop>
+0.01 <[noecn]   .     1:1(0)        ack 1 win 65535 <sack 1001:8001, eol, nop>
+0.01 <[noecn]   .     1:1(0)        ack 1 win 65535 <sack 1001:9001, eol, nop>
+0    >[ect0]   E. 12001:13001(1000) ack 1 <...>
+0.01 <[noecn]  W.     1:1(0)        ack 1 win 65535 <sack 1001:10001, eol, nop>
+0    >[ect0]    . 13001:14001(1000) ack 1 <...>
+0.01 <[noecn]   .     1:1(0)        ack 1 win 65535 <sack 1001:11001, eol, nop>
+0    >[ect0]   P. 14001:15001(1000) ack 1 <...>
+0.01 <[noecn]   .     1:1(0)        ack 1 win 65535 <sack 1001:12001, eol, nop>
+0.01 <[noecn]   .     1:1(0)        ack 1 win 65535 <sack 1001:13001, eol, nop>
+0.01 <[noecn]   .     1:1(0)        ack 1 win 65535 <sack 1001:14001, eol, nop>
+0.01 <[noecn]   .     1:1(0)        ack 1 win 65535 <sack 1001:15001, eol, nop>

// the RTO retransmission should have CWR set
1.4   >[noecn]  W.     1:1001(1000)  ack 1 <...>
+0.3  <[noecn]   .     1:1(0)        ack 15001 win 65535

+0.01 close(4) = 0
+0.01 >[noecn]  F. 15001:15001(0)    ack 1 <...>
+0.10 <[noecn]  F.     1:1(0)        ack 15002 win 65535
+0.02 >[noecn]   . 15002:15002(0)    ack 2 <...>
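For readers less familiar with the notation in the script, the following sketch maps the flag letters and the bracketed IP ECN markers onto their wire values. The bit values are the standard TCP header flag bits and the RFC 3168 ECN codepoints. The helper name `flags_to_byte` is hypothetical, and the letter mapping reflects packetdrill's usual convention (`.` for ACK, `E` for ECE, `W` for CWR) rather than anything defined in this review.

```python
# TCP header flag bits, keyed by the letters used in the script.
TCP_FLAG_BITS = {
    "F": 0x01,  # FIN
    "S": 0x02,  # SYN
    "R": 0x04,  # RST
    "P": 0x08,  # PSH
    ".": 0x10,  # ACK
    "U": 0x20,  # URG
    "E": 0x40,  # ECE
    "W": 0x80,  # CWR
}

# IP header ECN field (two bits), per RFC 3168.
IP_ECN_CODEPOINTS = {
    "noecn": 0b00,  # Not-ECT
    "ect1": 0b01,   # ECT(1)
    "ect0": 0b10,   # ECT(0)
    "ce": 0b11,     # Congestion Experienced
}


def flags_to_byte(letters: str) -> int:
    """Combine flag letters such as 'W.' into the TCP flags byte."""
    value = 0
    for ch in letters:
        value |= TCP_FLAG_BITS[ch]
    return value
```

For example, the expected RTO retransmission `W.` corresponds to CWR plus ACK, and the ECN-setup SYN `SEW` to SYN plus ECE plus CWR.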

Did not find any problem in bsd11.

tuexen accepted this revision. Sat, Jan 25, 1:44 PM
This revision is now accepted and ready to land. Sat, Jan 25, 1:44 PM

MFC to stable/12, stable/11 was suggested on the bugtracker...