Initial retransmit timeout improperly set
Closed, Public

Authored by rrs on May 8 2021, 8:42 AM.

Details

Summary

In some cases rack ends up with an incorrect initial retransmit timeout.
The test case is a long-RTT path where the server sends the
initial message after the 3-way handshake. Srtt and rttvar
end up with the correct values, but tp->t_rxtcur does not; it is usually
considerably smaller. This causes all kinds of trouble, resulting in two
TLPs and finally an RXT that knocks the cwnd down to 1 MSS. The consequence
is that the connection crawls.

What should happen is that we call the proper t_rxtcur set macro
after srtt and rttvar have been set up properly.
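For reference, the dependency of the retransmit timer on the smoothed estimates can be sketched as below. This is only an illustration of the standard RFC 6298 calculation (RTO = SRTT + 4 * RTTVAR, clamped to a range), assuming all values are in microseconds. The function name and the bounds are hypothetical; this is not the macro used in rack.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the rack.c macro: derive a retransmit
 * timeout from srtt and rttvar (all in microseconds) using the
 * RFC 6298 formula RTO = SRTT + 4 * RTTVAR, then clamp the result
 * to [rto_min, rto_max].
 */
static uint32_t
sketch_rto_usec(uint32_t srtt, uint32_t rttvar, uint32_t rto_min,
    uint32_t rto_max)
{
	/* Compute in 64 bits so large inputs cannot wrap. */
	uint64_t rto = (uint64_t)srtt + ((uint64_t)rttvar << 2);

	if (rto < rto_min)
		rto = rto_min;
	if (rto > rto_max)
		rto = rto_max;
	return ((uint32_t)rto);
}
```

With a first sample of 190 ms (the handshake RTT in the packetdrill script of the test plan) and rttvar taken as half of that per RFC 6298, `sketch_rto_usec(190000, 95000, 30000, 60000000)` yields 570000 microseconds. A timer left at a much smaller value than this fires before the ACK can return on a long path, which matches the spurious TLP and RXT behavior described above.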

Test Plan

Run the particular sendfile tests where the server
sends first over a long RTT path.

This packetdrill script can be used to see the bogus values that end up in
the hostcache after it runs and rack updates the hc.

With the fix, the hc values should look normal.

--ip_version=ipv4

+0.00 sysctl -w net.inet.tcp.syncookies_only=0
+0.00 sysctl -w net.inet.tcp.syncookies=1
+0.00 sysctl -w net.inet.tcp.rfc1323=1
+0.00 sysctl -w net.inet.tcp.sack.enable=1
+0.00 sysctl -w net.inet.tcp.ecn.enable=2
// Create a TCP endpoint in the ESTABLISHED state.
+0.00 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0.00 fcntl(3, F_GETFL) = 0x02 (flags O_RDWR)
+0.00 bind(3, ..., ...) = 0
+0.00 listen(3, 1) = 0
+0.000 < S 0:0(0) win 1500 <mss 1460, sackOK, eol, eol>
+0.000 > S. 0:0(0) ack 1 win 65535 <...>
+0.190 < . 1:1(0) ack 1 win 40000
+0.000 accept(3, ..., ...) = 4
+0.00 setsockopt(4, IPPROTO_TCP, TCP_LOG, [4], 4) = 0
+0.00 setsockopt(4, IPPROTO_TCP, TCP_FUNCTION_BLK, {function_set_name="rack_latest", pcbcnt=0}, 36) = 0
+0.000 write(4, ..., 5) = 5
+0.00 > P. 1:6(5) ack 1 win 65535
+0.200 < . 1:1(0) ack 6 win 40001
+0.100 write(4, ..., 1448) = 1448
+0.00 > P. 6:1454 (1448) ack 1 win 65535
+0.200 < . 1:1(0) ack 1454 win 40001
+0.100 write(4, ..., 1448) = 1448
+0.00 > P. 1454:2902 (1448) ack 1 win 65535
+0.200 < . 1:1(0) ack 2902 win 40001
+0.100 write(4, ..., 1448) = 1448
+0.00 > P. 2902:4350 (1448) ack 1 win 65535
+0.200 < . 1:1(0) ack 4350 win 40001
// Tear it down.
+2.100 close(4) = 0
+0.00 > F. 4350:4350 (0) ack 1 win 65535
+0.200 < F. 1:1(0) ack 4351 win 40002
+0.00 > . 4351:4351 (0) ack 2 win 65535

Diff Detail

Repository
rG FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

rrs requested review of this revision. May 8 2021, 8:42 AM

It turns out the problem goes far deeper. There are at least
a couple of interactions here.

  1. Rack keeps its srtt/rttvar in microseconds (no longer the 5-bit fractional stuff). When we destroy a tcb, the fini() function needs to be called *before* we update the host cache.
  2. The way the hostcache update was being called, it could run multiple times for the same TCB, which is not good.
  3. When rack initializes, it needs to do its own query of the hostcache and then properly translate the information into its own representation.
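The second point in the list above, updating the hostcache at most once per TCB, can be illustrated with a simple guard flag. This is a minimal sketch under stated assumptions: the structures, field names, and function below are hypothetical stand-ins and do not reflect the actual tcpcb layout or the hostcache API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-connection state. */
struct sketch_tcb {
	uint32_t srtt_usec;	/* smoothed RTT in microseconds */
	bool hc_updated;	/* set once the cache has been written */
};

/* Hypothetical stand-in for a hostcache entry. */
struct sketch_hc {
	uint32_t srtt_usec;
	unsigned int updates;	/* number of writes, for illustration */
};

/*
 * Write the connection's metrics into the cache entry, but only the
 * first time this is called for a given connection.
 */
static void
sketch_hc_update_once(struct sketch_tcb *tp, struct sketch_hc *hc)
{
	if (tp->hc_updated)
		return;
	hc->srtt_usec = tp->srtt_usec;
	hc->updates++;
	tp->hc_updated = true;
}
```

Calling `sketch_hc_update_once()` a second time for the same connection leaves the cache entry untouched, so a later teardown path cannot overwrite good values with stale or already-converted ones.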
sys/netinet/tcp_stacks/rack.c
6577

You are definitely missing a tcp_hc_get() call here.

General question: Can't you call cc_conn_init() here first and then do the conversion to the RACK internal format? This would reduce the code duplication...

sys/netinet/tcp_stacks/rack.c
6577

Oops, yes I am :)

Address Michael's comments and reduce the code further (create a conversion
function that all places use to get srtt/rttvar in rack format).
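A conversion of this kind can be sketched as follows. It is an illustration only, assuming the legacy format stores the value in ticks with a number of fractional bits (5 for srtt, as mentioned earlier in this review; the 4 used for rttvar in the usage note is an assumption of this sketch) and that the tick rate is passed in explicitly. The function name is hypothetical and is not the one added to rack.c.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a value held as ticks with 'shift'
 * fractional bits into microseconds, given 'hz' ticks per second.
 *
 *   usec = scaled * 1000000 / (hz * 2^shift)
 *
 * The arithmetic is done in 64 bits so the multiplication cannot
 * overflow for any 32-bit input.
 */
static uint32_t
sketch_scaled_ticks_to_usec(uint32_t scaled, unsigned int shift, uint32_t hz)
{
	uint64_t usec;

	usec = ((uint64_t)scaled * 1000000u) / ((uint64_t)hz << shift);
	return ((uint32_t)usec);
}
```

For example, with hz = 1000 an srtt of 190 ticks is stored as `190 << 5`, and `sketch_scaled_ticks_to_usec(190u << 5, 5, 1000)` gives 190000 microseconds. An rttvar of 95 ticks stored as `95 << 4` converts to 95000 microseconds with a shift of 4. Funneling every caller through one such helper keeps the fractional-bit handling in a single place.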

This revision is now accepted and ready to land. May 10 2021, 3:21 PM