
Send a final ACK to local connections
AbandonedPublic

Authored by jtl on Nov 2 2017, 4:15 PM.
Tags
None

Details

Reviewers
gnn
rrs
Group Reviewers
transport
Summary

When local TCP connections transition to TIME_WAIT, we (by default) block those connections from actually entering the TIME_WAIT state. (This optimization is controlled by a sysctl.) This optimization makes sense as there is little-to-no chance that local packets are lost or reordered. However, this optimization relies on one critical element: that both ends of the connection will actually shut down correctly.

This assumption is violated when the optimization is enabled. When the optimization is enabled and a local TCP session tries to transition from FINWAIT-2 to TIME_WAIT, the kernel doesn't actually send an ACK for the final FIN before closing the TCP connection. As a result, the other side must retransmit its FIN. As long as the blackhole option is not enabled, the kernel then responds with a RST because it no longer has a matching session. In the worst case, if the blackhole option is enabled, the remote side will need to retransmit its FIN many times before finally timing out the session.
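The three outcomes described above can be sketched as a small userland model in C. This is an illustrative toy, not kernel code: the names `struct close_outcome` and `simulate_peer_close`, and the `max_fin_tx` retransmission limit, are assumptions made for this example and do not correspond to FreeBSD identifiers or tunables.

```c
#include <assert.h>
#include <stdbool.h>

/* Result of the peer's attempt to finish closing (toy model only). */
struct close_outcome {
    int  fin_transmissions; /* FINs the peer sent, including the first */
    bool clean;             /* true if the peer's FIN was ACKed */
    bool timed_out;         /* true if the peer exhausted its retransmits */
};

/*
 * Model the peer that sends the final FIN to a local connection sitting
 * in FIN_WAIT_2.
 *
 * sends_final_ack: whether the closing side ACKs the FIN before it
 *                  discards the session (the behavior proposed here).
 * blackhole:       whether segments with no matching session are
 *                  silently dropped instead of answered with a RST.
 * max_fin_tx:      hypothetical cap on FIN transmissions before the
 *                  peer gives up; not a real kernel constant.
 */
struct close_outcome
simulate_peer_close(bool sends_final_ack, bool blackhole, int max_fin_tx)
{
    struct close_outcome out = { 0, false, false };

    assert(max_fin_tx >= 1);

    /* First FIN: the session still exists when it arrives. */
    out.fin_transmissions = 1;
    if (sends_final_ack) {
        out.clean = true;
        return out;
    }

    /*
     * No ACK was sent and the session is gone, so every retransmitted
     * FIN finds no matching session.
     */
    while (out.fin_transmissions < max_fin_tx) {
        out.fin_transmissions++;
        if (!blackhole)
            return out; /* a RST comes back and ends the exchange */
    }

    /* Nothing ever came back: the peer times the session out. */
    out.timed_out = true;
    return out;
}
```

With a hypothetical limit of 13 transmissions, the model ends after one FIN when the final ACK is sent, after two when it is not sent and blackhole is off (the retransmit draws a RST), and only after all 13 when blackhole is on.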

The solution is fairly simple: actually send the final ACK before we close the session.

As a side effect, this solution also (partially) fixes something noted in a comment in tcp_timewait.c: the timewait ACK should include timestamps. With this change, the first ACK will include timestamps. (Subsequent ones still will not.)

Test Plan

I discovered this using the black box tracing. Using that same tracing, I have verified that the fix does indeed result in us sending the inf

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 12402
Build 12678: arc lint + arc unit

Event Timeline

My fault, thanks for fixing.

I don't like that we now allocate memory to support the optimization. Can we try to change this function so that, in the V_nolocaltimewait case, we create a 'struct tcptw' on the stack, fill it in, and then call tcp_twrespond()? Then the non-optimized case does all the referencing maneuvers, and the optimized case just calls tcp_close(). I can help with that.
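A rough userland sketch of the shape being suggested here, assuming toy stand-ins for the kernel types: `struct toy_tw`, `struct toy_conn`, `toy_twrespond()` and `toy_enter_timewait()` are hypothetical names invented for illustration. They are not the real `struct tcptw`, `tcp_twrespond()` or `tcp_close()`, and the fields shown are not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-in for compressed TIME_WAIT state (NOT struct tcptw). */
struct toy_tw {
    uint32_t snd_nxt;
    uint32_t rcv_nxt;
};

/* Toy stand-in for a connection (NOT a real tcpcb or inpcb). */
struct toy_conn {
    uint32_t snd_nxt;
    uint32_t rcv_nxt;
    bool     is_local;
    bool     fully_closed;  /* set when no TIME_WAIT entry is kept */
    int      acks_sent;     /* how many final ACKs were "sent" */
    uint32_t last_ack;      /* ack number carried by the last ACK */
    struct toy_tw *tw;      /* heap entry, kept only in the normal case */
};

/* Pretend to send the ACK described by tw; only records that it happened. */
static void
toy_twrespond(struct toy_conn *c, const struct toy_tw *tw)
{
    c->acks_sent++;
    c->last_ack = tw->rcv_nxt;
}

/* Returns 0 on success, -1 if the heap allocation fails. */
int
toy_enter_timewait(struct toy_conn *c, bool nolocaltimewait)
{
    assert(c != NULL && !c->fully_closed);

    if (nolocaltimewait && c->is_local) {
        /* Optimized case: build the state on the stack, ACK, close. */
        struct toy_tw tw = { .snd_nxt = c->snd_nxt, .rcv_nxt = c->rcv_nxt };

        toy_twrespond(c, &tw);
        c->fully_closed = true;
        return 0;
    }

    /* Normal case: allocate an entry that outlives this call. */
    struct toy_tw *tw = malloc(sizeof(*tw));
    if (tw == NULL)
        return -1;
    tw->snd_nxt = c->snd_nxt;
    tw->rcv_nxt = c->rcv_nxt;
    toy_twrespond(c, tw);
    c->tw = tw;
    return 0;
}
```

In this sketch the optimized path never touches the allocator, so it cannot fail for lack of memory, while the normal path keeps a heap entry that the caller must eventually free.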