Page MenuHomeFreeBSD

Send FIN bit by itself
Needs ReviewPublic

Authored by rscheff on Apr 3 2021, 11:10 AM.

Details

Summary

Some clients (iOS, macOS, dragonflyBSD) will
SACK the FIN bit, if it is sent during loss
recovery while loss may be still ongoing.

One PR 254725 has been reported, where this may
have come into play. In order to avoid any
unexpected interactions due to SACKed FINs,
make sure to send the FIN only, when all
data is acknowledged in full by the client.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 38447
Build 35336: arc lint + arc unit

Event Timeline

Is it just that the peer SACKs the FIN, which is triggering this bug? Or are there more conditions involved?

This was the first working hypothesis for PR254725.

What I know so far:

  1. ECN Session with Loss (ECN and LRO may not interact nicely?)
  2. FIN was sent at the time of the panic
  3. The peer appears to have SACKed the FIN, with a hole for just the very last byte in the stream - but that is no longer present.
  4. The session is still in FastRecovery - with a Partial ACK to everything except the last byte (and snd_recover)

Alternatively, there may be a case, where snd_max is incremented twice for the FIN...

However, simply the peer acking the FIN itself doesn't do harm (it shouldn't even with the arguably degraded acceptance check).

The Panic could not be reproduced so far, by just recreating the above circumstances (triggering a loss recovery episode, shutting the session down (for the FIN to be sent), and partially ACKing all but one byte...

rscheff retitled this revision from Exclude FIN bit when included in snd_max to Send FIN bit by itself.Apr 9 2021, 7:57 AM
rscheff edited the summary of this revision. (Show Details)

How does this affect the TCP state?

How does this affect the TCP state?

The TCP state is unaffected by this change; once the (local) socket is close() or shutdown(), the ESTABLISHED state with transition as usual. Only the transmission of the FIN bit will be delayed until any outstanding loss recovery / reordering etc can been recovered from. This will in turn prevent snd_max to ("prematurely") increment to track the FIN bit - leading to other issues where snd_max is used to track recovery state (see original patch of this diff, for a number of places which would need to check if snd_max already contains the FIN bit or not).

Note that the (singular) panic in 254725 is still not fully understood - as the socket buffer did provide a few milliseconds earler the data byte for transmission, which was then gone and leading to the panic - while the TCPCB variables did NOT transition to where that byte would be allowed to be freed from the socket buffer.

I updated this diff after a comment by rrs@, that dealing with loss recovery while no longer in the ESTABLISHED state (after the FIN was sent or received) may be problematic and RACK / BBR sidestep any arising issues by not sending a FIN until regular data operations have wound down. (the state will transition, just continue to behave and follow the very same codepath as in ESTABLISHED).

sys/netinet/tcp_input.c
491 ↗(On Diff #86783)

This should be save, actually...