Fixing RTO timer during SACK loss recovery
Needs Review · Public

Authored by rscheff_gmx.at on Tue, Jan 14, 12:00 AM.

Details

Summary

Since its inception, SACK loss recovery has cleared the RTO timer on a
partial ACK (where the right edge of contiguous data moves forward). Thus,
if any SACK retransmission other than the first retransmitted segment is lost,
the session is at the mercy of some other, longer-running timer to make
forward progress. The very first retransmission is typically the one most
likely to encounter a still-full queue, but that retransmission is covered by
the RTO timer still running from the in-sequence phase at that point.

It is counter-intuitive to disable the RTO timer completely once the very
first of possibly many dozens of packets that need to be retransmitted makes
it to the receiver. It is more logical to restart the RTO timer whenever
additional in-sequence data is ACKed.
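
To make the difference concrete, here is a minimal sketch of the two behaviours on a partial ACK. It is a toy model, not the actual FreeBSD code: `struct rto_model`, `model_ack()` and the policy names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified sender state; this is NOT the FreeBSD tcpcb. */
struct rto_model {
	uint32_t snd_una;    /* oldest unacknowledged sequence number */
	uint32_t snd_max;    /* highest sequence number sent so far */
	bool     rto_armed;  /* is the retransmission timer running? */
	uint32_t rto_expiry; /* tick at which the timer fires, if armed */
};

enum partial_ack_policy {
	POLICY_CLEAR_TIMER,  /* behaviour described in the summary as current */
	POLICY_RESTART_TIMER /* behaviour proposed in this review */
};

/* Process a cumulative ACK that arrives at tick `now` during loss recovery. */
static void
model_ack(struct rto_model *m, uint32_t ack, uint32_t now, uint32_t rto,
    enum partial_ack_policy policy)
{
	/* Wrap-safe check: ignore ACKs that do not advance snd_una. */
	if ((int32_t)(ack - m->snd_una) <= 0)
		return;
	/* The model does not handle ACKs for data that was never sent. */
	assert((int32_t)(m->snd_max - ack) >= 0);

	m->snd_una = ack;

	if (ack == m->snd_max) {
		/* Everything outstanding is ACKed: no timer is needed. */
		m->rto_armed = false;
		return;
	}

	/* Partial ACK: data is still outstanding. */
	if (policy == POLICY_RESTART_TIMER) {
		m->rto_armed = true;
		m->rto_expiry = now + rto;
	} else {
		/* A lost later retransmission is now not covered by the RTO. */
		m->rto_armed = false;
	}
}
```

With `snd_una = 1000` and `snd_max = 5000`, a partial ACK for 2000 at tick 250 with an RTO of 200 ticks leaves the timer disarmed under `POLICY_CLEAR_TIMER`, but re-armed to fire at tick 450 under `POLICY_RESTART_TIMER`.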

Short of lost-retransmission detection, this helps loss recovery complete in a
more timely manner when SACK retransmissions still encounter a congested
queue.

Diff Detail

Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 28647
Build 26674: arc lint + arc unit

Event Timeline

Just found that RFC 2582, in the last paragraph of section 4, mentions the "slow-but-steady" reset (re-arm) of the RTO after each partial ACK in the context of SACK:

one possibility for a more optimal
algorithm might be one that recovered more quickly from multiple
packet drops, and combined this with the Slow-but-Steady variant in
terms of resetting the retransmit timers.  We note, however, that
there is a limitation to the potential performance in this case in
the absence of the SACK option.
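
For reference, section 4 of RFC 2582 contrasts this with the "Impatient" variant, which resets the retransmit timer only after the first partial ACK. The sketch below is a rough illustration of how the two variants move the timer's expiry. The function and enum names are hypothetical, ticks are abstract time units, and sequence/tick wrap-around is deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum reset_variant {
	VARIANT_IMPATIENT,      /* reset the timer on the first partial ACK only */
	VARIANT_SLOW_BUT_STEADY /* reset the timer on every partial ACK */
};

/*
 * Return the tick at which the retransmit timer would fire, assuming it was
 * started at tick `start` with a fixed timeout `rto`, and that partial ACKs
 * arrive at the (ascending) ticks in `ack_ticks`. ACKs that would arrive at
 * or after the expiry are ignored, because the timer has already fired.
 */
static uint32_t
rto_expiry_tick(enum reset_variant v, uint32_t start, uint32_t rto,
    const uint32_t *ack_ticks, size_t n)
{
	uint32_t expiry = start + rto;

	for (size_t i = 0; i < n; i++) {
		/* The model requires arrival ticks in ascending order. */
		assert(i == 0 || ack_ticks[i] >= ack_ticks[i - 1]);
		if (ack_ticks[i] >= expiry)
			break;
		if (v == VARIANT_SLOW_BUT_STEADY || i == 0)
			expiry = ack_ticks[i] + rto;
	}
	return expiry;
}
```

For partial ACKs at ticks 100, 200 and 300 with a 250-tick timeout started at tick 0, the Impatient variant fires at tick 350, while Slow-but-Steady keeps pushing the expiry out to tick 550.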

Note: this was found in FreeBSD 11.

Head already seems to send out one rescue retransmission (from before RFC 6675 support was explicitly implemented?), which restarts the RTO timer.

Maybe MFC this?