
Phase 2 to add Proportional Rate Reduction (RFC6937) to FreeBSD
Needs Review · Public

Authored by rscheff on Jan 18 2019, 4:45 PM.



Using Aris Angelogiannopoulos' patch from 2013 as a base, and
leveraging the improved, single-pass accounting in the
SACK scoreboard (D18624). Also switched from sack_newdata
to snd_recover (D18811).

May need to revisit the pipe calculations throughout when
adding the RFC 6675 pipe calculation as the default.


Event Timeline

rscheff created this revision. Jan 18 2019, 4:45 PM

Are you aware of the RACK stack in /usr/src/sys/netinet/tcp_stacks/? It probably does everything you want


Yes, I am.

However, adding gradual improvements to the well-known packet-counting base stack is less disruptive here than switching to an entirely different stack with possibly significantly different behaviour that relies on high-speed timers. At least for us, validating the single-session throughput of RACK will be a major obstacle (the contributors tested RACK to scale to 10 Gbps while servicing a few hundred or thousand sessions, not a single session).

While the jury is still out on which stack delivers better single-stream performance at very high bandwidth and very low latency (one shallow-buffered ToR switch), why not further improve the base stack and make it less prone to self-inflicted packet loss (see the rationale for Rate-Halving and PRR at the IETF)? Packet-counting, ACK-clocked stacks may well deliver better throughput in these environments.

PRR addresses two major issues:
a) Self-inflicted packet losses during loss recovery, which can easily lead to an RTO. The reason is that for the first half of the window, RFC 3517 will not send any segments, and for the second half of the window, provided sufficient ACKs (one per delivered segment) are returned, it continues to transmit at the original rate, likely before the overloaded buffer has drained enough.
b) If insufficient ACKs are returned, e.g. due to multiple packet losses, ACK thinning by the network, or (what triggered my work here) a buggy GRO/TCP receiver interaction that returns only one ACK per superframe while the receiver holds discontiguous (out-of-order) data, RFC 3517 will shrink the final cwnd more than necessary, in the worst case down to 1 MSS. This happens when more than 50% of the ACKs during the loss episode don't make it back to the sender (or, in my case, only about every 3rd data segment gets ACKed while out of order).
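To make the mechanism concrete, here is a stand-alone sketch of the per-ACK sndcnt computation from RFC 6937 (the proportional part plus the Slow Start Reduction Bound). The struct and function names are illustrative only; the actual patch integrates with the FreeBSD tcpcb and the D18624 scoreboard accounting rather than using these names:

```c
#include <assert.h>

/* Illustrative constant; the kernel uses the per-connection t_maxseg. */
#define MSS 1460

struct prr_state {
	long prr_delivered;	/* bytes delivered (cum. ACKed + SACKed) */
	long prr_out;		/* bytes sent during this recovery */
	long recover_fs;	/* FlightSize at the start of recovery */
	long ssthresh;		/* target cwnd after the reduction */
};

/* Bytes PRR allows to be sent in response to this ACK (RFC 6937). */
static long
prr_sndcnt(struct prr_state *p, long delivered_data, long pipe)
{
	long sndcnt, limit;

	p->prr_delivered += delivered_data;
	if (pipe > p->ssthresh) {
		/* Proportional reduction: pace transmissions so cwnd
		 * converges on ssthresh over the recovery episode,
		 * instead of stalling for half a window as RFC 3517 does.
		 * sndcnt = CEIL(prr_delivered * ssthresh / RecoverFS)
		 *          - prr_out */
		sndcnt = (p->prr_delivered * p->ssthresh +
		    p->recover_fs - 1) / p->recover_fs - p->prr_out;
	} else {
		/* Slow Start Reduction Bound: when ACKs are thinned or
		 * lost, grow back toward ssthresh, but no faster than
		 * slow start, avoiding the collapse toward 1 MSS. */
		limit = p->prr_delivered - p->prr_out;
		if (limit < delivered_data)
			limit = delivered_data;
		limit += MSS;
		sndcnt = (p->ssthresh - pipe < limit) ?
		    p->ssthresh - pipe : limit;
	}
	return (sndcnt > 0 ? sndcnt : 0);
}
```

For example, with a 10-segment flight reduced to ssthresh of 5 segments, the proportional branch releases roughly one new segment for every two segments delivered, so the sending rate ramps down smoothly instead of pausing and then bursting.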

I worked with rrs to suss out some bugs in this use case when it was new at job-1. I recently tested it at 40 Gbps single-flow on my own pedestrian hardware (a 1st-gen Xeon E3, ixl on one side, Chelsio T5 on the other), and the CPU usage didn't stand out to me. Yes, there will be more timer soft interrupts, but there are actually a lot of branching and data-movement optimizations in the RACK stack that also help this single-flow case. I'm not necessarily saying don't improve the default stack, just that it would probably be a good thing for your customers if you had the totality of the RACK stack improvements in there.

This is a visual code-read-only review; I have not compiled or tested the code.


Mixing in whitespace fixes creates more lines of code to review. Whitespace fixes should just be committed individually, and ASAP.


The duplicated code (copied from the block below into the block above) should be reducible by using a compound if expression, which would then also match the comment above more closely:

if (V_tcp_do_prf || tp->t_flags & TF_SACK_PERMIT) {
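To illustrate the suggestion, here is a minimal sketch of that consolidation. The flag names mirror the ones in the patch, but the branch bodies and function names are hypothetical stand-ins, not the actual tcp_input code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's V_tcp_do_prr sysctl and the
 * TF_SACK_PERMIT connection flag. */
#define TF_SACK_PERMIT	0x02

static bool V_tcp_do_prr;

/* Before: the same recovery block duplicated across two branches. */
static int
enter_recovery_duplicated(int t_flags)
{
	if (V_tcp_do_prr)
		return (1);	/* recovery block (copy #1) */
	if (t_flags & TF_SACK_PERMIT)
		return (1);	/* identical recovery block (copy #2) */
	return (0);		/* legacy path */
}

/* After: one compound condition, as suggested in the review, so the
 * recovery block exists exactly once. */
static int
enter_recovery_merged(int t_flags)
{
	if (V_tcp_do_prr || (t_flags & TF_SACK_PERMIT))
		return (1);	/* single shared recovery block */
	return (0);
}
```

Since `&` binds tighter than `||` in C, the compound condition needs no extra parentheses to be correct, though keeping them aids readability per style(9).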

rscheff updated this revision to Diff 65298. Dec 6 2019, 5:56 AM
  • minimal diff for prr only
rscheff marked 2 inline comments as done. Dec 6 2019, 5:59 AM

Note that the above diff will *not* compile by itself, as it depends on D18624. It should therefore only be applied after the necessary changes to improve the SACK scoreboard accounting and supporting code there.

This patch needs additional unit testing to verify that it does what it claims to do and has no detrimental side effects.