Page MenuHomeFreeBSD

Add Boundary and Overflow checks in Cubic formulas
ClosedPublic

Authored by rscheff_gmx.at on Feb 8 2019, 2:26 PM.

Details

Summary

This Diff is split off from D18954, and deals only with a number
of integer overflow scenarios and missing boundary checks.

Test Plan

To invoke the problem case, the session must have expirienced loss (to
pull down ssthresh to a low value). Thereafter, provided there is no
additional data loss but the session runs application limited for longer
periods (sender only sends a small amount of data, waits for less than
RTO interval, and sends only a few kB every 10s of milliseconds),
the cubic-calculated cwnd may end up negative or beyond INT_MAX.

Some overflows (w_tf) are apparent during low-latency sessions immediately,
other (ticks_since_cong) may only happen after very long periods of time
(~3 weeks) and cubic_cwnd only for large K/max_cwnd values.

Once the sender application starts to provide large amounts of data,
a line-rate traffic burst is unavoidable as cwnd may have grown to many
MB or GB in size.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

rscheff_gmx.at created this revision.Feb 8 2019, 2:26 PM
Harbormaster completed remote builds in B22389: Diff 53687.
rscheff_gmx.at edited the test plan for this revision. (Show Details)Feb 8 2019, 2:27 PM

While splitting these overflow checks out from the other Diff, I found that there is another typecast overflow, and potential negative overflow in cubic_cwnd.

First, the typecast for leftshifting ticks_since_cong was one bracket too far out to be effective (max ticks_since_cong to overflow after 2^23 milliseconds, or ~2:20hrs)

cwnd can easily become negative, for “large” values of K (correlates to large max_cwnd), and shortly after a cong / after-idle event, without any check for that.

Then the well known overflow on taking the cube of too large values.

Next, the multiplication by smss before the right-shift has potentially higher precision (of an rough estimation of K), but with a probable overflow after 11050 ticks (11 seconds) @ MTU9000.

Finally, a negative return value would be cast to a unsigned int later, thus again leading to an (incorrect) very large effective value for cwnd, so restricting any negative values to zero is the remedy.

sys/netinet/cc/cc_cubic.h
175 ↗(On Diff #53687)

typecast here applies after the leftshift. (2^23 will become -1)

204 ↗(On Diff #53687)

cwnd could still be negative here. The function returns unsigned long, which could typecast this wrong.

  • replace one division with a multiplication, with proper brackets. Taken from D8021

I prefer D18982 to be submitted. If D18982 is submitted, we don't need this patch as D18982 covers everything here.

  • put magic number as #define

I learned that the SCE project group is looking into qualification of TCP Cubic; As R. Grimes is part of that group, adding him as a reviewer, to validate the functional changes to Cubic are sound.

rgrimes accepted this revision.Sat, Nov 16, 4:14 AM

I usually do the comparison separate from formula when they already exist, but either is ok with me.

This revision is now accepted and ready to land.Sat, Nov 16, 4:14 AM