Page MenuHomeFreeBSD

Add Boundary and Overflow checks in Cubic formulas
Needs ReviewPublic

Authored by rscheff_gmx.at on Feb 8 2019, 2:26 PM.

Details

Summary

This Diff is split off from D18954, and deals only with a number
of integer overflow scenarios and missing boundary checks.

Test Plan

To invoke the problem case, the session must have expirienced loss (to
pull down ssthresh to a low value). Thereafter, provided there is no
additional data loss but the session runs application limited for longer
periods (sender only sends a small amount of data, waits for less than
RTO interval, and sends only a few kB every 10s of milliseconds),
the cubic-calculated cwnd may end up negative or beyond INT_MAX.

Some overflows (w_tf) are apparent during low-latency sessions immediately,
other (ticks_since_cong) may only happen after very long periods of time
(~3 weeks) and cubic_cwnd only for large K/max_cwnd values.

Once the sender application starts to provide large amounts of data,
a line-rate traffic burst is unavoidable as cwnd may have grown to many
MB or GB in size.

Diff Detail

Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 22432
Build 21589: arc lint + arc unit

Event Timeline

rscheff_gmx.at created this revision.Feb 8 2019, 2:26 PM
Harbormaster completed remote builds in B22389: Diff 53687.
rscheff_gmx.at edited the test plan for this revision. (Show Details)Feb 8 2019, 2:27 PM

While splitting these overflow checks out from the other Diff, I found that there is another typecast overflow, and potential negative overflow in cubic_cwnd.

First, the typecast for leftshifting ticks_since_cong was one bracket too far out to be effective (max ticks_since_cong to overflow after 2^23 milliseconds, or ~2:20hrs)

cwnd can easily become negative, for “large” values of K (correlates to large max_cwnd), and shortly after a cong / after-idle event, without any check for that.

Then the well known overflow on taking the cube of too large values.

Next, the multiplication by smss before the right-shift has potentially higher precision (of an rough estimation of K), but with a probable overflow after 11050 ticks (11 seconds) @ MTU9000.

Finally, a negative return value would be cast to a unsigned int later, thus again leading to an (incorrect) very large effective value for cwnd, so restricting any negative values to zero is the remedy.

sys/netinet/cc/cc_cubic.h
175

typecast here applies after the leftshift. (2^23 will become -1)

204

cwnd could still be negative here. The function returns unsigned long, which could typecast this wrong.

  • replace one division with a multiplication, with proper brackets. Taken from D8021