This Diff is split off from D18954, and deals only with a number
of integer overflow scenarios and missing boundary checks.
Details
- Reviewers
lstewart cc rgrimes tuexen - Group Reviewers
transport - Commits
- rS362858: MFC r354774:
rS354774: Add boundary and overflow checks to the formulas used in the TCP CUBIC
To invoke the problem case, the session must have expirienced loss (to
pull down ssthresh to a low value). Thereafter, provided there is no
additional data loss but the session runs application limited for longer
periods (sender only sends a small amount of data, waits for less than
RTO interval, and sends only a few kB every 10s of milliseconds),
the cubic-calculated cwnd may end up negative or beyond INT_MAX.
Some overflows (w_tf) are apparent during low-latency sessions immediately,
other (ticks_since_cong) may only happen after very long periods of time
(~3 weeks) and cubic_cwnd only for large K/max_cwnd values.
Once the sender application starts to provide large amounts of data,
a line-rate traffic burst is unavoidable as cwnd may have grown to many
MB or GB in size.
Diff Detail
- Lint
Lint Passed - Unit
No Test Coverage - Build Status
Buildable 22432 Build 21589: arc lint + arc unit
Event Timeline
While splitting these overflow checks out from the other Diff, I found that there is another typecast overflow, and potential negative overflow in cubic_cwnd.
First, the typecast for leftshifting ticks_since_cong was one bracket too far out to be effective (max ticks_since_cong to overflow after 2^23 milliseconds, or ~2:20hrs)
cwnd can easily become negative, for “large” values of K (correlates to large max_cwnd), and shortly after a cong / after-idle event, without any check for that.
Then the well known overflow on taking the cube of too large values.
Next, the multiplication by smss before the right-shift has potentially higher precision (of an rough estimation of K), but with a probable overflow after 11050 ticks (11 seconds) @ MTU9000.
Finally, a negative return value would be cast to a unsigned int later, thus again leading to an (incorrect) very large effective value for cwnd, so restricting any negative values to zero is the remedy.
sys/netinet/cc/cc_cubic.h | ||
---|---|---|
175 | typecast here applies after the leftshift. (2^23 will become -1) | |
204 | cwnd could still be negative here. The function returns unsigned long, which could typecast this wrong. |
I prefer D18982 to be submitted. If D18982 is submitted, we don't need this patch as D18982 covers everything here.
I learned that the SCE project group is looking into qualification of TCP Cubic; As R. Grimes is part of that group, adding him as a reviewer, to validate the functional changes to Cubic are sound.
I usually do the comparison separate from formula when they already exist, but either is ok with me.