This was reported by Bhaskar Pardeshi from VMware.
- rGf5abdb03119a: tcp: fix a bug where unshifting should be put last in tcp_get_srtt()
Verified with siftr analysis. For example:
old minimal srtt_usecs = (1 >> TCP_RTT_SHIFT) * tick = (1 >> 5) * 1000 = 0
new minimal srtt_usecs = (1 * tick) >> TCP_RTT_SHIFT = (1 * (1000)) >> 5 = 31
I think you should add a comment explaining that you use srtt = TICKS_2_USEC(tp->t_srtt) >> TCP_RTT_SHIFT instead of srtt = TICKS_2_USEC(tp->t_srtt >> TCP_RTT_SHIFT) to increase the precision. While doing this, you might want to replace the double space after = with a single one.
For what it's worth...
While this does indeed fix a bug where we lose precision, it also reduces the maximum SRTT value which can be stored in ticks without risking an overflow during the TICKS_2_USEC conversion.
Previously, the value only overflowed when (srtt >> 5) exceeded (UINT32_MAX / hz). Now, the value will overflow if the unshifted srtt exceeds (UINT32_MAX / hz).
Where hz == 1000, this means we will now overflow if srtt > 4,294,967, which translates to roughly 134.217 seconds. Previously, with hz == 1000, we would overflow if srtt > 137,438,975, which translates to roughly 4,294.967 seconds.
It seems unlikely we would see an srtt of over two minutes; however, it is more likely we would see that than an srtt of over 1 hour. Also, using hz values over 1000 (does anyone do this?) will make the code more susceptible to overflow.
I'm not suggesting a change at this point. Rather, I'm making an observation in case something strange pops up in the future or someone knows of a use case where this will be a problem.
(If necessary, the fix would be to calculate the srtt using a uint64_t and truncate it to uint32_t when returning it.)