Page MenuHomeFreeBSD

tcp: fix a bug where unshifting should be put last in tcp_get_srtt()
ClosedPublic

Authored by cc on May 25 2023, 5:19 PM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Sep 27, 4:33 PM
Unknown Object (File)
Sun, Sep 24, 8:02 AM
Unknown Object (File)
Wed, Sep 6, 12:06 AM
Unknown Object (File)
Tue, Sep 5, 11:05 PM
Unknown Object (File)
Fri, Sep 1, 8:44 AM
Unknown Object (File)
Fri, Sep 1, 2:30 AM
Unknown Object (File)
Aug 22 2023, 9:19 AM
Unknown Object (File)
Aug 14 2023, 3:17 PM

Details

Summary

This is reported by Bhaskar Pardeshi from VMware.

Test Plan

Verified with siftr

analysis for example:
old minimal srtt_usecs = (1 >> TCP_RTT_SHIFT) * tick = (1 >> 5) * 1000 = 0
new minimal srtt_usecs = (1 * tick) >> TCP_RTT_SHIFT = (1 * (1000)) >> 5 = 31

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 51696
Build 48587: arc lint + arc unit

Event Timeline

cc requested review of this revision.May 25 2023, 5:19 PM
sys/netinet/tcp_subr.c
4665

I think you should add a comment that you use srtt = TICKS_2_USEC(tp->t_srtt) >> TCP_RTT_SHIFT instead of srtt = TICKS_2_USEC(tp->t_srtt >> TCP_RTT_SHIFT) to increase the precision. While doing this, you might want to replace the double space after = by a single one.

Add comment above this fix to explain why it can improve precision.

cc marked an inline comment as done.May 25 2023, 10:27 PM
This revision is now accepted and ready to land.May 26 2023, 7:42 AM

For what its worth...

While this does indeed fix a bug where we lose precision, it also reduces the maximum SRTT value which can be stored in ticks without risking an overflow during the TICKS_2_USEC conversion.

Previously, the value only overflowed when (srtt >> 5) exceeded (UINT32_MAX / hz). Now, the value will overflow if the unshifted srtt exceeds (UINT32_MAX / hz).

Where hz == 1000, this means we will now overflow if srtt > 4,294,967, which translates to roughly 134.217 seconds. Previously, with hz == 1000, we would overflow if srtt > 137,438,975, which translates to roughly 4,294.967 seconds.

It seems unlikely we would see an srtt of over two minutes; however, it is more likely we would see that than an srtt of over 1 hour. Also, using hz values over 1000 (does anyone do this?) will cause the code to be more susceptible to an overflow.

I'm not suggesting a change at this point. Rather, I'm making an observation in case something strange pops up in the future or someone knows of a use case where this will be a problem.

(If necessary, the fix would be to calculate the srtt using a uint64_t and truncate it to uint32_t when returning it.)