
Restrict cwnd growth on app-limited flows
ClosedPublic

Authored by rscheff on Sep 26 2019, 10:00 AM.
Tags
None

Details

Summary

As long as no congestion event has happened, cwnd can grow "unbounded".
This can happen when the sender and receiver can operate at full wire
speed at all times. In this scenario, however, cwnd is not relevant for
regulating the data flow.

The more interesting and realistic scenario is an application that
starts off transmitting at a limited bandwidth and transitions into
bulk transfer mode later in the session.

In this case, cwnd may have grown to a significant size during
slow-start (uninterrupted, since no congestion event occurs while only
little bandwidth is actually exercised). Once the application transitions
into the bulk transfer phase, the current default BSD TCP stack can
transmit a huge burst of (line-rate) data, which is very likely to
cause significant self-inflicted packet loss.

This change restricts the growth of cwnd to no more than twice the
current flightsize. This follows the guidance presented in New
Congestion Window Validation (NewCWV), RFC 7661, but only addresses one
specific corner case in which transmission bursts may happen.
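
The restriction described above can be sketched as a standalone predicate. This is only an illustration of the condition discussed later in the review, not the committed code: struct flow_state, its pipe field and cwnd_may_grow() are hypothetical names, and do_newcwv stands in for the sysctl toggle that the review asks for.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the few struct tcpcb fields used here;
 * the real kernel structure is far larger.
 */
struct flow_state {
	uint32_t snd_cwnd;	/* congestion window, in bytes */
	uint32_t snd_wnd;	/* peer's advertised window, in bytes */
	uint32_t pipe;		/* bytes currently in flight (assumed input) */
};

/*
 * Return true if cwnd may grow on this ACK.
 * With do_newcwv disabled only the receive-window check applies;
 * with it enabled, growth also stops at twice the flight size.
 */
static bool
cwnd_may_grow(const struct flow_state *fs, bool do_newcwv)
{
	if (fs->snd_cwnd > fs->snd_wnd)
		return (false);
	if (!do_newcwv)
		return (true);
	/* Widen to 64 bits so that doubling cannot overflow. */
	return ((uint64_t)fs->snd_cwnd < (uint64_t)fs->pipe * 2);
}
```

For a flow with a 100000-byte cwnd and only 2896 bytes in flight, this predicate refuses further growth when the toggle is on and allows it when the toggle is off.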

Test Plan

uperf profile to simulate an initially app-limited TCP flow, which
eventually starts to send bulk data.

iperf transfers on back-to-back links would also show the unlimited
cwnd growth (but no detrimental effects, as the peak sending rate
cannot exceed the sender's line rate anyway).
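
The effect the test plan looks for can be approximated with a toy model. The sketch below is not the uperf profile and not the kernel code; simulate_cwnd(), its parameters and the MSS value are assumptions made for illustration. It assumes one ACK per segment, slow-start growth of one MSS per ACK, no loss, no receive-window limit, and a pipe held constant at the per-RTT flight size.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSS 1448u	/* assumed segment size, in bytes */

/*
 * Toy model: run `rtts` round trips of a sender whose application
 * offers only `app_bytes` per RTT, and return the resulting cwnd.
 * When `cap` is true, growth stops once cwnd reaches twice the
 * flight size, mirroring the restriction described in the summary.
 */
static uint32_t
simulate_cwnd(uint32_t init_cwnd, uint32_t app_bytes, int rtts, bool cap)
{
	uint32_t cwnd = init_cwnd;

	for (int r = 0; r < rtts; r++) {
		/* The sender is limited by the application or by cwnd. */
		uint32_t flight = app_bytes < cwnd ? app_bytes : cwnd;
		uint32_t acks = flight / MSS;

		for (uint32_t a = 0; a < acks; a++) {
			if (cap && (uint64_t)cwnd >= (uint64_t)flight * 2)
				break;
			cwnd += MSS;	/* slow-start increase per ACK */
		}
	}
	return (cwnd);
}
```

In this model, a flow offering four segments per RTT keeps inflating cwnd for as long as it runs when the cap is off, whereas with the cap on cwnd settles at eight segments, i.e. twice the flight size.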

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 27713
Build 25913: arc lint + arc unit

Event Timeline

The uperf profile to simulate an application transitioning from app-limited into bulk transfer mode.

Evolution of cwnd - 2nd flow starts at +12sec, before this patch, using cc_cubic

Evolution of cwnd - 2nd flow starts at +12sec, after this patch, using cc_cubic

from the transport call:

o) make similar change to RACK stack also
o) add a sysctl to disable any code change related to NewCWV

  • adding new sysctl to control NewCWV related behavior, disabled by default. extend man page
cc requested changes to this revision. Oct 25 2019, 2:47 PM
cc added inline comments.
share/man/man4/tcp.4
545

Do you mean "self-inflicted"?

sys/netinet/tcp_input.c
305–308

Can save some parentheses like this:

if ((!V_tcp_do_newcwv && tp->snd_cwnd <= tp->snd_wnd) ||
    (V_tcp_do_newcwv && tp->snd_cwnd <= tp->snd_wnd &&
     tp->snd_cwnd < tcp_compute_pipe(tp) * 2))
This revision now requires changes to proceed. Oct 25 2019, 2:47 PM
  • minor typo in man
  • add sysctl variable properly
  • bump man page timestamp

Should be ready to land now.

sys/netinet/tcp_input.c
305–308

Leaving the brackets around the arithmetic evaluations to be clear about evaluation ordering, while removing other superfluous brackets here.

cc added inline comments.
sys/netinet/tcp_input.c
307

Need one more space of indent on the third line to indicate the third line is part of the second condition:

    (V_tcp_do_newcwv && (tp->snd_cwnd <= tp->snd_wnd) &&
     (tp->snd_cwnd < (tcp_compute_pipe(tp) * 2)))) <== add one more space

  • add single 3rd level indentation space to long if clause
sys/netinet/tcp_input.c
307

The FreeBSD style guide states that first-level indentation uses only tabs (8 spaces wide), and that any lengthy single-line statement is broken up with 4-space second-level indentation.

However, there is precedent for this, e.g. tcp_input.c:1635, 1835, 1853, 2371 and more (a few instances, roughly 1/3, where 2nd level indentation was done with one superfluous space).

Since IMHO this does improve readability, will add this space.

rrs requested changes to this revision. Nov 21 2019, 8:14 PM
rrs added a subscriber: rrs.

You cannot use tcp_compute_pipe() with rack. It does not use
the same variables as the default stack. Instead you must use the
ctf_flight_size() function to get what is in flight.

Please see other instances in rack for proper use of the function.

This revision now requires changes to proceed. Nov 21 2019, 8:14 PM
  • using ctf_flight_size instead of tcp_compute_pipe in rack
This revision is now accepted and ready to land. Nov 22 2019, 11:29 AM
tuexen added inline comments.
share/man/man4/tcp.4
549

The documentation of rfc6675_pipe is not related to this change. I committed it separately in r355268.