Page MenuHomeFreeBSD

Restrict cwnd growth on app-limited flows
ClosedPublic

Authored by rscheff_gmx.at on Sep 26 2019, 10:00 AM.

Details

Summary

As long as no congestion event has happened, CWnd can grow "unbounded".
This can happen when the sender and receiver can operate at full wire
speed at all times. In this scenario, cwnd is not relevant to regulate
the data flow however.

The more interesting and realistic scenario is when an application
starts off transmitting at a limited bandwidth at first, to transition
into bulk transfer mode later during the evolution of the session.

In this case, cwnd may have grown to a significant size during
slow-start (uninterrupted due to no congestion event while only little
bandwidth was actually excercised). Once the application transistions
into the bulk transfer phase, a huge burst of (line-rate) data can
be transmitted by the current default BSD TCP stack, very likely to
run into self-inflicted significant packet loss.

This change restricts the growth of cwnd to no more than twice of the
current flightsize. This is along the guidance presented in New
Congestion Window Validation (NewCWV) RFC7661, but only addresses one
specific corner case, where transmission bursts may happen.

Test Plan

uperf profile to simulate an initially app-limited tcp flow, which
eventually starts to send bulk data.

iperf transfers on back-to-back links would also show the unlimited
cwnd growth (but no detrimental effects, as the peak sending rate
can not exceed the senders line rate anyway).

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

The uperf profile to simulate an application transitioning from app-limited into bulk transfer mode.

Evolution of cwnd - 2nd flow starts at +12sec before this patch, using cc_cubic

Evolution of cwnd - 2nd flow starts at +12sec, after this pacht, using cc_cubic

from the transport call:

o) make similar change to RACK stack also
o) add a sysctl to disable any code change related NewCVW

  • adding new sysctl to control NewCWV related behavior, disabled by default. extend man page
chengc_netapp.com requested changes to this revision.Oct 25 2019, 2:47 PM
chengc_netapp.com added inline comments.
share/man/man4/tcp.4
545 ↗(On Diff #63249)

Do you mean "self-inflicted"?

sys/netinet/tcp_input.c
305–307 ↗(On Diff #63249)

Can save some parentheses like this:

if ((!V_tcp_do_newcwv && tp->snd_cwnd <= tp->snd_wnd) ||
    (V_tcp_do_newcwv && tp->snd_cwnd <= tp->snd_wnd &&
     tp->snd_cwnd < tcp_compute_pipe(tp) * 2))
This revision now requires changes to proceed.Oct 25 2019, 2:47 PM
  • minor typo in man
  • add sysctl variable properly
  • bump man page timestamp

Should be ready to land now.

sys/netinet/tcp_input.c
305–307 ↗(On Diff #63249)

leaving the brackets aroung the arithmetic evaluations to be clear about evaluation ordering, while removing other superfluous brackets here.

chengc_netapp.com added inline comments.
sys/netinet/tcp_input.c
307 ↗(On Diff #64422)

Need one more space of indent on the third line to indicate the third line is part of the second condition:

(V_tcp_do_newcwv && (tp->snd_cwnd <= tp->snd_wnd) &&
(tp->snd_cwnd < (tcp_compute_pipe(tp) * 2)))) <== add one more space

  • add single 3rd level indentation space to long if clause
sys/netinet/tcp_input.c
307 ↗(On Diff #64422)

FBSD style guide states, to use only tab (8 space) first level indentation, and to break up any lengthy single line statement with 4 space 2nd level indentation.

However, there is precedent to this, e.g. tcp_input.c:1635, 1835, 1853, 2371 and more (and couple ~1/3 few instances, where 2nd level indentation was done with one superfluous space).

Since IMHO this does improve readability, will add this space.

rrs requested changes to this revision.Thu, Nov 21, 8:14 PM
rrs added a subscriber: rrs.

You cannot use tcp_compute_pipe() with rack. It does not use
the same variables as the default stack. Instead you must use the
ctf_flight_size() function to get whats in flight.

Please see other instances in rack for proper use of the function.

This revision now requires changes to proceed.Thu, Nov 21, 8:14 PM
  • using ctf_flight_size instead of tcp_compute_pipe in rack
rrs accepted this revision.Fri, Nov 22, 11:29 AM
This revision is now accepted and ready to land.Fri, Nov 22, 11:29 AM
tuexen accepted this revision.Sun, Dec 1, 7:22 PM
tuexen added inline comments.
share/man/man4/tcp.4
549 ↗(On Diff #64720)

The documentation of rfc6675_pipe is not related to this change. I committed it separately in r355268.

This revision was automatically updated to reflect the committed changes.