Page MenuHomeFreeBSD

fixing sysctl interface for DCTCP and types of dctcp variables
ClosedPublic

Authored by rscheff_gmx.at on Jun 7 2019, 11:56 AM.

Details

Summary
Improving the handling of sysctl variables related to DCTCP.

Making sure the fixed point integer arithmetic works for large
windows (typecast to uint64 when calculating the new alpha)

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

rscheff_gmx.at created this revision.Jun 7 2019, 11:56 AM
  • aligning sysctl with linux for alpha
rscheff_gmx.at edited the summary of this revision. (Show Details)Jun 7 2019, 12:06 PM

Note: DCTCP alpha is initially at zero - thus there will be no reduction in cwnd after the first window, only the 2nd window will have some response, if an elephant flow is tested.

chengc_netapp.com requested changes to this revision.Jun 11 2019, 9:07 PM
chengc_netapp.com added inline comments.
sys/netinet/cc/cc_dctcp.c
373 ↗(On Diff #58356)

To be consistent with the RFC, please also replace the 'F' with the 'M'.

This revision now requires changes to proceed.Jun 11 2019, 9:07 PM
rscheff_gmx.at edited the summary of this revision. (Show Details)
  • fix comment to be in-line with RFC
rscheff_gmx.at added inline comments.Jun 17 2019, 2:20 PM
sys/netinet/cc/cc_dctcp.c
373 ↗(On Diff #58356)

Done. (RFC8257, page 7)

Please update also share/man/man4/cc_dctcp.4 to describe the valid ranges of the sysctl variables and the defaults. Please fix inconsistencies like the default for dctcp_shift_g given there and in the code.

sys/netinet/cc/cc_dctcp.c
60 ↗(On Diff #58728)

Please add parentheses around the expression.

Regarding the man page and defaults:

The current default for alpha is 0, meaning during the first window a flow will be effectively unresponsive. For short flows lasting less than the IW, this default is good.

However, longer flows will take around 16 windows (RTTs) before adjusting alpha to the observed levels, during which those elephant flows are less responsive than NewReno.

Linux sets the default alpha to 1024, to react initially as strong as Reno would. Thoughts?

  • parathentis in #define
  • updating man page
rscheff_gmx.at marked an inline comment as done.Jul 27 2019, 11:01 AM
tuexen added inline comments.Jul 27 2019, 7:24 PM
share/man/man4/cc_dctcp.4
29 ↗(On Diff #60190)

This should not be changed. The patch should be based on head sources, not not on a release engineering branch.

31 ↗(On Diff #60190)

Please bump the date.

86 ↗(On Diff #60190)

I think you need to macros around dctcp_shift_g. I'll leave this to Benedict.

131 ↗(On Diff #60190)

Is this correct?

tuexen added a reviewer: bcr.Jul 27 2019, 7:25 PM

Add Benedict for man page change.

bcr added a comment.Jul 27 2019, 8:00 PM

You can check the man page using the textproc/igor port/package. Additionally, "mandoc -Tlint" also checks for certain issues.

share/man/man4/cc_dctcp.4
75 ↗(On Diff #60190)

Looks like trailing whitespace after the sentence stop.

84 ↗(On Diff #60190)

You need to make a line break after each sentence stop.

  • remove .Pp from man page
  • fixing lint issues of man update
rscheff_gmx.at marked 6 inline comments as done.Jul 28 2019, 12:48 PM

Addressed all comments so far.

Question around the default for alpha still remains; the current default renders dctcp flows very unreactive on CE marks during the initial few windows. Linux is using MAX_ALPHA_VALUE as the inital, which IMHO would be the conservative approach...

share/man/man4/cc_dctcp.4
131 ↗(On Diff #60190)

No, spurious leftover from learing fraction of nroff markup

tuexen added inline comments.Jul 28 2019, 2:30 PM
share/man/man4/cc_dctcp.4
81 ↗(On Diff #60209)

The name of the variable is shift_g, not dctcp_shift_g. (Sorry for not catching it earier.

89 ↗(On Diff #60209)

shift_g instead of dctcp_shift_g.

tuexen added inline comments.Jul 28 2019, 3:17 PM
sys/netinet/cc/cc_dctcp.c
402 ↗(On Diff #60209)

new is declared as uint32_t. So it can't be negative.
Either remove the check (and catch negative numbers provided in the sysctl command by the new > MAX_ALPHA_VALUE check) or declare it as int.

420 ↗(On Diff #60209)

Same as in dctcp_alpha_handler().

438 ↗(On Diff #60209)

Same as in dctcp_alpha_handler().

rscheff_gmx.at marked an inline comment as done.
  • cleaned up sysctl handlers, fixed sysctl name in man
rscheff_gmx.at marked 5 inline comments as done.Jul 28 2019, 4:32 PM
rscheff_gmx.at added inline comments.
sys/netinet/cc/cc_dctcp.c
402 ↗(On Diff #60209)

Checked the SYSCTL utility functions; apparently, with the CTLTYPE_UINT, negative input strings are discarded there, so no need to check here again. The handler is only called once these basic checks succeed. Thus removing the check for negatives here.

Addressed all comments so far.
Question around the default for alpha still remains; the current default renders dctcp flows very unreactive on CE marks during the initial few windows. Linux is using MAX_ALPHA_VALUE as the inital, which IMHO would be the conservative approach...

If you think using MAX_ALPHA_VALUE is better, then go ahead and change it. Also two (final) nits in the man-page.

share/man/man4/cc_dctcp.4
31 ↗(On Diff #60212)

We have 2019, not 2015 anymore.

tuexen added inline comments.Jul 28 2019, 6:15 PM
share/man/man4/cc_dctcp.4
73 ↗(On Diff #60212)

This should read

.Bl -tag -width ".Va slowstart"
rscheff_gmx.at marked an inline comment as done.
  • fixing two nits in the man page, and setting dctcp alpha default to conservative 1024
  • bump dctcp reference to RFC8257 in the man page
This revision was not accepted when it landed; it landed in state Needs Review.Jul 29 2019, 8:50 AM
This revision was automatically updated to reflect the committed changes.
head/share/man/man4/cc_dctcp.4
92

From the code on line 263 and 264 below, looks this "slowstart" flag only take effect after the initial slow start. So how about change to "... after initial slow start."?

head/sys/netinet/cc/cc_dctcp.c
76

Consider the type of num_cong_events to be "uint64_t", for long flows?

head/share/man/man4/cc_dctcp.4
92

Actually, num_cong_events is reset also in after_idle, so the man description to this tunable seems correct. Also, as num_cong_events is reset frequently, with only the case of that variable being zero being special-cased, I don't think it actually needs to change.

As for a statistics counter - that should probably never reset on after_idle, rto or other such events, and would need to be separate anyway. If we want to do this, we could possibly reduce the cc-specific data structure by using shorter integer types in a few places.

head/share/man/man4/cc_dctcp.4
92

Agree. But look at cubic's num_cong_events, it does not reset, which looks confusing for its purpose between the two congestion controls. I think we can keep num_cong_events as a uint32_t in dctcp now, and look for a future change to unify the purpose of num_cong_events as a statistics counter across all the congestion controls.