Improving the handling of sysctl variables related to DCTCP. Making sure the fixed point integer arithmetic works for large windows (typecast to uint64 when calculating the new alpha)
Details
- Reviewers
tuexen hiren gnn cc bcr - Group Reviewers
transport manpages - Commits
- rS351994: MFC r350403:
rS350403: * Improve input validation of sysctl parameters for DCTPC.
Diff Detail
- Lint
Lint Passed - Unit
No Test Coverage - Build Status
Buildable 24747 Build 23508: arc lint + arc unit
Event Timeline
Note: DCTCP alpha is initially at zero - thus there will be no reduction in cwnd after the first window, only the 2nd window will have some response, if an elephant flow is tested.
sys/netinet/cc/cc_dctcp.c | ||
---|---|---|
373 | To be consistent with the RFC, please also replace the 'F' with the 'M'. |
sys/netinet/cc/cc_dctcp.c | ||
---|---|---|
373 | Done. (RFC8257, page 7) |
Please update also share/man/man4/cc_dctcp.4 to describe the valid ranges of the sysctl variables and the defaults. Please fix inconsistencies like the default for dctcp_shift_g given there and in the code.
sys/netinet/cc/cc_dctcp.c | ||
---|---|---|
60 | Please add parentheses around the expression. |
Regarding the man page and defaults:
The current default for alpha is 0, meaning during the first window a flow will be effectively unresponsive. For short flows lasting less than the IW, this default is good.
However, longer flows will take around 16 windows (RTTs) before adjusting alpha to the observed levels, during which those elephant flows are less responsive than NewReno.
Linux sets the default alpha to 1024, to react initially as strong as Reno would. Thoughts?
share/man/man4/cc_dctcp.4 | ||
---|---|---|
29 ↗ | (On Diff #60190) | This should not be changed. The patch should be based on head sources, not not on a release engineering branch. |
31 ↗ | (On Diff #60190) | Please bump the date. |
86 ↗ | (On Diff #60190) | I think you need to macros around dctcp_shift_g. I'll leave this to Benedict. |
131 ↗ | (On Diff #60190) | Is this correct? |
You can check the man page using the textproc/igor port/package. Additionally, "mandoc -Tlint" also checks for certain issues.
share/man/man4/cc_dctcp.4 | ||
---|---|---|
75 ↗ | (On Diff #60190) | Looks like trailing whitespace after the sentence stop. |
84 ↗ | (On Diff #60190) | You need to make a line break after each sentence stop. |
Addressed all comments so far.
Question around the default for alpha still remains; the current default renders dctcp flows very unreactive on CE marks during the initial few windows. Linux is using MAX_ALPHA_VALUE as the inital, which IMHO would be the conservative approach...
share/man/man4/cc_dctcp.4 | ||
---|---|---|
131 ↗ | (On Diff #60190) | No, spurious leftover from learing fraction of nroff markup |
sys/netinet/cc/cc_dctcp.c | ||
---|---|---|
402 | Checked the SYSCTL utility functions; apparently, with the CTLTYPE_UINT, negative input strings are discarded there, so no need to check here again. The handler is only called once these basic checks succeed. Thus removing the check for negatives here. |
If you think using MAX_ALPHA_VALUE is better, then go ahead and change it. Also two (final) nits in the man-page.
share/man/man4/cc_dctcp.4 | ||
---|---|---|
31 ↗ | (On Diff #60212) | We have 2019, not 2015 anymore. |
share/man/man4/cc_dctcp.4 | ||
---|---|---|
73 ↗ | (On Diff #60212) | This should read .Bl -tag -width ".Va slowstart" |
- fixing two nits in the man page, and setting dctcp alpha default to conservative 1024
head/share/man/man4/cc_dctcp.4 | ||
---|---|---|
92 ↗ | (On Diff #60227) | From the code on line 263 and 264 below, looks this "slowstart" flag only take effect after the initial slow start. So how about change to "... after initial slow start."? |
head/sys/netinet/cc/cc_dctcp.c | ||
76 ↗ | (On Diff #60227) | Consider the type of num_cong_events to be "uint64_t", for long flows? |
head/share/man/man4/cc_dctcp.4 | ||
---|---|---|
92 ↗ | (On Diff #60227) | Actually, num_cong_events is reset also in after_idle, so the man description to this tunable seems correct. Also, as num_cong_events is reset frequently, with only the case of that variable being zero being special-cased, I don't think it actually needs to change. As for a statistics counter - that should probably never reset on after_idle, rto or other such events, and would need to be separate anyway. If we want to do this, we could possibly reduce the cc-specific data structure by using shorter integer types in a few places. |
head/share/man/man4/cc_dctcp.4 | ||
---|---|---|
92 ↗ | (On Diff #60227) | Agree. But look at cubic's num_cong_events, it does not reset, which looks confusing for its purpose between the two congestion controls. I think we can keep num_cong_events as a uint32_t in dctcp now, and look for a future change to unify the purpose of num_cong_events as a statistics counter across all the congestion controls. |