Page MenuHomeFreeBSD

tcp/lro: Allow network drivers to set the limit for TCP ACK/data segment aggregation limit
ClosedPublic

Authored by sepherosa_gmail.com on Feb 4 2016, 8:44 AM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Oct 30, 6:08 PM
Unknown Object (File)
Oct 20 2024, 1:07 AM
Unknown Object (File)
Oct 20 2024, 1:07 AM
Unknown Object (File)
Oct 20 2024, 1:07 AM
Unknown Object (File)
Oct 20 2024, 12:51 AM
Unknown Object (File)
Oct 20 2024, 12:49 AM
Unknown Object (File)
Oct 6 2024, 5:39 AM
Unknown Object (File)
Oct 2 2024, 2:17 AM

Details

Summary

It's ACK aggregation limit is append count based, while the TCP data segment aggregation is length based. Unless the network driver sets these two limits, its an NO-OP.

For hn(4):

  • Set TCP ACK append limit to 1, i.e. aggregate 2 ACKs at most. Aggregate anything more than 2 hurts TCP sending performance in hyperv. This significantly improves the TCP sending performance when the number of concurrent connetion is low (2~8). And greatly stabilize the TCP sending performance in other cases.
  • Set TCP data segments length limit to 37500. Without this limitation, hn(4) could aggregate ~45 TCP data segments for each connection (even at 64 or more connections) before dispatching them to socket code; large aggregation slows down ACK sending and eventually hurts/destabilizes TCP reception performance. This setting stabilizes and improves TCP reception performance for >4 concurrent connections significantly.

Make them sysctls so they could be adjusted.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sepherosa_gmail.com retitled this revision from to tcp/lro: Allow network drivers to set the limit for TCP ACK/data segment aggregation limit.
sepherosa_gmail.com updated this object.
sepherosa_gmail.com edited the test plan for this revision. (Show Details)

It might be nice to make these general tunables that could be done centrally and apply to all drivers, but that's probably outside the scope of the review.

sys/netinet/tcp_lro.c
655 ↗(On Diff #13282)

Can you just initialize ack_append_limit to the max value for whatever type it is and eliminate the check for a 0 ack_append_limit? That would eliminate one clause from this conditional.

682 ↗(On Diff #13282)

Rather than adding more clauses to this condition, how would to feel about setting an append limit in bytes, and replacing the hard-coded 65535 with this new limit? The default lro init would initialize the new limit to 65535. And hn(4) would initialize it in terms of multiples of its MTU.

sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c
459 ↗(On Diff #13282)

this should be a separate commit

sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c
459 ↗(On Diff #13282)

OK, I will split it out.

I will adjust the patch accordingly.

sys/netinet/tcp_lro.c
655 ↗(On Diff #13282)

Sure :)

682 ↗(On Diff #13282)

Sounds fine to me. I did the byte limit before (https://reviews.freebsd.org/D4825). But it turns out the ACKs need seperate limit (append count based). To make them consistent, I changed the original patch to use append count too.

sepherosa_gmail.com updated this object.
sepherosa_gmail.com edited edge metadata.

Address gallatin and adrian's concern.

gallatin edited edge metadata.

Thanks for addressing my concerns.. Does anybody else want to comment?

adrian edited edge metadata.

Nice! Thanks for all this work!

sys/netinet/tcp_lro.h
94 ↗(On Diff #13282)

Might be worth set this limit to unsigned instead of unsigned short. Technically we can LRO more than 64KBytes worth of data!

sys/netinet/tcp_lro.h
94 ↗(On Diff #13282)

My intention here is too keep the size of lro_ctrl unchanged on amd64 (I think there is an implicit 4 bytes padding after lro_mbuf_max :). But I am fine to change them into unsigned int.

Does anyone know any drawbacks to change these two fields into unsigned int? If not, I would change them into unsigned int after Chinese New Year :)

The size of lro_ctrl already changed when the statistics was made 64-bit. Just remember to bump the FreeBSD_version. Might not be possible to MFC.

The size of lro_ctrl already changed when the statistics was made 64-bit. Just remember to bump the FreeBSD_version. Might not be possible to MFC.

OK, let's wait for others inputs (I will be away next week). If no objection comes, I will change the limits to unsigned int.

sepherosa_gmail.com edited edge metadata.

Expand length limitation to 32 bits. Suggested by hselasky.

adrian edited edge metadata.
This revision was automatically updated to reflect the committed changes.