Page MenuHomeFreeBSD

hyperv/hn: Implement LRO
ClosedPublic

Authored by sepherosa_gmail.com on Jan 8 2016, 2:12 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Dec 21, 5:41 PM
Unknown Object (File)
Mon, Dec 9, 9:42 AM
Unknown Object (File)
Nov 16 2024, 12:50 PM
Unknown Object (File)
Oct 30 2024, 6:30 AM
Unknown Object (File)
Oct 7 2024, 12:23 AM
Unknown Object (File)
Oct 7 2024, 12:23 AM
Unknown Object (File)
Oct 7 2024, 12:23 AM
Unknown Object (File)
Oct 7 2024, 12:23 AM

Details

Summary
  • Implement the LRO using tcp_lro APIs, and LRO is enabled by default
  • Add several stats sysctl nodes
  • Check IP/TCP length before sending the packet to tcp_lro_rx(), if host does not provide RX csum information (*); and add an option through sysctl to always trust host TCP segment csum checks (default is off).
  • Add sysctl to control the LRO entry depth. This depends on later tcp_lro patch, thus it is disabled by default. It is used to avoid holding too much TCP segments in driver. Limiting the LRO entry depth helps a lot in a one/two streams RX test.

This one 3x the RX performance on my local test (3Gbps -> 10Gbps), and ~2x the RX performance over a directly connected 40Ge network (5Gbps -> 9Gbps).

Reviewed by: Hongjiang Zhang <honzhan microsoft com>, Dexuan Cui <decui microsoft com>, Jun Su <junsu microsoft com>
Tested by: me (local), Hongjiang Zhang <honzhan microsoft com> (directly connected 40Ge)
Sponsored by: Microsoft OSTC

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sepherosa_gmail.com retitled this revision from to hyperv/hn: Implement LRO.
sepherosa_gmail.com updated this object.
sepherosa_gmail.com edited the test plan for this revision. (Show Details)
  1. When the statistic data was reset or cleared? I found if I enable LRO and run some test, and then disable LRO, the statistic data are still there.
  2. Please follow the style (9) for all "return" statement. e.g. "return IPPROTO_DONE;" should be changed to "return (IPPROTO_DONE);"
  3. For TUNABLE_INT("dev.hn.trust_hosttcp", &hn_trust_hosttcp); "dev.hn.trust_hosttcp" is not a generic option for network, instead it only makes sense for hyperv network driver, so could you please rename it to be "dev.hyperv.trust_hosttcp"? "hn" makes people think it is "hn0".
  1. When the statistic data was reset or cleared? I found if I enable LRO and run some test, and then disable LRO, the statistic data are still there.

Heh, I could make it read/write, so user could reset them manually through sysctl; relying on IFCAP_LRO on/off is not a good idea.

  1. Please follow the style (9) for all "return" statement. e.g. "return IPPROTO_DONE;" should be changed to "return (IPPROTO_DONE);"

return xxx; is widely used, you could check other files in src/. And I don't think there is an enforment on using parenthesis for return.

  1. For TUNABLE_INT("dev.hn.trust_hosttcp", &hn_trust_hosttcp); "dev.hn.trust_hosttcp" is not a generic option for network, instead it only makes sense for hyperv network driver, so could you please rename it to be "dev.hyperv.trust_hosttcp"? "hn" makes people think it is "hn0".

It's only hn(4) related node. dev.hn.trust_hosttcp closely follows device sysctl tree layout (dev.hn.X.trust_hosttcp). Chang it to other pattern would be confusing.

sepherosa_gmail.com edited edge metadata.

Make LRO flushed and queued stats sysctl nodes read/write, so that user could reset the stats and start over.

Looks good to me overall, consider this as a formal approval if there is no objection from Adrian by Monday.

Could you please take a look at my inline comments and address them if they are reasonable?

sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c
186 ↗(On Diff #12034)

Will it be sensible to turn this into SYSCTL_INT with CTLFLAG_RDTUN, so users can check its value?

410 ↗(On Diff #12034)

Looks like this is never defined, but a quick glance suggests it won't hurt to have these code? Can these be unifdef -D'ed (unconditionally compiled in)?

sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c
186 ↗(On Diff #12034)

I think you could get it though kenv(1). But it does not hurt to expose this tunable as a read-only sysctl. I will add it.

410 ↗(On Diff #12034)

It relies on a tcp_lro.[ch] change I put up in another review (https://reviews.freebsd.org/D4825, mainly added a lro_hiwat field into lro_ctrl), so I turn this off by default.

This macro may also ease the work to MFC this patch to 10-stable.

Add devclass sysctl for global trust_hosttcp setting. Suggested-by: delphij

I am going to commit this tomorrow, if no objection comes.

sys/dev/hyperv/netvsc/hv_netvsc_drv_freebsd.c
186 ↗(On Diff #12098)

It turns out that static sysctl could not be used: dev.hn is dynamically created by bus code. The latest patch add a dev.hn.trust_hosttcp on hn0 attach path.

This revision was automatically updated to reflect the committed changes.