Page MenuHomeFreeBSD

net: allow fast-forwarding TSO packets
Needs RevisionPublic

Authored by royger on May 27 2016, 4:15 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Apr 20, 4:12 PM
Unknown Object (File)
Fri, Apr 19, 3:24 AM
Unknown Object (File)
Thu, Apr 18, 9:22 PM
Unknown Object (File)
Tue, Apr 16, 5:22 PM
Unknown Object (File)
Tue, Apr 16, 4:57 PM
Unknown Object (File)
Thu, Apr 11, 3:39 PM
Unknown Object (File)
Sat, Apr 6, 7:23 PM
Unknown Object (File)
Dec 9 2023, 5:04 PM

Details

Reviewers
gnn
gallatin
Group Reviewers
network
transport
Summary

This is important when running on a virtualized environment, where packets
might come from another VM on the same host and the segmentation might not
have happened yet because the packet has not reached a physical network
interface.

Try to handle those packets by directly forwarding them to the outbound
network interface if it also supports TSO and the input and output options
match.

Sponsored by: Citrix Systems R&D

Test Plan

FreeBSD DomU PVHVM guests cannot 'route' traffic for other Xen PV guests on same Dom0 Host.
I am linking to the PR that this is suppose to be part of the fix for.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 3993
Build 4036: arc lint + arc unit

Event Timeline

royger retitled this revision from to net: allow fast-forwarding TSO packets.
royger updated this object.
royger edited the test plan for this revision. (Show Details)

Add more checks to make sure the output interface supports the TSO chain.

ae added inline comments.
sys/netinet/ip_fastfwd.c
450

It isn't safe. There is no guarantee that mbuf's data will be contiguous more than sizeof(struct ip).

sys/netinet/ip_fastfwd.c
450

What's the proper way to get the TCP payload length then?

sys/netinet/ip_fastfwd.c
450

The easiest way is m_copydata() with proper offset.

sys/netinet/ip_fastfwd.c
450

Thanks! Will update the patch shortly unless there are other comments.

Use m_copydata instead of mtodo in order to get the TCP header. Using mtodo
is not correct because it's possible that the TCP header is not in the first
mbuf.

Ping? I have this and a couple more revisions (D6656 and D6611) that I would like to get committed before the code freeze (although I consider them bug-fixes, so they could also go in during the freeze).

Thanks, Roger.

gnn added a reviewer: gnn.
gnn added a subscriber: gnn.
gnn added inline comments.
sys/netinet/ip_fastfwd.c
448

Minor nit but I'd prefer an extra set of braces around the first clause, before the &&

This revision is now accepted and ready to land.Jun 13 2016, 11:40 PM
sys/netinet/ip_fastfwd.c
456

Can this fancy concatenation of 8 conditions be an inline function?

sys/netinet/ip_fastfwd.c
456

Why, it isnt used in more than one place?

sys/netinet/ip_fastfwd.c
456

No, I hope it will improve readability. It will also be easier to attach some comments.

What device is setting up the TSO flags on incoming frames? Shouldn't it be setting up the LRO flags, and shouldn't we be checking those and converting them to TSO in the forwarding path?

The TSO conventions for checksum offload is that we store the TCP pseudohdr checksum in the th->th_sum. This is done as an artifact of TCP checksum offload, however some NICs require this. This change will break those NICs unless you setup the checksum properly by

a) replacing m_pkthdr.csum_data with the offset of the TCP header's checksum field
b) replacing the full TCP checksum with the pseudohdr sum.

See, for example, sys/dev/mxge/if_mxge.c:mxge_encap_tso() as an example of how to prepare the checksum fields to be the same as what the TCP stack sends down to drivers.

This revision now requires changes to proceed.Aug 15 2020, 7:42 PM
sys/netinet/ip_fastfwd.c
450

Is this really safe? sizeof(struct ip) == 20 bytes, but IPv4 header can be up to 60 bytes with options. I think you should use ip.ip_hl field as offset.

sys/netinet/ip_fastfwd.c
450

Or check V_ip_doopts != 0, as the code above rejects packets with options if V_ip_doopts is non-zero.