Patch D51291 allows to enable IFCAP_TXCSUM and IFCAP_TXCSUM_IPV6 on a tap interface. With these enabled on the tap interface, a VM that uses this tap interface and negotiates features can also enable these features in its virtio-net interface driver and then can use checksum offloading by setting the VIRTIO_NET_HDR_F_NEEDS_CSUM flag.
This patch translates the VIRTIO_NET_HDR_F_NEEDS_CSUM flag into the transmission checksum offload flag (CSUM_IP_TCP or CSUM_IP_UDP) in the mbuf if the tap interface is a bridge member. In the edge case where the packet will not be forwarded out over another bridge interface but is destined to the bridge itself, the bridge translates the transmission checksum offload flag into the receive checksum offload flag (CSUM_DATA_VALID).
Details
It's possible to test this with a FreeBSD guest on bhyve once bhyve negotiates features with the tap interface. This comes in a follow-up patch.
Diff Detail
- Repository
- rG FreeBSD src repository
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
| sys/net/if_bridge.c | ||
|---|---|---|
| 2908 | should this be done before the shunt to vlan_input_p() just above? this is the path taken for a vlan(4) on a bridge(4), e.g. "bridge0.100", so those packets will also be handled locally. | |
| sys/net/if_bridge.c | ||
|---|---|---|
| 2908 | Good catch, thanks. I did a quick test with VLAN 66. On the host, I created bridge0, bridge0.66, tap0 and tap0.66. ifconfig bridge0 create ifconfig bridge0.66 create vlandev bridge0 vlan 66 ifconfig bridge0.66 inet 192.168.66.1/24 ifconfig tap0 create ifconfig tap0 txcsum txcsum6 ifconfig tap0.66 create vlandev tap0 vlan 66 ifconfig bridge0 addm tap0 I then started a FreeBSD guest with bhyve that uses tap0. On the guest, I created vtnet0.66 ifconfig vtnet0.66 create vlan 66 vlandev vtnet0 ifconfig vtnet0.66 inet 192.168.66.2/24 VLAN tagged TCP packets from the guest (generated by running ssh 192.168.100.1) left the function too early and the host dropped the packets due to the invalid checksum. I just updated the diff with my changes moved above the VLAN check. Now checksum offloading from the guest to the host works for VLAN tagged packets too. | |
| sys/net/if_bridge.c | ||
|---|---|---|
| 2908 | Sorry, I meant generated by running ssh 192.168.66.1. | |
what happens if a packet is received by the bridge interface, and is then routed via another interface - will the outgoing checksum be calculated if required?
Given that the sending endpoint uses transmission checksum offloading (txcsum) only if its outgoing interface supports it and given that the outgoing interface the bridge selects supports it as well, then yes. The idea depends on the fact that if one interface in the bridge supports txcsum, then all interfaces in the bridge do. This is partly enforced by the the bridge by removing txcsum on all interfaces if one interface without txcsum has been added. A user can enable txcsum on only a subset of all bridge interfaces afterwards but I would consider that a misconfiguration.
what i mean is if the packet is being routed (by the IP stack, not the bridge), for example if the bridge interface’s IP address is the default router for the VMs in the bridge.
in that case you’ve set the mbuf flag even though the checksum is not valid, but the checksum still needs to be calculated or at least the outgoing interface needs to be told to do the hw checksum.
is there a mechanism for that to happen?
Good point. I have not tested it but, since my code removes the txcsum flags and sets CSUM_DATA_VALID, my expectation is that the routing host will send the packet out with an incorrect checksum and the receiver will drop it.
In general, what I need is a flag that tells the host's receiving path that checksum is wrong but does not need to be checked and the host's sending path that the checksum needs to be calculated. For example, for TCP, a combination of CSUM_IP_TCP and CSUM_DATA_VALID. However, setting both flags seems not to be an option because they expect different values in csum_data (either an offset or 0xFFFF).
I'll try to find a solution for that and will update this patch then.
thanks. i like the idea behind this change, so if you can fix that i'd like to see this merged. we should have a test though, since this sort of functionality that isn't widely used to susceptible to bitrot (in this case particularly so, since i'm in the middle of rewriting bridge_input to remove the GRAB_OUR_PACKETS macro entirely).
I agree about testing. It might be hard to test this via vtnet interfaces, but I think the same problem area is with epair interfaces. And for that, it is simple to write some tests. Once a way forward has been identified, I'll work with Timo on writing some tests.