
bridge: Add checksum offloading
Accepted, Public

Authored by timo.voelker_fh-muenster.de on Jan 29 2026, 8:40 PM.

Details

Summary

Add transmission checksum offloading capabilities (IFCAP_TXCSUM, IFCAP_TXCSUM_IPV6, IFCAP_VLAN_HWCSUM) to the bridge and remove synchronization of IFCAP_TXCSUM and IFCAP_TXCSUM_IPV6 between member interfaces. For an outgoing packet with the checksum offloading flag set, the bridge now checks if the designated outgoing interface supports checksum offloading for the corresponding protocol. If not, it computes and inserts the checksum before passing the packet to the outgoing interface.
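
As a rough illustration of the decision described above, here is a minimal userland sketch. It is not the code from this patch: the `SK_*` flag values, `struct sk_pkt`, and `sk_bridge_tx_csum()` are hypothetical stand-ins for the kernel's checksum flags and the outgoing interface's offload capabilities, and the software checksum itself is only recorded, not computed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch; the kernel's real CSUM_* values differ. */
#define SK_CSUM_IP4_TCP	0x01u
#define SK_CSUM_IP4_UDP	0x02u
#define SK_CSUM_IP6_TCP	0x04u
#define SK_CSUM_IP6_UDP	0x08u
#define SK_CSUM_ALL	0x0fu

/* Minimal stand-in for the packet state the bridge looks at. */
struct sk_pkt {
	uint32_t csum_pending;	/* checksums still left for a later stage */
	uint32_t csum_in_sw;	/* checksums the bridge had to finish itself */
};

/*
 * Split the pending checksum work for one outgoing member interface:
 * whatever the interface can offload stays pending, the rest is marked
 * as done in software (only recorded here, not actually computed).
 */
static void
sk_bridge_tx_csum(struct sk_pkt *p, uint32_t if_offload_caps)
{
	uint32_t missing;

	assert((p->csum_pending & ~SK_CSUM_ALL) == 0);
	missing = p->csum_pending & ~if_offload_caps;
	p->csum_in_sw |= missing;
	p->csum_pending &= ~missing;
}
```

With this model, a packet requesting `SK_CSUM_IP6_UDP` that leaves through a member offering only `SK_CSUM_IP4_TCP` ends up with nothing pending and the UDP/IPv6 checksum marked as done in software.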

I see two use cases.

  1. If the bridge is the source interface, the network stack won't compute the checksum in software anymore and, thus, can make use of the checksum offloading capability of the physical outgoing interface.
  2. A virtual interface like epair can use checksum offloading more reliably. If the other epair end is in a bridge and the outgoing interface does not support checksum offloading for the chosen protocol, the bridge will take care of it.

Test Plan

Let <IF> be your main network interface and <IP> the IP address configured on that interface. Add a bridge like this:

ifconfig bridge0 create
ifconfig <IF> inet <IP> -alias
ifconfig bridge0 addm <IF>
ifconfig bridge0 inet <IP>

The source interface of outgoing packets for the FreeBSD IP stack is now bridge0 (and not <IF> anymore).

Before this patch:
The FreeBSD IP stack stops using checksum offloading and computes checksums in software.

After this patch:
The FreeBSD IP stack keeps using checksum offloading.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

sys/net/if_bridge.c
2418

style.9: return (NULL);

2442

style.9: return (m);

2452

style.9: return (m);

2460

style.9: return (NULL);

2496

style.9: return (m);

2503

style.9: return (m);

2524

style.9: return (m);

Might be off-topic. I once had the idea of introducing SOFT checksum offload on all soft interfaces. The upper stack would see this and hand the checksum calculation over to the interfaces. The soft interfaces could calculate the checksums only when absolutely necessary, or let the edge physical interfaces do the work when the packets leave the host. Then we can benefit from hardware checksumming, or get better icache locality (batch processing the mbufs) when checksumming in software.

What do you think?

> Might be off-topic. I once had the idea of introducing SOFT checksum offload on all soft interfaces. The upper stack would see this and hand the checksum calculation over to the interfaces. The soft interfaces could calculate the checksums only when absolutely necessary, or let the edge physical interfaces do the work when the packets leave the host. Then we can benefit from hardware checksumming, or get better icache locality (batch processing the mbufs) when checksumming in software.
>
> What do you think?

Does this sound similar to https://wiki.freebsd.org/Networking/ChecksumOffloading ?

I do remember and thanks for the link! This is something that needs to be rebased and used for that.

>> Might be off-topic. I once had the idea of introducing SOFT checksum offload on all soft interfaces. The upper stack would see this and hand the checksum calculation over to the interfaces. The soft interfaces could calculate the checksums only when absolutely necessary, or let the edge physical interfaces do the work when the packets leave the host. Then we can benefit from hardware checksumming, or get better icache locality (batch processing the mbufs) when checksumming in software.
>>
>> What do you think?
>
> Does this sound similar to https://wiki.freebsd.org/Networking/ChecksumOffloading ?

Ah, I did not know that plan existed. Yes, my idea is almost the same as that.

Allow enabling/disabling the new capabilities: IFCAP_TXCSUM, IFCAP_TXCSUM_IPV6, and IFCAP_VLAN_HWCSUM.

Seems like progress to me. Would be good to get a pass from @ivy who did a lot in bridge lately. It would be a good idea to run a quick A/B test to make sure there are no unexpected perf issues. I don't think there is any reason to delay for GSO (unrelated) or a rework flippening offloads which can progress at its own pace.

sys/net/if_bridge.c
2420

something like this should eventually move to shared net code

This revision is now accepted and ready to land.
Wed, Feb 18, 10:07 PM

sys/net/if_bridge.c
2436

i'm not convinced this is correct. i believe bridge (via ethersubr) will accept a tagged packet with only a single tag but with the 802.1ad QinQ Ethernet protocol type, meaning we can't use the ethertype to determine the offset but instead have to check the nested frame.

i may be wrong here, but i'd at least like to double-check this.

pouria added inline comments.
sys/net/if_bridge.c
2436

> i'm not convinced this is correct. i believe bridge (via ethersubr) will accept a tagged packet with only a single tag but with the 802.1ad QinQ Ethernet protocol type, meaning we can't use the ethertype to determine the offset but instead have to check the nested frame.

Yes, this is NOT correct.
In fact, a separate ether type for 802.1ad exists to allow customers of a network provider to use the provider's L2 network to transmit both tagged and non-tagged packets. (that's why we use tagged native on vendors)
otherwise, it would be meaningless to use anything other than 0x8100.
So, in summary, @ivy is right: the inner VLAN tag may or may not exist at all, thus, offset is dynamic.
Unfortunately, in the real world, it is possible to have multiple (more than two) VLAN tags on both 802.1Q and 802.1ad of a receiving packet.

sys/net/if_bridge.c
2436

Thanks for your comments! Does this table summarize it correctly?

| type | next type | header length |
| --- | --- | --- |
| other than ETHERTYPE_VLAN or ETHERTYPE_QINQ | not present | 14 bytes |
| ETHERTYPE_VLAN | irrelevant | 18 bytes |
| ETHERTYPE_QINQ | other than ETHERTYPE_VLAN | 18 bytes |
| ETHERTYPE_QINQ | ETHERTYPE_VLAN | 22 bytes |
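
Read literally, the table above maps to a small lookup like the following sketch. The function name and the `SK_` prefix are hypothetical; the two ethertype values are the standard ones (0x8100 for 802.1Q, 0x88a8 for 802.1ad). It covers at most two tags, exactly as the table does, and says nothing about deeper nesting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_ETHERTYPE_VLAN 0x8100
#define SK_ETHERTYPE_QINQ 0x88a8

/*
 * L2 header length as given by the table: 14 bytes untagged,
 * plus 4 bytes per tag, with at most two tags considered.
 * next_type is only looked at when type is the QinQ ethertype.
 */
static size_t
sk_hdr_len_from_table(uint16_t type, uint16_t next_type)
{
	if (type == SK_ETHERTYPE_VLAN)
		return (18);
	if (type == SK_ETHERTYPE_QINQ)
		return (next_type == SK_ETHERTYPE_VLAN ? 22 : 18);
	return (14);
}

/* The four table rows, written out as checks (0x0800 is IPv4). */
static void
sk_table_selfcheck(void)
{
	assert(sk_hdr_len_from_table(0x0800, 0) == 14);
	assert(sk_hdr_len_from_table(SK_ETHERTYPE_VLAN, 0x0800) == 18);
	assert(sk_hdr_len_from_table(SK_ETHERTYPE_QINQ, 0x0800) == 18);
	assert(sk_hdr_len_from_table(SK_ETHERTYPE_QINQ, SK_ETHERTYPE_VLAN) == 22);
}
```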
sys/net/if_bridge.c
2436

this is for locally-originated packets, right? so i think the question is, what can we actually generate here? can i create a vlan1.2.3.4.5 interface and send frames with 1 .1ad tag and 4 .1q tags?

is the actual offset of the IP header already stored somewhere for local packets?

sys/net/if_bridge.c
2436

> this is for locally-originated packets, right?

Kind of.

> so i think the question is, what can we actually generate here?

The problem is, "we" is more than FreeBSD. A locally-originated packet can be a packet generated by any operating system in a VM that is running on a FreeBSD host.

> can i create a vlan1.2.3.4.5 interface and send frames with 1 .1ad tag and 4 .1q tags?

I haven't found one line where FreeBSD sets the type ETHERTYPE_QINQ. So, I assume, FreeBSD cannot generate Ethernet frames with a .1ad (QinQ) tag. But, I just tested and was surprised to see that FreeBSD creates a VLAN interface on top of another VLAN interface.

ifconfig igb3 up
ifconfig igb3.1 create vlan 1 vlandev igb3 up
ifconfig igb3.1.2 create vlan 2 vlandev igb3.1 up

Sending packets via igb3.1.2 generates Ethernet frames with two .1q tags. Does the Ethernet specification allow this? Since FreeBSD can generate it, I guess, it should be supported.

> is the actual offset of the IP header already stored somewhere for local packets?

Drew Gallatin pointed me to the l2hlen field in the mbuf packet header. This would be perfect but, currently, FreeBSD does not set it. While generating the packet, it is probably easy to set it. However, if, for example, Linux generated the packet in a VM and sends the packet to the FreeBSD host, it's probably less easy to set l2hlen when FreeBSD creates the mbuf for an already generated packet.

sys/net/if_bridge.c
2436

> A locally-originated packet can be a packet generated by any operating system in a VM that is running on a FreeBSD host.

okay, that does make this more difficult.

> I haven't found one line where FreeBSD sets the type ETHERTYPE_QINQ.

ifconfig vlan1 vlanproto 802.1ad should generate such frames. you can also tell bridge to do this using ifconfig bridge0 ifvlanproto ix0 802.1ad, which applies to cases where bridge is encapsulating the frames itself.

> Sending packets via igb3.1.2 generates Ethernet frames with two .1q tags. Does the Ethernet specification allow this?

my recollection is that this might not be permitted by spec, but almost everyone allows it. i may be wrong there on the spec, but it definitely happens in reality either way, so it's something we probably need to handle.

but i wonder: how does real hardware do this? if a physical interface supports VLAN checksums, will the hardware actually do the checksum for an outgoing frame with 3+ nested tags?

sys/net/if_bridge.c
2436

>> A locally-originated packet can be a packet generated by any operating system in a VM that is running on a FreeBSD host.
>
> okay, that does make this more difficult.

>> I haven't found one line where FreeBSD sets the type ETHERTYPE_QINQ.
>
> ifconfig vlan1 vlanproto 802.1ad should generate such frames. you can also tell bridge to do this using ifconfig bridge0 ifvlanproto ix0 802.1ad, which applies to cases where bridge is encapsulating the frames itself.

You are right. It's that simple to add a .1ad tag in FreeBSD and it can be added in any combination. Good to know. Thanks!

I guess, if I need to parse the packet, I have to loop over the ether types and add 4 bytes each time I see ETHERTYPE_VLAN or ETHERTYPE_QINQ until I see some other type.
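
A minimal sketch of that loop, operating on a raw frame buffer rather than an mbuf. The names are hypothetical, and the bounds check with a return value of 0 for truncated frames is an assumption added here, not something discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_ETHER_HDR_LEN 14
#define SK_VLAN_TAG_LEN 4
#define SK_ETHERTYPE_VLAN 0x8100
#define SK_ETHERTYPE_QINQ 0x88a8

/* Read a big-endian 16-bit value from an unaligned byte pointer. */
static uint16_t
sk_be16(const uint8_t *p)
{
	return ((uint16_t)((p[0] << 8) | p[1]));
}

/*
 * Return the L2 header length of an Ethernet frame, skipping any number
 * of 802.1Q / 802.1ad tags in any combination. Returns 0 if the frame
 * ends before a non-VLAN ethertype is found.
 */
static size_t
sk_l2_hdr_len(const uint8_t *frame, size_t len)
{
	size_t off = SK_ETHER_HDR_LEN - 2;	/* offset of the first ethertype */
	uint16_t type;

	assert(frame != NULL);
	for (;;) {
		if (len < off + 2)
			return (0);
		type = sk_be16(frame + off);
		if (type != SK_ETHERTYPE_VLAN && type != SK_ETHERTYPE_QINQ)
			return (off + 2);
		off += SK_VLAN_TAG_LEN;	/* skip this tag, look at the next type */
	}
}
```

An untagged frame yields 14, and each tag adds 4 bytes, so a frame with one .1ad tag followed by two .1q tags yields 26.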

>> Sending packets via igb3.1.2 generates Ethernet frames with two .1q tags. Does the Ethernet specification allow this?
>
> my recollection is that this might not be permitted by spec, but almost everyone allows it. i may be wrong there on the spec, but it definitely happens in reality either way, so it's something we probably need to handle.

Thanks for sharing your experience. Very helpful.

> but i wonder: how does real hardware do this? if a physical interface supports VLAN checksums, will the hardware actually do the checksum for an outgoing frame with 3+ nested tags?

I don't know what network hardware can do in general, but I believe it's not possible to use transmission checksum offloading with nested VLAN tags in FreeBSD. On my test machine, my igb3 interface has TXCSUM, meaning it supports checksum offloading. It also has VLAN_HWCSUM, which means that it supports checksum offloading even for tagged packets. This is why my igb3.1 interface also has TXCSUM. But it does not have VLAN_HWCSUM (I don't think a VLAN interface can have VLAN_HWCSUM). Thus, another VLAN interface on top of it - like my igb3.1.2 interface - does not have TXCSUM.
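
The observation above can be modelled in a few lines. This sketch only encodes the behaviour seen with ifconfig on the test machine; it is an assumption about the rule, not the actual logic in the VLAN driver, and the `SK_CAP_*` bits and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical capability bits for this sketch (not the real IFCAP_* values). */
#define SK_CAP_TXCSUM		0x1u
#define SK_CAP_VLAN_HWCSUM	0x2u

/*
 * Capabilities a VLAN interface ends up with, given its parent's:
 * TXCSUM only if the parent can checksum tagged frames, and never
 * VLAN_HWCSUM itself.
 */
static uint32_t
sk_vlan_child_caps(uint32_t parent_caps)
{
	const uint32_t need = SK_CAP_TXCSUM | SK_CAP_VLAN_HWCSUM;

	if ((parent_caps & need) == need)
		return (SK_CAP_TXCSUM);
	return (0);
}

/* Mirrors igb3 -> igb3.1 -> igb3.1.2; returns the caps of the innermost one. */
static uint32_t
sk_nested_vlan_example(void)
{
	uint32_t igb3 = SK_CAP_TXCSUM | SK_CAP_VLAN_HWCSUM;
	uint32_t igb3_1 = sk_vlan_child_caps(igb3);
	uint32_t igb3_1_2 = sk_vlan_child_caps(igb3_1);

	assert(igb3_1 == SK_CAP_TXCSUM);	/* first level keeps TXCSUM */
	return (igb3_1_2);
}
```

Under this model the second-level VLAN interface has no offload capability left, which matches what igb3.1.2 shows.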