There seems to be no reason to prevent setting an MTU larger than 1500 bytes.
An MTU greater than 1500 gives a significant increase in throughput (5.44 Gbits/sec versus 1.61 Gbits/sec in the tests below).
Details
iperf3 tests between two machines using vxlan over a 10 Gbit network with various MTUs.
Test 1. vxlan MTU: 1500, physical network MTU: 9000
# iperf3 -c 192.168.248.1
Connecting to host 192.168.248.1, port 5201
[ 5] local 192.168.248.2 port 1050 connected to 192.168.248.1 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 175 MBytes 1.46 Gbits/sec 0 1.27 MBytes
[ 5] 1.00-2.01 sec 194 MBytes 1.63 Gbits/sec 0 1.27 MBytes
[ 5] 2.01-3.00 sec 195 MBytes 1.64 Gbits/sec 0 1.27 MBytes
[ 5] 3.00-4.01 sec 196 MBytes 1.64 Gbits/sec 0 1.27 MBytes
[ 5] 4.01-5.00 sec 195 MBytes 1.64 Gbits/sec 0 1.27 MBytes
[ 5] 5.00-6.00 sec 195 MBytes 1.63 Gbits/sec 0 1.27 MBytes
[ 5] 6.00-7.00 sec 196 MBytes 1.64 Gbits/sec 0 1.27 MBytes
[ 5] 7.00-8.00 sec 194 MBytes 1.64 Gbits/sec 0 1.27 MBytes
[ 5] 8.00-9.00 sec 191 MBytes 1.60 Gbits/sec 493 968 KBytes
[ 5] 9.00-10.00 sec 193 MBytes 1.62 Gbits/sec 0 1.15 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 1.88 GBytes 1.61 Gbits/sec 493 sender
[ 5] 0.00-10.00 sec 1.88 GBytes 1.61 Gbits/sec receiver
iperf Done.
Test 2. vxlan MTU: 8900, physical network MTU: 9000
# iperf3 -c 192.168.248.1
Connecting to host 192.168.248.1, port 5201
[ 5] local 192.168.248.2 port 1052 connected to 192.168.248.1 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 585 MBytes 4.90 Gbits/sec 0 1.28 MBytes
[ 5] 1.00-2.00 sec 655 MBytes 5.50 Gbits/sec 0 1.28 MBytes
[ 5] 2.00-3.00 sec 656 MBytes 5.50 Gbits/sec 0 1.28 MBytes
[ 5] 3.00-4.00 sec 655 MBytes 5.50 Gbits/sec 0 1.28 MBytes
[ 5] 4.00-5.00 sec 656 MBytes 5.50 Gbits/sec 0 1.28 MBytes
[ 5] 5.00-6.00 sec 655 MBytes 5.50 Gbits/sec 0 1.28 MBytes
[ 5] 6.00-7.00 sec 656 MBytes 5.50 Gbits/sec 0 1.28 MBytes
[ 5] 7.00-8.00 sec 658 MBytes 5.51 Gbits/sec 0 1.28 MBytes
[ 5] 8.00-9.00 sec 655 MBytes 5.50 Gbits/sec 0 1.28 MBytes
[ 5] 9.00-10.00 sec 655 MBytes 5.50 Gbits/sec 0 1.28 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 6.33 GBytes 5.44 Gbits/sec 0 sender
[ 5] 0.00-10.00 sec 6.33 GBytes 5.44 Gbits/sec receiver
iperf Done.
Test 3. vxlan MTU: 8900, physical network MTU: 1500
# iperf3 -c 192.168.248.1
Connecting to host 192.168.248.1, port 5201
[ 5] local 192.168.248.2 port 1055 connected to 192.168.248.1 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 301 MBytes 2.52 Gbits/sec 0 1.28 MBytes
[ 5] 1.00-2.00 sec 335 MBytes 2.81 Gbits/sec 0 1.28 MBytes
[ 5] 2.00-3.00 sec 336 MBytes 2.82 Gbits/sec 0 1.28 MBytes
[ 5] 3.00-4.00 sec 336 MBytes 2.82 Gbits/sec 0 1.28 MBytes
[ 5] 4.00-5.00 sec 335 MBytes 2.81 Gbits/sec 0 1.28 MBytes
[ 5] 5.00-6.00 sec 336 MBytes 2.82 Gbits/sec 0 1.28 MBytes
[ 5] 6.00-7.00 sec 336 MBytes 2.82 Gbits/sec 0 1.28 MBytes
[ 5] 7.00-8.00 sec 335 MBytes 2.81 Gbits/sec 0 1.28 MBytes
[ 5] 8.00-9.00 sec 336 MBytes 2.82 Gbits/sec 0 1.28 MBytes
[ 5] 9.00-10.00 sec 336 MBytes 2.82 Gbits/sec 0 1.28 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 3.25 GBytes 2.79 Gbits/sec 0 sender
[ 5] 0.00-10.00 sec 3.25 GBytes 2.79 Gbits/sec receiver
iperf Done.
Additional test for MTU up to 65K.
Diff Detail
- Repository: rS FreeBSD src repository - subversion
- Lint: Not Applicable
- Unit Tests: Not Applicable
Event Timeline
Adding jumbo frame support looks good to me. However, would it be better to support this in ether_ioctl() instead of a driver-specific ioctl handler? The check of (ifr->ifr_mtu > ETHERMTU) in ether_ioctl() could be changed to test whether the interface has IFCAP_JUMBO_MTU set.
| sys/net/if_vxlan.c | | |
|---|---|---|
| 2273 | (On Diff #54585) | I think the maximum value should be calculated by taking the encapsulation overhead into account. VXLAN adds 50 to 100 bytes of headers, depending on the outer protocol. |
I think that MTU handling in ether_ioctl() requires more work and must be done very carefully, because it is used by many other drivers. Some drivers set the IFCAP_JUMBO_MTU flag but limit the maximum MTU to less than 9000 bytes. Therefore it is not easy to handle the various driver requirements in ether_ioctl(), and falling back to the standard MTU (1500) there may be reasonable.
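The capability-based check discussed above could look roughly like the following. This is a minimal userland sketch, not the kernel code: `struct sk_ifnet`, `sk_set_mtu()` and the `SK_` constants are hypothetical stand-ins for `struct ifnet` and the definitions in <net/ethernet.h> and <net/if.h>, and using 9000 as the jumbo ceiling is an assumption the reviewer did not state.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-ins for kernel definitions; names and the flag bit are illustrative. */
#define SK_ETHERMTU        1500  /* mirrors ETHERMTU */
#define SK_ETHERMTU_JUMBO  9000  /* mirrors ETHERMTU_JUMBO */
#define SK_IFCAP_JUMBO_MTU 0x20  /* hypothetical capability bit */

/* Hypothetical, heavily reduced interface structure. */
struct sk_ifnet {
	int if_capabilities;
	int if_mtu;
};

/*
 * Accept an MTU above the standard 1500 bytes only when the interface
 * advertises jumbo support. One global ceiling is assumed here.
 */
static int
sk_set_mtu(struct sk_ifnet *ifp, int mtu)
{
	int limit;

	assert(ifp != NULL);
	limit = (ifp->if_capabilities & SK_IFCAP_JUMBO_MTU) ?
	    SK_ETHERMTU_JUMBO : SK_ETHERMTU;
	if (mtu > limit)
		return (EINVAL);
	ifp->if_mtu = mtu;
	return (0);
}
```

The single `limit` is exactly where the objection applies: a driver that sets the jumbo flag but supports less than 9000 bytes would still need its own check.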
Revision changes:
- Increase maximum allowed MTU to 65K - 100 bytes
I agree that we can't handle this in ether_ioctl as it varies too much by real hardware.
| sys/net/if_vxlan.c | | |
|---|---|---|
| 90 | (On Diff #59684) | If it were possible to derive this from other constants, that would be ideal. <net/ethernet.h> has ETHER_MAX_LEN_JUMBO of only 9018, but some other drivers have higher maximums (9600 for cxgbe and 9728 for ixl(4)). So sadly there is no good constant for the 64k max. Is there an expression that would give you the max overhead? If so, you could use something like: (65535 - sizeof(struct vxlan_header) - ETHER_HDR_LEN - ETHER_CRC_LEN - ETHER_VLAN_ENCAP_LEN). if_ixl.c has something like that for setting the MTU. What I don't know is what the possible encapsulation overhead is for this interface. |
VXLAN encapsulates Ethernet frames within UDP/IP packets. So we can calculate the maximum overhead for IPv4:
- IP_MAXPACKET = 65535 bytes, a constant from netinet/ip.h.
- Maximum IP header length, IP_MAX_HDR_LEN (there is no suitable existing constant for it) = 60 bytes: the Internet Header Length field is an unsigned 4-bit count of 32-bit words, so 15 * 32 = 480 bits = 60 bytes.
- sizeof(struct udphdr) = 8 bytes.
- sizeof(struct vxlan_header) = 8 bytes.
- Inner frame ETHER_HDR_LEN = 14 bytes.
- Inner frame ETHER_CRC_LEN = 4 bytes.
- Inner frame ETHER_VLAN_ENCAP_LEN = 4 bytes.
The resulting VXLAN_MAX_MTU = IP_MAXPACKET - IP_MAX_HDR_LEN - sizeof(struct udphdr) - sizeof(struct vxlan_header) - ETHER_HDR_LEN - ETHER_CRC_LEN - ETHER_VLAN_ENCAP_LEN = 65535 - 60 - 8 - 8 - 14 - 4 - 4 = 65437.
So the overhead is 98 bytes.
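The arithmetic above can be checked with a small standalone sketch. The values are copied from the list; the `SK_` names and the two helper functions are hypothetical, and in the kernel most of these numbers would come from system headers rather than local definitions.

```c
#include <assert.h>

/* Local copies of the sizes listed above; SK_ names are illustrative only. */
enum {
	SK_IP_MAXPACKET         = 65535,  /* IP_MAXPACKET in netinet/ip.h */
	SK_IP_MAX_HDR_LEN       = 15 * 4, /* IHL is at most 15 words of 4 bytes */
	SK_UDP_HDR_LEN          = 8,      /* sizeof(struct udphdr) */
	SK_VXLAN_HDR_LEN        = 8,      /* sizeof(struct vxlan_header) */
	SK_ETHER_HDR_LEN        = 14,     /* inner frame header */
	SK_ETHER_CRC_LEN        = 4,      /* inner frame CRC */
	SK_ETHER_VLAN_ENCAP_LEN = 4       /* inner frame VLAN tag */
};

/* Worst-case IPv4 encapsulation overhead in bytes. */
static int
sk_vxlan_overhead_ipv4(void)
{
	return (SK_IP_MAX_HDR_LEN + SK_UDP_HDR_LEN + SK_VXLAN_HDR_LEN +
	    SK_ETHER_HDR_LEN + SK_ETHER_CRC_LEN + SK_ETHER_VLAN_ENCAP_LEN);
}

/* Largest inner MTU that still fits in one maximum-size IPv4 packet. */
static int
sk_vxlan_max_mtu_ipv4(void)
{
	return (SK_IP_MAXPACKET - sk_vxlan_overhead_ipv4());
}

/* Compile-time check that the constants reproduce the figure above. */
static_assert(SK_IP_MAXPACKET - (SK_IP_MAX_HDR_LEN + SK_UDP_HDR_LEN +
    SK_VXLAN_HDR_LEN + SK_ETHER_HDR_LEN + SK_ETHER_CRC_LEN +
    SK_ETHER_VLAN_ENCAP_LEN) == 65437, "VXLAN max MTU mismatch");
```

Calling sk_vxlan_overhead_ipv4() gives 98 and sk_vxlan_max_mtu_ipv4() gives 65437, matching the calculation by hand.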
Unfortunately, for IPv6 the maximum header length is not fixed, because extension headers can be added and new ones may be defined in the future. The fixed IPv6 header is 40 bytes.
Revision changes:
- Calculate VXLAN_MAX_MTU using existing constants.