
if_vxlan(4): Allow setting the MTU above 1500 bytes.
ClosedPublic

Authored by aleksandr.fedorov_itglobal.com on Mar 1 2019, 3:50 PM.

Details

Summary

There seems to be no reason to prevent setting an MTU larger than 1500 bytes.
An MTU greater than 1500 gives a significant increase in throughput (see the iperf3 results in the test plan).

Test Plan

iperf3 tests between two machines using vxlan over a 10 Gbit network with various MTUs.

Test 1. vxlan MTU - 1500, physical network MTU - 9000

# iperf3 -c 192.168.248.1
Connecting to host 192.168.248.1, port 5201
[  5] local 192.168.248.2 port 1050 connected to 192.168.248.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   175 MBytes  1.46 Gbits/sec    0   1.27 MBytes       
[  5]   1.00-2.01   sec   194 MBytes  1.63 Gbits/sec    0   1.27 MBytes       
[  5]   2.01-3.00   sec   195 MBytes  1.64 Gbits/sec    0   1.27 MBytes       
[  5]   3.00-4.01   sec   196 MBytes  1.64 Gbits/sec    0   1.27 MBytes       
[  5]   4.01-5.00   sec   195 MBytes  1.64 Gbits/sec    0   1.27 MBytes       
[  5]   5.00-6.00   sec   195 MBytes  1.63 Gbits/sec    0   1.27 MBytes       
[  5]   6.00-7.00   sec   196 MBytes  1.64 Gbits/sec    0   1.27 MBytes       
[  5]   7.00-8.00   sec   194 MBytes  1.64 Gbits/sec    0   1.27 MBytes       
[  5]   8.00-9.00   sec   191 MBytes  1.60 Gbits/sec  493    968 KBytes       
[  5]   9.00-10.00  sec   193 MBytes  1.62 Gbits/sec    0   1.15 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.88 GBytes  1.61 Gbits/sec  493             sender
[  5]   0.00-10.00  sec  1.88 GBytes  1.61 Gbits/sec                  receiver

iperf Done.

Test 2. vxlan MTU - 8900, physical network MTU - 9000

# iperf3 -c 192.168.248.1
Connecting to host 192.168.248.1, port 5201
[  5] local 192.168.248.2 port 1052 connected to 192.168.248.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   585 MBytes  4.90 Gbits/sec    0   1.28 MBytes
[  5]   1.00-2.00   sec   655 MBytes  5.50 Gbits/sec    0   1.28 MBytes
[  5]   2.00-3.00   sec   656 MBytes  5.50 Gbits/sec    0   1.28 MBytes
[  5]   3.00-4.00   sec   655 MBytes  5.50 Gbits/sec    0   1.28 MBytes
[  5]   4.00-5.00   sec   656 MBytes  5.50 Gbits/sec    0   1.28 MBytes
[  5]   5.00-6.00   sec   655 MBytes  5.50 Gbits/sec    0   1.28 MBytes
[  5]   6.00-7.00   sec   656 MBytes  5.50 Gbits/sec    0   1.28 MBytes
[  5]   7.00-8.00   sec   658 MBytes  5.51 Gbits/sec    0   1.28 MBytes
[  5]   8.00-9.00   sec   655 MBytes  5.50 Gbits/sec    0   1.28 MBytes
[  5]   9.00-10.00  sec   655 MBytes  5.50 Gbits/sec    0   1.28 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  6.33 GBytes  5.44 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  6.33 GBytes  5.44 Gbits/sec                  receiver

iperf Done.

Test 3. vxlan MTU - 8900, physical network MTU - 1500

# iperf3 -c 192.168.248.1
Connecting to host 192.168.248.1, port 5201
[  5] local 192.168.248.2 port 1055 connected to 192.168.248.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   301 MBytes  2.52 Gbits/sec    0   1.28 MBytes
[  5]   1.00-2.00   sec   335 MBytes  2.81 Gbits/sec    0   1.28 MBytes
[  5]   2.00-3.00   sec   336 MBytes  2.82 Gbits/sec    0   1.28 MBytes
[  5]   3.00-4.00   sec   336 MBytes  2.82 Gbits/sec    0   1.28 MBytes
[  5]   4.00-5.00   sec   335 MBytes  2.81 Gbits/sec    0   1.28 MBytes
[  5]   5.00-6.00   sec   336 MBytes  2.82 Gbits/sec    0   1.28 MBytes
[  5]   6.00-7.00   sec   336 MBytes  2.82 Gbits/sec    0   1.28 MBytes
[  5]   7.00-8.00   sec   335 MBytes  2.81 Gbits/sec    0   1.28 MBytes
[  5]   8.00-9.00   sec   336 MBytes  2.82 Gbits/sec    0   1.28 MBytes
[  5]   9.00-10.00  sec   336 MBytes  2.82 Gbits/sec    0   1.28 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  3.25 GBytes  2.79 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  3.25 GBytes  2.79 Gbits/sec                  receiver

iperf Done.

Additional test for MTU up to 65K.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

rgrimes accepted this revision as: rgrimes. Mar 1 2019, 4:56 PM
This revision is now accepted and ready to land. Mar 1 2019, 4:56 PM
hrs added a comment. Mar 12 2019, 7:33 AM

Adding jumbo frame support looks good to me. However, would it be better to support this in ether_ioctl() instead of a driver-specific ioctl handler? The check of (ifr->ifr_mtu > ETHERMTU) in ether_ioctl() could be changed to test whether the interface has IFCAP_JUMBO_MTU set.

sys/net/if_vxlan.c
2273 ↗(On Diff #54585)

I think the maximum value should be calculated by taking the encapsulation overhead into account. VXLAN adds 50-100 bytes of headers, depending on the outer protocol.
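To make the overhead concrete, here is a minimal sketch of the arithmetic for the common cases: an outer IPv4 header without options or an outer IPv6 base header without extension headers, and an untagged inner frame. The enum names and both helper functions are illustrative and are not taken from if_vxlan.c.

```c
#include <assert.h>

/* Fixed header sizes in bytes. The names are illustrative only. */
enum {
	OUTER_IPV4_HDR  = 20,	/* IPv4 header without options */
	OUTER_IPV6_HDR  = 40,	/* IPv6 base header, no extension headers */
	OUTER_UDP_HDR   = 8,
	VXLAN_HDR       = 8,
	INNER_ETHER_HDR = 14	/* untagged inner Ethernet header */
};

/* Bytes placed in front of the inner payload, counted against the physical MTU. */
static int
vxlan_overhead(int outer_is_ipv6)
{
	int ip_hdr = outer_is_ipv6 ? OUTER_IPV6_HDR : OUTER_IPV4_HDR;

	return (ip_hdr + OUTER_UDP_HDR + VXLAN_HDR + INNER_ETHER_HDR);
}

/*
 * Largest vxlan MTU that still fits in one outer packet on a link with
 * the given physical MTU. Returns 0 if the physical MTU is too small.
 */
static int
vxlan_mtu_no_frag(int phys_mtu, int outer_is_ipv6)
{
	int overhead = vxlan_overhead(outer_is_ipv6);

	return (phys_mtu > overhead ? phys_mtu - overhead : 0);
}
```

Under these assumptions the overhead is 50 bytes over IPv4 and 70 bytes over IPv6, so a 9000-byte physical MTU leaves room for a vxlan MTU of up to 8950 over IPv4. The 8900 used in Test 2 fits within that, while the same 8900 over a 1500-byte physical MTU (Test 3) can only work if the outer packets are fragmented.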

aleksandr.fedorov_itglobal.com edited the test plan for this revision.

I think that MTU handling in ether_ioctl() requires more work and must be done very carefully, because it is used by many other drivers. Some drivers set the IFCAP_JUMBO_MTU flag but limit the maximum MTU to less than 9000 bytes. It is therefore not easy to handle the various driver requirements in ether_ioctl(), and falling back to the standard MTU size (1500) there may be reasonable.
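The concern can be illustrated with a small user-space model. It is only a sketch: struct model_ifnet, its driver_max_mtu field, the MODEL_* constants and both functions are hypothetical stand-ins, not the kernel's struct ifnet or the real ether_ioctl(), and lower-bound checks are left out.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_ETHERMTU		1500
#define MODEL_IFCAP_JUMBO_MTU	0x1	/* hypothetical flag value */

/* Hypothetical, heavily simplified interface state. */
struct model_ifnet {
	int capabilities;	/* capability flags */
	int driver_max_mtu;	/* limit known only to the driver */
	int mtu;		/* currently configured MTU */
};

/* Generic check based on the capability flag alone. */
static int
generic_set_mtu(struct model_ifnet *ifp, int mtu)
{
	if (mtu > MODEL_ETHERMTU &&
	    (ifp->capabilities & MODEL_IFCAP_JUMBO_MTU) == 0)
		return (EINVAL);
	ifp->mtu = mtu;
	return (0);
}

/* Driver-specific check that knows the real hardware limit. */
static int
driver_set_mtu(struct model_ifnet *ifp, int mtu)
{
	if (mtu > ifp->driver_max_mtu)
		return (EINVAL);
	ifp->mtu = mtu;
	return (0);
}
```

For a model interface that advertises the jumbo flag but has a driver limit of 4000 bytes, generic_set_mtu() accepts a request for 9000 while driver_set_mtu() rejects it. A flag-only test in shared code would therefore need extra per-driver information to be safe.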

Revision changes:

  • Increase maximum allowed MTU to 65K - 100 bytes
This revision now requires review to proceed. Jul 12 2019, 10:14 AM
aleksandr.fedorov_itglobal.com marked an inline comment as done. Jul 12 2019, 10:21 AM
jhb added a comment. Jul 12 2019, 5:33 PM

I agree that we can't handle this in ether_ioctl as it varies too much by real hardware.

sys/net/if_vxlan.c
90 ↗(On Diff #59684)

If it were possible to derive this from other constants, that would be ideal. <net/ethernet.h> has ETHER_MAX_LEN_JUMBO, but that is only 9018, and some other drivers have higher maximums (9600 for cxgbe and 9728 for ixl(4)). So sadly there is no good constant for the 64k max. Is there an expression that would give you the max overhead? If so, you could use something like:

(65535 - sizeof(struct vxlan_header) - ETHER_HDR_LEN - ETHER_CRC_LEN - ETHER_VLAN_ENCAP_LEN)

if_ixl.c has something like that for setting the MTU. What I don't know is what the possible encapsulation overhead is for this interface.

VXLAN encapsulates Ethernet frames within UDP/IP packets, so we can calculate the maximum overhead for IPv4:

  • IP_MAXPACKET = 65535, a constant from netinet/ip.h.
  • Maximum IP header length, IP_MAX_HDR_LEN (there is no suitable existing constant for it), = 60 bytes: the Internet Header Length field is an unsigned 4-bit count of 32-bit words, so the maximum is 15 * 32 = 480 bits = 60 bytes.
  • sizeof(struct udphdr) = 8 bytes.
  • sizeof(struct vxlan_header) = 8 bytes.
  • Inner frame ETHER_HDR_LEN = 14 bytes.
  • Inner frame ETHER_CRC_LEN = 4 bytes.
  • Inner frame ETHER_VLAN_ENCAP_LEN = 4 bytes.

The resulting VXLAN_MAX_MTU = IP_MAXPACKET - IP_MAX_HDR_LEN - sizeof(struct udphdr) - sizeof(struct vxlan_header) - ETHER_HDR_LEN - ETHER_CRC_LEN - ETHER_VLAN_ENCAP_LEN = 65535 - 60 - 8 - 8 - 14 - 4 - 4 = 65437.
So the overhead is 98 bytes.
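The calculation above can be checked at compile time with a short stand-alone sketch. The SK_-prefixed macros and the two sk_ structs are local stand-ins, declared here so the file builds outside the kernel tree. The committed code may spell the definition differently.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the constants named in the calculation. */
#define SK_IP_MAXPACKET		65535	/* maximum IPv4 packet size */
#define SK_IP_MAX_HDR_LEN	60	/* 15 words * 4 bytes (max IHL) */
#define SK_ETHER_HDR_LEN	14
#define SK_ETHER_CRC_LEN	4
#define SK_ETHER_VLAN_ENCAP_LEN	4

/* Stand-in 8-byte headers, so that sizeof() yields the sizes used above. */
struct sk_udphdr {
	uint16_t sport, dport, len, sum;
};
struct sk_vxlan_header {
	uint32_t flags;
	uint32_t vni;
};

#define SK_VXLAN_MAX_MTU						\
	(SK_IP_MAXPACKET - SK_IP_MAX_HDR_LEN -				\
	 sizeof(struct sk_udphdr) - sizeof(struct sk_vxlan_header) -	\
	 SK_ETHER_HDR_LEN - SK_ETHER_CRC_LEN - SK_ETHER_VLAN_ENCAP_LEN)

/* The build fails if the arithmetic does not match the value above. */
_Static_assert(SK_VXLAN_MAX_MTU == 65437, "unexpected VXLAN max MTU");

/* Worst-case IPv4 encapsulation overhead in bytes. */
static int
sk_vxlan_max_overhead(void)
{
	return ((int)(SK_IP_MAXPACKET - SK_VXLAN_MAX_MTU));
}
```

Deriving the limit from named sizes rather than writing 65437 directly keeps the intent visible, which is what the review comment asked for.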

Unfortunately, for IPv6 the maximum header length is not fixed, because extension headers can be added after the base header. The base header itself is 40 bytes.

Revision changes:

  • Calculate VXLAN_MAX_MTU using existing constants.
aleksandr.fedorov_itglobal.com marked an inline comment as done. Jul 15 2019, 8:43 AM
jhb accepted this revision. Jul 15 2019, 7:52 PM

Thanks!

This revision is now accepted and ready to land. Jul 15 2019, 7:52 PM

Can anyone commit this patch?

This revision was automatically updated to reflect the committed changes.