Sun, Mar 29
Sat, Mar 28
- Allow entering the MTU value in hex.
Thu, Mar 19
I tested this patch in our test lab with a lot of netgraph nodes, and I find it very useful.
I look at this issue from a network virtualization point of view.
I have a plan (and patches) for adding native Netgraph support to the Bhyve network backend through ng_socket(4).
For this case, it would be very useful to be able to create an ng_bridge(4) with the following options:
- One "uplink" hook with learning turned off, but to which all frames with unknown MACs are still forwarded.
- The rest of the hooks have learning enabled, but frames with unknown MACs are not sent to them.
Wed, Mar 18
Can we come to some kind of consensus?
Mon, Mar 16
- Add additional checks (see Test Plan).
I understand your opinion, but since we do not know what kind of OS will be used as a guest, I think the additional checks are not superfluous. It is also easier to track down errors if bhyve crashes earlier rather than later.
Sat, Mar 14
Fri, Mar 13
- Fix typo.
- Enable VIRTIO_NET_F_MTU only if the user provides the mtu argument.
Yes, I tested various MTU values (including 9000) with different backends (if_tuntap(4), vale). I did not find any regressions without using the mtu argument.
Thu, Mar 12
- Fix typo.
- Disable VIRTIO_NET_F_MQ flag.
- Check the lower boundary of the MTU against ETHERMIN.
Wed, Mar 11
I agree that the lower bound looks weird; it came from Linux:
Tue, Mar 10
- Move the MTU value checks to pci_virtio_net.c and add the corresponding min/max definitions from the virtio specification (a rough sketch of the idea follows this list).
- Fix the default MTU value.
- Add the VIRTIO_NET_F_MQ capability and initialize max_virtqueue_pairs.
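In essence the check just clamps the user-supplied value to the range that the 16-bit virtio config-space mtu field can carry. A minimal sketch of that idea, assuming a hypothetical helper and constant (vtnet_parse_mtu, VTNET_MAX_MTU); this is not the code under review:

```
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#include <net/ethernet.h>	/* ETHERMIN */

#define VTNET_MAX_MTU	65535	/* the virtio config-space mtu field is 16 bits */

static int
vtnet_parse_mtu(const char *opt, uint16_t *mtu)
{
	char *end;
	long val;

	val = strtol(opt, &end, 0);	/* base 0 also accepts hex, e.g. 0x2328 */
	if (*end != '\0' || val < ETHERMIN || val > VTNET_MAX_MTU)
		return (EINVAL);
	*mtu = (uint16_t)val;
	return (0);
}
```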
Fri, Mar 6
Oh, I'm glad to hear that. Could you comment on support from the bhyve side?
Thu, Mar 5
Is it really useful to have multiple uplinks?
Feb 28 2020
Feb 26 2020
Feb 20 2020
I tested this patch with various OSes (Windows, Ubuntu, CentOS) and didn't find any issues, so I think it's ready to go.
Feb 18 2020
Sorry, I didn’t have time to test the patch. But as for the code, I have no questions.
Feb 12 2020
divert sockets can be used with other software, but they present exactly the same significant overhead.
Do you have an example where the suggested change actually improves performance?
Ping. Are there any issues that prevent committing these changes?
divert(4) sockets can be used not only with natd(8), so the changes look reasonable to me.
Feb 7 2020
Jan 30 2020
Yes, indeed. But the Linux virtio-net driver should also do the same when mergeable RX buffers are not allocated, so it is weird that it does not work in that case!
Hah, I checked a FreeBSD 12 guest and it works. It seems that the FreeBSD guest driver allocates descriptors of size >= MTU.
Have you set MTU 9000 on vtnet0 in the guest?
My tests were VM-to-VM, both FreeBSD guests.
Thanks for the suggestion. I retried with -D (on head), and it still works. I cannot reproduce the broken jumbo-frames+tap...
Jan 27 2020
The changes look good to me.
Jan 24 2020
Jan 22 2020
I think this patch is a step in the right direction. Right now it is very difficult to add a storage backend that is not built on file-like operations.
Jan 16 2020
Jan 14 2020
Dec 25 2019
Dec 24 2019
I hit this KASSERT (https://svnweb.freebsd.org/base/head/sys/amd64/vmm/vmm_dev.c?view=markup#l799) when trying to attach a debugger to a bhyve process.
Dec 4 2019
We are also interested in this interface.
It looks good for use with vale(4) to connect a virtual network to the host stack: bhyve VMs - vale(4) - if_vether(4) - host. I tested this and it works well. But for now we use if_epair(4) for this configuration, because I didn't know this port existed.
Nov 26 2019
I welcome these changes. The current debug output looks like a mess.
Moreover, some of the files declare (W/D)PRINTF as
Nov 19 2019
The race would not have been fatal (but is still good to fix) if the default had been the more compatible behavior.
Nov 9 2019
The changes look good to me, and I cannot reproduce the bug: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241808
Oct 25 2019
I think you need to add people with good knowledge of netgraph as reviewers: Julian Elischer, Gleb Smirnoff, and Alexander Motin, because apparently not everyone is subscribed to the network mailing list.
Sep 14 2019
Sep 4 2019
I tested this patch in various configurations using iperf3:
Aug 14 2019
Oops, you're right. Sorry for the noise.
Aug 13 2019
It seems I found another incorrect solution:
Jul 31 2019
Jul 24 2019
I tried to compare performance on a dedicated server.
Host: single-processor Xeon E5-2630 v4 @ 2.20 GHz, 128 GB RAM, the latest FreeBSD CURRENT.
Jul 23 2019
Sorry, but I didn't compare the performance. I ran my tests on a machine loaded with other tasks. The throughput between two Ubuntu 16.04 VMs fluctuates between 16 and 18 Gbit/s, sometimes rising to 28 Gbit/s; FreeBSD to FreeBSD is about 7-8 Gbit/s. But as I said, the host machine was loaded with other tasks. Also, this machine has two processors, and I clearly observed NUMA effects. I will try to compare the performance on a separate test server tomorrow.
I tested the updated patch with iperf3 in various combinations:
- VM (Ubuntu 16.04) - vale - VM (Ubuntu 16.04)
- VM (FreeBSD 13) - vale - VM (Ubuntu 16.04)
- VM (Ubuntu 16.04) - vale - host (if_epair)
- VM (FreeBSD 13) - vale - host (if_epair)
Jul 22 2019
Jul 19 2019
I tested this patch in our test lab. It works fine.
Jul 17 2019
Jul 16 2019
Can anyone commit this patch?
Jul 15 2019
VXLAN encapsulates Ethernet frames within UDP/IP packets, so we can calculate the maximum overhead for IPv4 (a worked calculation follows the list):
- IP_MAXPACKET = 65535 bytes, a constant from netinet/ip.h.
- Maximum IP header length IP_MAX_HDR_LEN (there is no suitable constant for it) = 60 bytes, since the Internet Header Length field is an unsigned 4-bit count of 32-bit words: 15 * 32 bits = 480 bits = 60 bytes.
- sizeof(struct udphdr) = 8 bytes.
- sizeof(struct vxlan_header) = 8 bytes.
- Inner frame ETHER_HDR_LEN = 14 bytes.
- Inner frame ETHER_CRC_LEN = 4 bytes.
- Inner frame ETHER_VLAN_ENCAP_LEN = 4 bytes.
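Putting those numbers together, here is a quick self-contained check (only an illustration; the macro names other than IP_MAXPACKET and the ETHER_* ones are just labels for the values listed above):

```
#include <stdio.h>

#define IP_MAXPACKET		65535	/* netinet/ip.h */
#define IP_MAX_HDR_LEN		60	/* 15 x 32-bit words; no named constant exists */
#define UDP_HDR_LEN		8	/* sizeof(struct udphdr) */
#define VXLAN_HDR_LEN		8	/* sizeof(struct vxlan_header) */
#define ETHER_HDR_LEN		14	/* value from net/ethernet.h */
#define ETHER_CRC_LEN		4	/* value from net/ethernet.h */
#define ETHER_VLAN_ENCAP_LEN	4	/* value from net/ethernet.h */

int
main(void)
{
	int overhead = IP_MAX_HDR_LEN + UDP_HDR_LEN + VXLAN_HDR_LEN +
	    ETHER_HDR_LEN + ETHER_CRC_LEN + ETHER_VLAN_ENCAP_LEN;

	printf("worst-case IPv4 VXLAN overhead: %d bytes\n", overhead);	/* 98 */
	printf("largest inner payload: %d bytes\n", IP_MAXPACKET - overhead);
	return (0);
}
```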
Jul 12 2019
I think that MTU handling in ether_ioctl() requires more work and must be done very carefully, because it is used by many other drivers. Some drivers set the IFCAP_JUMBO_MTU flag but limit the maximum MTU to less than 9000 bytes. Therefore it is not so easy to handle the various driver requirements in ether_ioctl(), and falling back to the standard MTU size (1500) may be reasonable.
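For illustration only, a hypothetical driver (not any of the real ones): the point is that the SIOCSIFMTU limit is per-driver knowledge, which is exactly what a generic check in ether_ioctl() does not have.

```
#include <sys/param.h>
#include <sys/errno.h>
#include <sys/socket.h>
#include <sys/sockio.h>

#include <net/if.h>
#include <net/if_var.h>
#include <net/ethernet.h>

#define MYDRV_MAX_MTU	9000	/* hypothetical hardware limit; other NICs may allow less */

/* Hypothetical SIOCSIFMTU handler: only the driver knows its real limit. */
static int
mydrv_change_mtu(struct ifnet *ifp, struct ifreq *ifr)
{
	if (ifr->ifr_mtu < ETHERMIN || ifr->ifr_mtu > MYDRV_MAX_MTU)
		return (EINVAL);
	ifp->if_mtu = ifr->ifr_mtu;
	/* a real driver would also resize RX buffers / reinitialize here */
	return (0);
}
```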
Jul 1 2019
I tested the latest version of the patch on real hardware with different operating systems and different MTUs, using iperf3 and ping -f -c. It seems everything works as expected.
VM1 (Ubuntu 16.04) - VALE - VM2 (Ubuntu 16.04) iperf3 speed 16 Gbits/s
VM1 (FreeBSD 12) - VALE - VM2 (FreeBSD 12) iperf3 speed 11 Gbits/s
VM1 (Ubuntu 16.04) - VALE - VM2 (FreeBSD 12) iperf3 speed 14 Gbits/s
Jun 28 2019
Jun 7 2019
Rename vq_get_mrgrx_bufs() to vq_getbufs_mrgrx() and vq_relchain_mrgrx() to vq_relbufs_mrgrx() to match the overall naming style.
- Reuse vq_getchain() to handle various negotiated features (TSO, MRG_RXBUF, INDIRECT descriptors).
- Add two helper functions, vq_get_mrgrx_bufs() and vq_relchain_mrgrx(), and move them to virtio.[ch].
May 30 2019
Vincenzo, you are right. I'm trying to rewrite the code to handle the various negotiated features, but I need some more time to test it.
May 23 2019
Change r->cur to r->head in the RX path.
Vincenzo, thanks for the review.
The main motivation for writing the custom method is optimization. The function vq_getchain() returns virtio descriptors that are chained using the VRING_DESC_F_NEXT flag, but virtio-net guest drivers do not present the available descriptors as a chain, so vq_getchain() returns only one descriptor per call. This function also has a side effect: it increments the vq->vq_last_avail index, which has to be decremented if there are not enough descriptors to store the RX packet.
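A toy illustration of that rollback pattern (the types and names below are made up, not bhyve's actual virtqueue API): each successful fetch advances the last-avail index, so a helper that needs several buffers for one packet must restore the saved index when the ring comes up short.

```
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

struct toy_vq {
	uint16_t	last_avail;	/* next available-ring slot to consume */
	uint16_t	avail_count;	/* number of buffers the guest has posted */
	struct iovec	*bufs;		/* guest buffers, one per slot */
};

/* Fetch the next posted buffer, mimicking the one-descriptor-per-call behavior. */
static bool
toy_getchain(struct toy_vq *vq, struct iovec *iov)
{
	if (vq->last_avail == vq->avail_count)
		return (false);
	*iov = vq->bufs[vq->last_avail++];	/* side effect: the index advances */
	return (true);
}

/* Gather enough buffers for pktlen bytes, or roll last_avail back. */
static int
toy_getbufs(struct toy_vq *vq, struct iovec *iov, int niov, size_t pktlen)
{
	uint16_t saved = vq->last_avail;
	size_t got = 0;
	int n = 0;

	while (got < pktlen && n < niov && toy_getchain(vq, &iov[n])) {
		got += iov[n].iov_len;
		n++;
	}
	if (got < pktlen) {
		vq->last_avail = saved;		/* undo the increments */
		return (-1);
	}
	return (n);
}
```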
May 22 2019
May 20 2019
Fix incorrect usage of 'cur_rx_ring' on the TX path.
May 17 2019
May 16 2019
Mar 1 2019
Feb 18 2019
Sorry for the long delay. My tests didn't find any regressions.
Feb 14 2019
Oops, I forgot about this. I will test the current patch.
Feb 13 2019
Jan 30 2019
I did not find any problems with the latest patch.
Jan 29 2019
I'm worried about the increased load on the kernel thread that serves the taskqueue_swi and how it will affect the rest of the system.
Jan 25 2019
Looks good. I will try to test this patch tomorrow.
I ran the following test, starting simultaneously:
- 8 RX netmap_test processes (D18876) on different vale ports (0-7).
- 8 TX pkt-gen processes on different vale ports (0-7).
Jan 18 2019
Everything works fine with the latest patch (Diff 2).
I tested the patch with two bhyve VMs connected through a VALE switch; there is no panic.