
bhyve: add support for virtio-net mergeable rx buffers
Closed, Public

Authored by vmaffione on Jul 20 2019, 9:29 AM.

Details

Summary

Mergeable rx buffers is a virtio-net feature that allows the hypervisor to use multiple RX descriptor chains to receive a single packet. Without this feature, a TSO-enabled guest is compelled to publish only 64K-long (or 32K-long) chains, and each of these large buffers is consumed to receive a single packet, even a very short one. This is a waste of memory, as an RX queue has room for 256 chains, which means up to 16MB of buffer memory for each (single-queue) vtnet device.
With the feature on, the guest can publish 2K-long chains, and the hypervisor can merge them as needed.
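
As a rough, self-contained sketch (not the actual bhyve code), the example below models the num_buffers field carried by the mergeable-rx variant of the virtio-net header and how many 2K chains a packet ends up spanning; rxbuf_fill() and CHAIN_LEN are made-up names for the illustration.

  #include <stdint.h>
  #include <stdio.h>
  #include <string.h>

  /* Layout of the virtio-net rx header when mergeable rx buffers are
   * negotiated: the extra num_buffers field tells the guest how many
   * descriptor chains the hypervisor consumed for this packet. */
  struct virtio_net_mrg_rxhdr {
      uint8_t  flags;
      uint8_t  gso_type;
      uint16_t hdr_len;
      uint16_t gso_size;
      uint16_t csum_start;
      uint16_t csum_offset;
      uint16_t num_buffers;   /* only present with mergeable rx buffers */
  };

  #define CHAIN_LEN 2048      /* the guest publishes 2K-long chains */

  /* Made-up helper: compute how many 2K chains a packet of 'len' bytes
   * spans (the header sits at the start of the first chain) and record
   * that count in num_buffers. */
  static uint16_t
  rxbuf_fill(struct virtio_net_mrg_rxhdr *hdr, size_t len)
  {
      size_t total = sizeof(*hdr) + len;
      uint16_t nchains = (uint16_t)((total + CHAIN_LEN - 1) / CHAIN_LEN);

      memset(hdr, 0, sizeof(*hdr));
      hdr->num_buffers = nchains;
      return (nchains);
  }

  int
  main(void)
  {
      struct virtio_net_mrg_rxhdr hdr;

      /* Chains spanned by a full-size TSO frame and by a short packet. */
      printf("%u %u\n", (unsigned)rxbuf_fill(&hdr, 65535),
          (unsigned)rxbuf_fill(&hdr, 60));
      return (0);
  }

Without the feature there is no num_buffers field, and the whole packet has to fit in the single (64K/32K) chain.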

This change also enables the feature in the netmap backend, which supports virtio-net offloads.
The plan is to add support to the tap backend too.
Note that, unlike QEMU/KVM, we implement one-copy receive here, whereas QEMU uses two copies.

This patch depends on https://reviews.freebsd.org/D20987

Test Plan

Two VMs connected to the same VALE switch. Debug kernel (GENERIC).
I ran netperf TCP_MAERTS and TCP_STREAM.
With mergeable RX buffers on: ~7.5 Gbps
Without mergeable RX buffers: ~6.5 Gbps

More testing would be appreciated (e.g. with GENERIC-NODEBUG).

Diff Detail

Repository
rS FreeBSD src repository - subversion

Event Timeline

vmaffione retitled this revision from bhyve: add support virtio-net mergeable rx buffers to bhyve: add support for virtio-net mergeable rx buffers. Jul 20 2019, 9:29 AM
afedorov added inline comments.
usr.sbin/bhyve/pci_virtio_net.c
228 ↗(On Diff #59962)

I caught this assert(n == 0) in my tests (two Ubuntu 16.04 VMs + a VALE switch). It seems there is nothing to prevent vq_getchain() from returning 0.

Fix issue identified by @aleksandr.fedorov_itglobal.com

vmaffione added inline comments.
usr.sbin/bhyve/pci_virtio_net.c
228 ↗(On Diff #59962)

Thank you, this helps. I forgot to check that chains after the first one are indeed available.
Now the issue should be fixed. Could you please check what happens with your testbed now?
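
A minimal, self-contained sketch of the kind of check involved (not the actual diff; get_next_chain() is a made-up stand-in for vq_getchain(), which the discussion above notes can return 0 when the guest has not published another available chain):

  #include <stdio.h>

  /* Made-up stand-in for vq_getchain(): returns the number of descriptors
   * in the next available chain, or 0 when the guest has not published
   * any more chains. */
  static int
  get_next_chain(int *avail)
  {
      if (*avail == 0)
          return (0);
      (*avail)--;
      return (1);
  }

  /* Gather the 'needed' chains for one packet, bailing out if the guest
   * runs out of published chains instead of asserting that one is always
   * available. */
  static int
  receive_packet(int needed, int *avail)
  {
      int i;

      for (i = 0; i < needed; i++) {
          if (get_next_chain(avail) == 0) {
              /* Not enough chains: drop or defer the packet. */
              return (-1);
          }
      }
      return (0);
  }

  int
  main(void)
  {
      int avail = 2;  /* the guest published only two chains */

      /* A packet needing three chains cannot be received right now. */
      printf("%d\n", receive_packet(3, &avail));
      return (0);
  }

The point is simply that availability of the first chain does not guarantee availability of the subsequent chains needed to complete a merged packet.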

I tested the updated patch with iperf3 in various combinations:

  1. VM (Ubuntu 16.04) - vale - VM (Ubuntu 16.04)
  2. VM (FreeBSD 13) - vale - VM (Ubuntu 16.04)
  3. VM (Ubuntu 16.04) - vale - host (if_epair)
  4. VM (FreeBSD 13) - vale - host (if_epair)

And I didn't find any problems.

Thanks. Did you notice any change in terms of performance?

Sorry, but I didn't compare the performance: I ran my tests on a machine loaded with other tasks, and since it has two processors I also clearly observed NUMA effects. The throughput between two Ubuntu 16.04 VMs floats between 16 and 18 Gbit/s, sometimes increasing up to 28 Gbit/s; FreeBSD to FreeBSD is ~7-8 Gbit/s. I will try to compare the performance on a dedicated test server tomorrow.

I tried to compare performance on a dedicated server.
Host: single-processor Xeon E5-2630 v4 @ 2.20GHz, 128 GB RAM, running the latest FreeBSD CURRENT.

iperf3 tests.

Before patching:

  • VM (Ubuntu 16.04) - vale - VM (Ubuntu 16.04): ~21.9 Gbit/s
  • VM (FreeBSD CURRENT) - vale - VM (FreeBSD CURRENT): ~6.0 Gbit/s
  • VM (FreeBSD 12R) - vale - VM (FreeBSD 12R): ~11.2 Gbit/s

With mergeable buffers:

  • VM (Ubuntu 16.04) - vale - VM (Ubuntu 16.04): ~27.3 Gbit/s
  • VM (FreeBSD CURRENT) - vale - VM (FreeBSD CURRENT): ~6.3 Gbit/s
  • VM (FreeBSD 12R) - vale - VM (FreeBSD 12R): ~12.2 Gbit/s

So, for the Ubuntu VMs there is a clear increase in throughput.
It seems that the FreeBSD VMs hit a limit before the difference becomes noticeable.

Thanks a lot for your effort!
To be honest, I'm not sure why the throughput increases so much, since TSO (64KB unchecksummed packets) is being used in both cases.
The main difference is that with mergeable rx buffers there is less pressure on the guest memory allocators, since the driver can allocate 2K clusters rather than bigger buffers.
Also, with mergeable rx buffers bhyve may do a little more work, because it needs to call vq_getchain() 33 times in order to receive each packet.
In any case, your results look very good to me, and they also agree with mine (obtained on a smaller and less powerful machine).
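
(For reference, under the assumption of 2048-byte chains and a 12-byte mergeable-rx virtio-net header, a maximum-size 65535-byte TSO frame spans ceil((65535 + 12) / 2048) = ceil(65547 / 2048) = 33 chains, which matches the 33 vq_getchain() calls mentioned above.)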

This looks ok to me generally. Do you need to refresh this after other recent commits?

Thanks for looking at this.
No, the patch is ready as is (I just retested everything again).

This revision is now accepted and ready to land. Nov 8 2019, 5:24 PM