Differential D20609

bhyve: vtnet: fix locking on receive
ClosedPublic
Actions

Authored by vmaffione on Jun 11 2019, 5:18 PM.

Details

Reviewers

bryanv
jhb
markj

Group Reviewers

bhyve

Commits

rS349740: bhyve: vtnet: fix locking on receive
rS349696: MFC r349175
rS349175: bhyve: vtnet: fix locking on receive

Summary

The vsc_rx_ready and the RX virtqueue is protected by the rx_mtx lock.
However, pci_vtnet_ping_rxq() (currently called only once after each device reset)
accesses those without acquiring the lock.

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

vmaffione created this revision.Jun 11 2019, 5:18 PM

Herald added subscribers: bcran, rgrimes, imp. · View Herald TranscriptJun 11 2019, 5:18 PM

It's not obvious to me why queue notify can't be a common operation, just looking at callers of virtqueue_notify(). Why do you say that it only happens on reset?

Did you consider doing a double check, i.e.,

if (sc->vsc_rx_ready == 0)
    pthread_mutex_lock();
    if (sc->vsc_rx_ready == 0) {
    ...
    }
    pthread_mutex_unlock();
}

Do you mean a common vq_notify callback for both transmit and receive queue? But pci_vtnet_ping_txq() and pci_vtnet_ping_rxq() perform different work ...

The point here is that the virtio-net frontend does not support (yet) receive backpressure towards the backend. This means that if the backend receives packets faster than the guest can consume them,
packets are dropped (see https://svnweb.freebsd.org/base/head/usr.sbin/bhyve/pci_virtio_net.c?view=markup#l311), wasting CPU cycles. Adding backpressure would mean to disable the backend packet receive until the guest is ready to receive more.
Because receive backpressure is missing, receive kicks are kept always disabled. On device reset, kicks are actually enabled. When the first receive kick comes (pci_vtnet_ping_rxq), kicks are disabled (vq_kick_disable), and never enabled again. This explains why that pci_vtnet_ping_rxq is called only once for each reset.
Another consequence of the missing receive backpressure is that receive operation in the backend is enabled while the reset is in progress. In other words, pci_vtnet_tap_rx() can be called while the reset is in progress, and that's why we need the vsc_rx_ready variable. If we had receive backpressure, one could just disable backend receive operation while the reset is in progress.

I plan to implement receive backpressure (plus other stuff), but more preparatory patches are needed before getting to that point.

In D20609#445839, @v.maffione_gmail.com wrote:

Do you mean a common vq_notify callback for both transmit and receive queue? But pci_vtnet_ping_txq() and pci_vtnet_ping_rxq() perform different work ...

The point here is that the virtio-net frontend does not support (yet) receive backpressure towards the backend. This means that if the backend receives packets faster than the guest can consume them,
packets are dropped (see https://svnweb.freebsd.org/base/head/usr.sbin/bhyve/pci_virtio_net.c?view=markup#l311), wasting CPU cycles. Adding backpressure would mean to disable the backend packet receive until the guest is ready to receive more.
Because receive backpressure is missing, receive kicks are kept always disabled. On device reset, kicks are actually enabled. When the first receive kick comes (pci_vtnet_ping_rxq), kicks are disabled (vq_kick_disable), and never enabled again. This explains why that pci_vtnet_ping_rxq is called only once for each reset.
Another consequence of the missing receive backpressure is that receive operation in the backend is enabled while the reset is in progress. In other words, pci_vtnet_tap_rx() can be called while the reset is in progress, and that's why we need the vsc_rx_ready variable. If we had receive backpressure, one could just disable backend receive operation while the reset is in progress.

I plan to implement receive backpressure (plus other stuff), but more preparatory patches are needed before getting to that point.

Thanks for the explanation.

This revision is now accepted and ready to land.Jun 17 2019, 4:02 PM

Closed by commit rS349175: bhyve: vtnet: fix locking on receive (authored by vmaffione). · Explain WhyJun 18 2019, 5:51 PM

This revision was automatically updated to reflect the committed changes.

vmaffione added a commit: rS349175: bhyve: vtnet: fix locking on receive.

vmaffione added a commit: rS349696: MFC r349175.Jul 3 2019, 8:00 PM

vmaffione added a commit: rS349740: bhyve: vtnet: fix locking on receive.Jul 4 2019, 6:21 PM