The vsc_rx_ready and the RX virtqueue is protected by the rx_mtx lock.
However, pci_vtnet_ping_rxq() (currently called only once after each device reset)
accesses those without acquiring the lock.
Details
- Reviewers
bryanv jhb markj - Group Reviewers
bhyve - Commits
- rS349740: bhyve: vtnet: fix locking on receive
rS349696: MFC r349175
rS349175: bhyve: vtnet: fix locking on receive
Diff Detail
- Repository
- rS FreeBSD src repository - subversion
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
It's not obvious to me why queue notify can't be a common operation, just looking at callers of virtqueue_notify(). Why do you say that it only happens on reset?
Did you consider doing a double check, i.e.,
if (sc->vsc_rx_ready == 0) pthread_mutex_lock(); if (sc->vsc_rx_ready == 0) { ... } pthread_mutex_unlock(); }
?
Do you mean a common vq_notify callback for both transmit and receive queue? But pci_vtnet_ping_txq() and pci_vtnet_ping_rxq() perform different work ...
The point here is that the virtio-net frontend does not support (yet) receive backpressure towards the backend. This means that if the backend receives packets faster than the guest can consume them,
packets are dropped (see https://svnweb.freebsd.org/base/head/usr.sbin/bhyve/pci_virtio_net.c?view=markup#l311), wasting CPU cycles. Adding backpressure would mean to disable the backend packet receive until the guest is ready to receive more.
Because receive backpressure is missing, receive kicks are kept always disabled. On device reset, kicks are actually enabled. When the first receive kick comes (pci_vtnet_ping_rxq), kicks are disabled (vq_kick_disable), and never enabled again. This explains why that pci_vtnet_ping_rxq is called only once for each reset.
Another consequence of the missing receive backpressure is that receive operation in the backend is enabled while the reset is in progress. In other words, pci_vtnet_tap_rx() can be called while the reset is in progress, and that's why we need the vsc_rx_ready variable. If we had receive backpressure, one could just disable backend receive operation while the reset is in progress.
I plan to implement receive backpressure (plus other stuff), but more preparatory patches are needed before getting to that point.