This review is a continuation of D22440.
This patch enforces the requirement that the RX callback cannot be called after a reset until the features have been negotiated.
Please take a look:
https://reviews.freebsd.org/D22440
https://svnweb.freebsd.org/base?view=revision&revision=354864
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=242023
I found an issue with bhyve + FreeBSD guest:
guest-freebsd# ifconfig vtnet0 down
vtnet: ndesc (3242) out of range, driver confused?
Assertion failed: (n >= 1 && riov_len + n <= VTNET_MAXSEGS), function pci_vtnet_rx, file /afedorov/freebsd-develop/usr.sbin/bhyve/pci_virtio_net.c, line 309.
Abort trap (core dumped)
and
Nov 25 20:25:40 af-12-1 syslogd: exiting on signal 15
Waiting (max 60 seconds) for system process `vnlru' to stop... done
Waiting (max 60 seconds) for system process `syncer' to stop...
Syncing disks, vnodes remaining... 2 1 0 done
Waiting (max 60 seconds) for system thread `bufdaemon' to stop... done
Waiting (max 60 seconds) for system thread `bufspacedaemon-0' to stop... done
Waiting (max 60 seconds) for system thread `bufspacedaemon-1' to stop... done
Waiting (max 60 seconds) for system thread `bufspacedaemon-2' to stop... done
Waiting (max 60 seconds) for system thread `bufspacedaemon-3' to stop... done
All buffers synced.
Uptime: 1h1m8s
vtnet: ndesc (1347) out of range, driver confused?
Assertion failed: (n >= 1 && riov_len + n <= VTNET_MAXSEGS), function pci_vtnet_rx, file /afedorov/freebsd-develop/usr.sbin/bhyve/pci_virtio_net.c, line 309.
Abort trap (core dumped)
root@q1u001:/afedorov/vm #
As you can see, bhyve crashes in two cases: when ifconfig vtnet0 down is executed from the guest, and when shutdown -p now is executed from the guest.
The main issue is a race condition where the receive callback is called during device reset.
From the bhyve side, the sequence looks like this:
pci_vtnet_reset():
    netbe_rx_disable(sc->vsc_be);
    vi_reset_dev(&sc->vsc_vs):
        vq->vq_last_avail = 0;
pci_vtnet_ping_rxq()
    netbe_rx_enable(sc->vsc_be);
pci_vtnet_rx():
    n = vq_getchain():
        idx = vq->vq_last_avail; /* Equal zero!!! */
        ndesc = (uint16_t)((u_int)vq->vq_avail->va_idx - idx);
        if (ndesc > vq->vq_qsize) return (-1)
    assert(n >= 1 && riov_len + n <= VTNET_MAXSEGS);
Revision 354864 introduced turning off RX on device reset. But this is not enough: after pci_vtnet_reset(), the guest can still trigger pci_vtnet_ping_rxq(), which re-enables RX.
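To make the failure in vq_getchain() concrete, here is a minimal standalone sketch of the index arithmetic from the trace above. It is not bhyve code: the helper names and the QSIZE value of 1024 are assumptions chosen for illustration, and 3242 is simply the ndesc value from the first crash log, assumed here to be the guest's avail index at the moment the host-side index was zeroed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ring size for illustration; not taken from the bhyve source. */
#define QSIZE 1024u

/*
 * Same arithmetic as the ndesc line in the trace: how many descriptors the
 * guest has published since the host last looked, computed modulo 2^16.
 */
static uint16_t
pending_descs(uint16_t avail_idx, uint16_t last_avail)
{
	return ((uint16_t)((unsigned int)avail_idx - last_avail));
}

/* True when the count is something a ring of QSIZE entries could hold. */
static bool
ndesc_in_range(uint16_t ndesc)
{
	return (ndesc <= QSIZE);
}
```

With a host index that tracks the guest, for example pending_descs(3242, 3240), the result is 2 and passes the range check. With the host index reset to zero while the guest index is still 3242, pending_descs(3242, 0) is 3242, the range check fails, and the caller sees the -1 that trips the assertion.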
I have not been able to reproduce this situation with a Linux guest, so it might be a bug in the FreeBSD guest driver. But since several releases already ship with this driver (11.4, 12.2, pfSense, etc.), I think it would be good to fix it in bhyve.
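The requirement stated at the top (no RX callback between a reset and feature negotiation) can be sketched as a simple gate. This is only a toy model, not the patch itself: struct toy_vtnet and every toy_* function are hypothetical names, and the locking that real code would presumably need around RX state is left out.

```c
#include <stdbool.h>

/* Toy stand-in for the device state; all names here are hypothetical. */
struct toy_vtnet {
	bool features_negotiated;	/* set once the guest finishes negotiation */
	bool rx_enabled;		/* backend RX callback is armed */
	int rx_calls;			/* how many times RX processing actually ran */
};

/* Device reset: disarm RX and forget the negotiated state. */
static void
toy_reset(struct toy_vtnet *sc)
{
	sc->rx_enabled = false;
	sc->features_negotiated = false;
}

/* Guest has completed feature negotiation. */
static void
toy_neg_features(struct toy_vtnet *sc)
{
	sc->features_negotiated = true;
}

/* RX queue kick: ignored if it arrives between reset and negotiation. */
static void
toy_ping_rxq(struct toy_vtnet *sc)
{
	if (!sc->features_negotiated)
		return;
	sc->rx_enabled = true;
}

/* Backend RX callback: only touches the ring when the gate is open. */
static void
toy_rx_callback(struct toy_vtnet *sc)
{
	if (!sc->rx_enabled || !sc->features_negotiated)
		return;
	sc->rx_calls++;
}
```

In this model, a kick that arrives after toy_reset() but before toy_neg_features() leaves RX disabled, so the stale-index path from the trace is never reached.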