Page MenuHomeFreeBSD

netmap: fix lock order reversal related to kqueue usage
AbandonedPublic

Authored by vmaffione on Jan 25 2019, 9:32 AM.

Details

Summary

When using poll(), select() or kevent() on netmap file descriptors, netmap executes the equivalent of NIOCTXSYNC and NIOCRXSYNC commands, before collecting the events that are ready. In other words, the poll/kevent callback has side effects. This is done to avoid the overhead of two system call per iteration (e.g., poll() + ioctl(NIOC*XSYNC)).

When the kqueue subsystem invokes the kqueue(9) f_event callback (netmap_knrw), it holds the lock of the
struct knlist object associated to the netmap port (the lock is provided at initialization, by calling knlist_init_mtx).
However, netmap_knrw() may need to wake up another netmap port (or even the same one), which means that it may need to call knote(). Since knote() needs the lock of the struct knlist object associated to the to-be-wake-up netmap
port, it is possible to have a lock order reversal problem (AB/BA deadlock).

This change prevents the deadlock by introducing a global lock for the struct knlist object of all the netmap ports.

Test Plan

Test bhyve using VALE with kqueue(2).
See other tests discussed here https://reviews.freebsd.org/D18876

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint Skipped
Unit
Unit Tests Skipped

Event Timeline

vmaffione created this revision.Jan 25 2019, 9:32 AM

I conducted the following test. Simultaneous start:

  • 8 (RX) netmap_test processes (D18876) on different vale (0 - 7) ports.
  • 8 (TX) pkt-gen processes on different vale (0 - 7) ports

And there was no deadlock or warning message from witness (4).

As for reproducibility of the tests, the deadlock is easily reproduced on the Xeon E5-2630 v4 with a non-patched kernel, but it is very difficult on a weak hardware like the AMD A6 APU.

Thanks for testing. So this seems to work.
However, introducing a global lock is not the best approach for performance reasons. I tried to came up with a different approach that also allows us to get rid of the mtx_owned() call.
See https://reviews.freebsd.org/D18956