Page MenuHomeFreeBSD

Check for Rx ring state to prevent from stall in the ENA driver
ClosedPublic

Authored by mw on Oct 31 2017, 1:33 PM.
Tags
None
Referenced Files
Unknown Object (File)
Oct 22 2024, 5:07 PM
Unknown Object (File)
Oct 22 2024, 5:06 PM
Unknown Object (File)
Oct 22 2024, 5:06 PM
Unknown Object (File)
Oct 22 2024, 5:06 PM
Unknown Object (File)
Oct 22 2024, 4:36 PM
Unknown Object (File)
Oct 4 2024, 2:21 PM
Unknown Object (File)
Oct 4 2024, 10:31 AM
Unknown Object (File)
Oct 3 2024, 2:28 PM

Details

Summary

In case when Rx ring is full and driver will fail to allocate Rx mbufs,
the ring could be stalled.

Keep alive is checking every second for Rx ring state, and if it is full
for two cycles, then trigger rx_cleanup routine in another thread.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

sys/dev/ena/ena.c
876 ↗(On Diff #34519)

Suggest making this a M_WAITOK flag and remove lines 879-884, 899-900. This will keep it consistent with the earlier malloc calls in this routine.

930 ↗(On Diff #34519)

nit. Make this
while (taskqueue_cancel(rx_ring->cmpl_tq, &rx_ring->cmpl_task, NULL) != 0)
to match the style in this file.

3470 ↗(On Diff #34519)

Is 2 sec picked because this value must be < DEFAULT_KEEP_ALIVE_TO which 6 sec?

mk_semihalf.com marked 2 inline comments as done.
  • Change M_NOWAIT to M_WAITOK on taskqueue_create_fast
  • Style fix
sys/dev/ena/ena.c
3470 ↗(On Diff #34519)

It is not unit of time, this is indicating how many times in the row we must detect that rx_ring hadn't available descriptors before we will trigger deferred rx cleanup (in another thread, because it is highly possible that interrupts are stalled and that's why the ring wasn't cleaned up). Please take a look at the function check_for_empty_rx_ring() and how we are using it there.

byenduri_gmail.com added inline comments.
sys/dev/ena/ena.c
3470 ↗(On Diff #34519)

OK. I see check_for_empty_rx_ring() is called from the ena_timer_service() routine which is scheduled every second. Hence we will wait a minimum of 2 seconds before triggering deferred cleanup. I was curious as to why 2 seconds and not some other number (say 30 seconds).

This revision is now accepted and ready to land.Nov 8 2017, 5:43 PM
mw edited reviewers, added: mk_semihalf.com; removed: mw.
This revision was automatically updated to reflect the committed changes.