Fix panic in iflib when out of mbuf clusters
Abandoned
Public

Authored by hselasky on Mar 31 2020, 12:38 PM.

Details

Reviewers
erj
gallatin
shurd
Group Reviewers
iflib
Summary

Fix panic in iflib when out of mbuf clusters by not flushing the RX descriptor in that case.

Backtrace:
panic: Assertion *sd->ifsd_cl != NULL failed at /usr/src/sys/net/iflib.c:2632
iflib_rxeof()
_task_fn_rx()
gtaskqueue_run_locked()
gtaskqueue_thread_loop()
fork_exit()

Sponsored by: Mellanox Technologies

Test Plan

Inserted probe in code to force error path.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 30205

Event Timeline

Picture a case where you're asked to refill 128 slots, and fail after 127. Don't you want to at least flush the 127 that you were able to allocate?

E.g., do the work that is currently at the bottom of the loop, in the (n == 0 || i == IFLIB_MAX_RX_REFRESH) branch, whenever you've managed any allocations at all.
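To make the suggestion above concrete, here is a minimal user-space sketch of a refill loop that still publishes the slots it managed to fill when the allocator runs dry. It is not the iflib code: struct ring, fake_alloc(), ring_flush(), ring_refill(), RING_SIZE and the MAX_REFRESH value are all hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SIZE   256   /* hypothetical ring size */
#define MAX_REFRESH 32    /* assumed stand-in for IFLIB_MAX_RX_REFRESH */

struct ring {
    void *cl[RING_SIZE];  /* cluster pointer per slot, NULL if empty */
    int   pidx;           /* next slot software will fill */
    int   hw_pidx;        /* last producer index published to "hardware" */
    int   flushes;        /* number of flush calls, for illustration */
};

static int  budget;            /* allocations left before failure */
static char pool[RING_SIZE];   /* backing storage for fake clusters */

/* Hypothetical allocator: fails once the budget is used up. */
static void *
fake_alloc(int slot)
{
    if (budget == 0)
        return (NULL);
    budget--;
    return (&pool[slot]);
}

/* Publish everything filled so far, never anything beyond r->pidx. */
static void
ring_flush(struct ring *r)
{
    r->hw_pidx = r->pidx;
    r->flushes++;
}

/* Try to fill 'count' slots; return how many were actually filled. */
static int
ring_refill(struct ring *r, int count)
{
    int filled = 0, pending = 0;

    while (filled < count) {
        void *cl = fake_alloc(r->pidx);

        if (cl == NULL)
            break;                      /* out of clusters */
        assert(r->cl[r->pidx] == NULL); /* slot must be empty */
        r->cl[r->pidx] = cl;
        r->pidx = (r->pidx + 1) % RING_SIZE;
        filled++;
        if (++pending == MAX_REFRESH) { /* batch is full */
            ring_flush(r);
            pending = 0;
        }
    }
    if (pending > 0)                    /* partial batch on failure */
        ring_flush(r);
    return (filled);
}
```

In this sketch, a request for 128 slots with only 127 clusters available ends with 127 slots published and the 128th left empty and unpublished. Whether the real refill path can do the same safely is exactly what the comments below question.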

For a reason I don't fully understand, if we do exactly that, we end up with a panic in rxeof, because the cluster pointer is NULL. I suspect something goes wrong with the ifl_pidx logic.

sys/net/iflib.c:2600

This is the assertion we hit! Can you explain why?

sys/net/iflib.c:2600

I wonder if it is a bug in the lower-level driver, and it is giving the hardware ring entries beyond pidx? Aside from that, I don't know.
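The hypothesis above amounts to an invariant: every slot the hardware has been told about must hold a cluster. The sketch below checks that invariant on a toy ring. It is only an illustration of the idea, not a diagnosis of the panic, and published_slots_valid() and its parameters are hypothetical names that do not exist in iflib.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if every slot in [cidx, hw_pidx), walking forward
 * modulo 'size', has a non-NULL cluster pointer. An empty range
 * (cidx == hw_pidx) is trivially valid.
 */
static bool
published_slots_valid(void *const *cl, int size, int cidx, int hw_pidx)
{
    assert(size > 0);
    assert(cidx >= 0 && cidx < size);
    assert(hw_pidx >= 0 && hw_pidx < size);

    for (int i = cidx; i != hw_pidx; i = (i + 1) % size) {
        if (cl[i] == NULL)
            return (false);   /* published slot without a cluster */
    }
    return (true);
}
```

If the producer index handed to the hardware ever runs ahead of the last filled slot, this check fails on the first empty slot, which is the same condition the assertion in iflib_rxeof() trips on.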

This happens when we feed mbuf clusters only one at a time. It is easy to reproduce. What happens if pidx wraps too fast?