Page MenuHomeFreeBSD

iflib: Fix handling of mbuf cluster allocation failures.
ClosedPublic

Authored by markj on Jun 27 2020, 3:36 PM.

Details

Summary


- When refilling an rx freelist, make sure we only update the hw
producer index if at least one cluster was allocated. Also make sure
that we don't update the fragment index cursor if the last allocation
attempt didn't succeed. For Intel drivers, iflib basically assumes
that the consumer index and fragment index cursor stay in lock step,
but this assumption was getting violated, resulting in use-after-frees
and NULL pointer dereferences.

Test Plan

Peter Holm was reporting occasional mbuf cluster use-after-frees
that were tracked back to these bugs. He verified that they are no
longer reproducible with this patch.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

markj requested review of this revision.Jun 27 2020, 3:36 PM
markj edited the test plan for this revision. (Show Details)

Don't attempt to handle m_gethdr() failures. Once clusters are placed
in the ring, we don't need to update the bitmap right away, so there's
no harm in simply leaving it there.

I added Hans, as he's also got an attempt to fix the same problem (https://reviews.freebsd.org/D24236)

sys/net/iflib.c
1987 ↗(On Diff #73817)

while you're adding __predict_false(), this seems like a good place for one..

This revision is now accepted and ready to land.Jun 30 2020, 5:01 PM

Hi,

I think a custom test to provoke the failure and make sure it is really gone is required. I simply added a sysctl which triggered allocation failures at a random point in the loop you are patching.

--HPS

Hi,

I think a custom test to provoke the failure and make sure it is really gone is required. I simply added a sysctl which triggered allocation failures at a random point in the loop you are patching.

--HPS

@pho has been able to trigger this bug reliably with stress2 since we discovered that it was triggered by cluster allocation failures. I used fail(9) when working on the patch but do not want to leave it in since this is a performance-sensitive path.

Sprinkle more __predict_false().

This revision now requires review to proceed.Jul 1 2020, 2:02 PM
markj marked an inline comment as done.Jul 1 2020, 2:02 PM

Any objection to committing this? It has been tested extensively by Peter.

No objections from me. I'll give it a spin at Mellanox once it hits the tree.

This revision is now accepted and ready to land.Jul 6 2020, 2:22 PM