Page MenuHomeFreeBSD

Fix an LLE lookup race.
ClosedPublic

Authored by markj on Jan 20 2019, 5:36 PM.
Tags
None
Referenced Files
Unknown Object (File)
Thu, Dec 5, 2:51 AM
Unknown Object (File)
Sat, Nov 23, 7:59 PM
Unknown Object (File)
Nov 19 2024, 2:05 PM
Unknown Object (File)
Nov 18 2024, 2:01 AM
Unknown Object (File)
Nov 17 2024, 5:23 PM
Unknown Object (File)
Nov 11 2024, 2:36 PM
Unknown Object (File)
Nov 11 2024, 2:35 PM
Unknown Object (File)
Nov 11 2024, 1:55 AM
Subscribers

Details

Reviewers
bz
Group Reviewers
network
Commits
rS343363: Fix an LLE lookup race.
Summary

In r334118, uses of the afdata read lock were converted to epoch
sections. In the old locking protocol for LLEs, the afdata read lock
was used to synchronize lookups with lltable table updates, which were
and are performed with the afdata write lock held. Now, the afdata
write lock doesn't block readers from observing a "doomed" LLE that is
still linked into the lltable. Such a reader may block on the LLE lock
held by the thread that is expiring the LLE, so it effectively holds a
reference on the LLE without having updated its refcount. The thread
preparing to free the LLE will release the LLE write lock after having
scheduled the LLE for a deferred free with epoch_call(); the reader may
then schedule the LLE timer, resulting in a use-after-free when the
callout fires.

Fix the problem by checking for the LLE_LINKED flag after having
acquired the LLE lock during a lookup. If the LLE was unlinked while
the thread performing the lookup was blocked, then upon acquiring the
LLE lock it should just pretend that it never saw the LLE in the first
place. The use of epoch(9) ensures that the LLE won't be freed while
we're blocked on the LLE lock.

Note that LLE_UNLOCKED lookups are unaffected, since the caller cannot
acquire a new reference to the LLE.

Test Plan

A number of users running 12.0 have reported panics that I believe
are caused by this bug; they've been testing this patch and have not
yet observed any new panics.

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 22059
Build 21287: arc lint + arc unit

Event Timeline

markj added a reviewer: network.

Ping? Is anyone interested in reviewing this?

I am accepting it as it seems all correct. Given you said you want an EN for this as well, you might consider to split it up into two parts so that only the real functional fix would later be part of the EN.

sys/netinet/in.c
1392

While fixing the order of statements here, I can't see how this contributes to the solution of the problem. I think you should just commit that independently.

sys/netinet6/in6.c
2338

Same as above.

This revision is now accepted and ready to land.Jan 23 2019, 9:18 PM

Here's a copy of the errata template:

After commit and MFC please fill out as much as you can and email to re@ and so@ to move the errata along.

This revision was automatically updated to reflect the committed changes.