Page MenuHomeFreeBSD

Fix a number of issues with in6 ifaddr list locking.
AcceptedPublic

Authored by markj on Apr 8 2015, 5:18 AM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Apr 6, 8:44 AM
Unknown Object (File)
Jan 27 2024, 7:25 AM
Unknown Object (File)
Jan 23 2024, 1:31 AM
Unknown Object (File)
Dec 20 2023, 12:24 AM
Unknown Object (File)
Nov 16 2023, 11:49 AM
Unknown Object (File)
Nov 16 2023, 8:09 AM
Unknown Object (File)
Nov 14 2023, 8:01 AM
Unknown Object (File)
Nov 12 2023, 8:07 AM

Details

Summary

There are a few places where we iterate over the IPv6 address list
without taking the address list lock, I assume because of some
difficulties in doing so without causing lock recursion. As a result,
it's trivial to trigger a kernel panic by adding or removing IPv6
addresses concurrently. This change attempts to address the common
instances of this problem.

The main change is to nd6_timer(), which iterates over the address list
looking for expired addresses. It calls in6_purgeaddr() on such
addresses, which requires taking the address list write lock to remove
the address' entry. It may also call regen_tmpaddr() to generate a new
temporary address to replace an expired address, which also requires
taking the write lock.

A straightforward solution is to just take the write lock around the
loop and adjust in6_purgeaddr and regen_tmpaddr to avoid recursion,
perhaps by adding _locked variants. This gets somewhat hairy though, and
introduces the potential for lock contention: nd6_timer() is called once
a second by default, and the vast majority of invocations likely won't
need the write lock. Moreover, the read lock is taken in some forwarding
paths (via in6_localip()). My solution involves two changes: add a
"to-purge" list to nd6_timer(), and have the loop insert expired
addresses into it. Once we're done scanning the whole list, call
in6_purgeaddr() on each address in the to-purge list, so that we don't
take the write lock more than necessary. This requires adding a field to
in6_ifaddr; to compensate, I removed the unused ia_plen (which was
taking up 8 bytes on amd64 because of padding). The second part of this
change modifies regen_tmpaddr(): on entry, it now expects the read lock
to be held and drops it iff a new address is added, in which case it
returns unlocked. This asymmetry is a bit ugly, but regen_tmpaddr() is
static and only called from nd6_timer(), so I think it's a reasonable
compromise.

I also modified in6_unlink_ifa to fix a double free in the case that two
threads race to remove the same address: once we've acquired the in6 addr
write lock, check to see if the address has already been removed and
return early if so.

Test Plan

To trigger races, I wrote a couple of scripts that add and remove IPv6 addresses in a loop, and run several such loops concurrently. With the changes in D2253-2256, I'm no longer able to trigger panics. I also tested with temporary addresses enabled, and configured rtadvd instances on another system to advertise prefixes with very short lifetimes, so that address expiration code is exercised. I'm still working on some further tests.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
No Lint Coverage
Unit
No Test Coverage

Event Timeline

markj retitled this revision from to Fix a number of issues with in6 ifaddr list locking..
markj edited the test plan for this revision. (Show Details)
markj updated this object.
markj edited the test plan for this revision. (Show Details)
markj added reviewers: ae, hrs, bz, rwatson.
sys/netinet6/in6.c
1356

This comment now should be below. Also, maybe you should mention about the race that this loop solves?

sys/netinet6/nd6.c
638–639

This can be replaced with TAILQ_FOREACH()

sys/netinet6/nd6_rtr.c
1501 ↗(On Diff #4737)

Will not add this new LOR? nd6_dad_start() can take wlock while we hold ifaddr_rlock.

1516 ↗(On Diff #4737)

The same problem.

  • Don't use the _SAFE loop variant: addresses are purged after the loop.
  • Add a comment explaining the re-lookup.
sys/netinet6/nd6_rtr.c
1501 ↗(On Diff #4737)

Right, with the change in D2256, we would hold the DAD wlock while calling nd6_ns_output(), which can potentially result in the ifaddr rlock being reacquired.

I don't think this is easy to fix without a change like the one proposed in D903, where the neighbour solicitations for DAD would be queued up and transmitted asynchronously. For now I'm inclined to remove this part of the change, as it's mostly unrelated to the rest. Do you have any thoughts on this?

I'll try to return to patch in D903 soon. It looks like we need it not to increase nd6 code complexity..

sys/netinet6/nd6.c
666–670

I wonder why it is safe to modify ia6_expiring while holding read lock on IN6_IFADDR.
Can you please add some more comments here?

markj edited edge metadata.
  • Explain why it's safe to modify ia6_expiring in nd6_timer().
  • Revert changes to pfxlist_onlink_check().
In D2254#11, @melifaro wrote:

I'll try to return to patch in D903 soon. It looks like we need it not to increase nd6 code complexity..

Thanks! I'm happy to help test it if you like.

Any feedback on the rest of the diff? As was pointed out, we'll need some machinery to avoid LORs when calling nd6_dad_start(), but the rest of the diff is more straightforward.

gnn added a reviewer: gnn.
This revision is now accepted and ready to land.Dec 30 2015, 8:58 PM

I don't think you caught all cases. Let me check a working tree as there was an IPv4 related race and a net/if.c related one on the address list as well at one point.

Also I want to double check that you cannot run into recursive locks currently.

In short: I think this is a good start but not the full fix.

sys/netinet6/nd6.c
119

unrelated noise, can we do cleanups in a separate commit?

In D2254#100542, @bz wrote:

I don't think you caught all cases. Let me check a working tree as there was an IPv4 related race and a net/if.c related one on the address list as well at one point.

Oh, that's definitely true. This was an attempt to fix some of the panics I hit with a couple of trivial repro scripts to add and remove addresses. It's not complete by any means.

Also I want to double check that you cannot run into recursive locks currently.

In short: I think this is a good start but not the full fix.

Mark:

Want me to commandeer and commit this? Our testing is going very positive at work.

Mark:

Want me to commandeer and commit this? Our testing is going very positive at work.

How about I commit this in one week if no one objects? Again, I realize that this patch is incomplete, but it's relatively simple and addresses some nasty use-after-free scenarios.

I'd commit it now to be fair. We've been testing this in prod and are very happy with it.

sys/netinet6/in6_var.h
139

Do we actually need these struct changes or would a function local list on a struct just holding in6_ifaddr *s work as well given it's a single location? That would also remove any locking issue confusions.

sys/netinet6/nd6.c
803

Is this a PB display problem or is indentation not right here?

sys/netinet6/in6_var.h
139

Yup, that could be done. The only issue is that we'd have to malloc(M_NOWAIT) entries for such a list. If the malloc() failures, I suppose we could just skip the entry, since we'll visit it again on the next scan one second later. Does this seem reasonable?

sys/netinet6/nd6.c
119

Sure.

803

I've always thought that BSD style is to de-indent long string literals instead of breaking them across multiple lines. This way, one can reliably grep for it.