When adding a new default router to an nd_prefix, it is possible
that pfxrtr_add() will be called just after another thread has
called nd6_prefix_unlink() in preparation for freeing it. Adding
the router at this point will lead to a reference on the router
structure being lost (or a KASSERT firing with INVARIANTS).
Prevent this by marking nd_prefixes that are about to be freed
with a dying flag and not adding routers to a dying prefix.
Details
- Reviewers
markj
Diff Detail
- Lint
Lint Passed - Unit
No Test Coverage - Build Status
Buildable 27283 Build 25544: arc lint + arc unit
Event Timeline
I think this is fine. I spent some time trying to see if we could change nd6_prefix_unlink() to also remove the list of advertising routers, since that would eliminate the window in which the race can occur. That approach looks more difficult though.
Do you happen to have a test case for this? We don't have any ND6 regression tests at the moment. :(
sys/netinet6/nd6.h | ||
---|---|---|
219 | Feel free to keep _DYING if you prefer it, but I think NDPRF_UNLINKED is a more descriptive name. (Or you could invert the sense of the flag and call it NDPRF_LINKED.) |
Unfortunately I don't. Somebody at work had a VM that was reproducing it frequently, and we didn't really understand why. It may have been due to a scheduling/overload problem on the host system, but that's just speculation.
sys/netinet6/nd6.h | ||
---|---|---|
219 | do you think that there's a risk that ONLINK and UNLINKED are too similar? I don't like the idea of the LINKED flag because then I have to analyze the lifetime of these objects to ensure it's always set on creation. |
sys/netinet6/nd6.h | ||
---|---|---|
219 | Yeah, UNLINKED isn't ideal either. For LINKED, you really only need to look at places where the prefix is added to the prefix list, and it's at that point that you'd set the flag. I don't think a lot of analysis is really needed there. Anyway, my previous comments were just suggestions. I think the current diff is reasonable. |
sys/netinet6/nd6_rtr.c | ||
---|---|---|
1140 | The logic here is backwards. It should be || ... & DYING != 0 i.e. if this prefix is about to be deleted, skip making modifications. |