Page MenuHomeFreeBSD

Convert all IPv4 and IPv6 multicast memberships into using a STAILQ() instead of an array
ClosedPublic

Authored by hselasky on Apr 27 2019, 3:39 PM.

Details

Summary

Using an array to keep track of multicast memberships introduce multiple races if the APIs are exercised in parallell. Moving state structures around in memory when new members are removed makes debugging more difficult. By using a STAILQ() the memory location of the IPv4 and IPv6 multicast filters become fixed and use after free issues will be more easy to track.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

@pho : Please re-test this patch with D20109 applied at the same time as this one!

hselasky updated this revision to Diff 56996.May 3 2019, 9:31 AM

Address assert found by @pho by applying proper lock.

@pho : Please re-test this patch with D20109 applied at the same time as this one!

pho added a comment.May 3 2019, 11:18 AM

I'm a bit low on free test H/W. I'll see what I can do tomorrow.

@pho: OK , just make sure you grab the latest D20080 and D20109 !

For those that review: The IPv4 changes are almost identical to the IPv6 changes! Only different structures with different name.

pho added a comment.May 3 2019, 5:36 PM

I ran the two multicast tests I have repeatedly with D20080.56996.diff and D20109.56998.diff.
No problems seen.

markj added inline comments.May 24 2019, 7:10 PM
sys/netinet/in_mcast.c
343 ↗(On Diff #56996)

Extre newline.

sys/netinet/in_var.h
270 ↗(On Diff #56762)

Indeed, I can't see why one of the _FOREACH macros cannot be used.

sys/netinet6/in6_mcast.c
2299 ↗(On Diff #56996)

Is this supposed to be IN6_MULTI_LOCK()? Note that we lock it further below.

sys/netpfil/pf/if_pfsync.c
2382 ↗(On Diff #56996)

We should probably assert that imo_head is empty following the removal?

2404 ↗(On Diff #56996)

We should wrap this in a helper routine.

hselasky added inline comments.May 24 2019, 10:45 PM
sys/netinet6/in6_mcast.c
2299 ↗(On Diff #56996)

Yes, you're right. I'll update the patch tomorrow.

hselasky marked 4 inline comments as done.May 25 2019, 9:40 AM
hselasky added inline comments.
sys/netinet/in_var.h
270 ↗(On Diff #56762)

Fixed in next patch revision.

sys/netpfil/pf/if_pfsync.c
2382 ↗(On Diff #56996)

We might just loop here. I'll fix in next patch revision.

2404 ↗(On Diff #56996)

Will be fixed in next patch version.

hselasky updated this revision to Diff 57886.May 25 2019, 9:41 AM
hselasky marked 3 inline comments as done.

Address comments from @markj .

hselasky updated this revision to Diff 57887.May 25 2019, 9:44 AM

Address comment from @markj .

markj added inline comments.May 30 2019, 3:40 PM
sys/netinet/in_mcast.c
2426 ↗(On Diff #57887)

Might as well switch it to bool and write !is_final.

sys/netinet/in_var.h
242 ↗(On Diff #57887)

Why did you make this a typedef?

248 ↗(On Diff #57887)

I can't see why all of this stuff is here and not in ip_var.h. There are some consumers that do not include ip_var.h, but they can be fixed. At least, there should be a comment here or in in_var.h explaining the relationship.

sys/netpfil/pf/if_pfsync.c
1397 ↗(On Diff #57887)

Missing space after the last comma.

markj added inline comments.Jun 3 2019, 3:55 PM
sys/netinet/in_mcast.c
1596 ↗(On Diff #57887)

There is a LOR here. We hold the non-sleepable inp lock at this point.

hselasky added inline comments.Jun 4 2019, 7:04 AM
sys/netinet/in_mcast.c
98 ↗(On Diff #57887)

@markj: The locking order is described here. Can you show a witness warning with the LOR?

markj added inline comments.Jun 4 2019, 2:45 PM
sys/netinet/in_mcast.c
98 ↗(On Diff #57887)

I haven't tried to reproduce it yet. In the code I referenced, we acquire the inp wlock, followed by the in_multi lock.

@markj: I will have a look at your comments and respond with a new patch. I'm currently a bit busy.

sys/netinet/in_mcast.c
1596 ↗(On Diff #57887)

I can probably move this locking a bit earlier to resolve that.

hselasky marked 4 inline comments as done.Jun 6 2019, 9:47 AM
hselasky added inline comments.
sys/netinet/in_var.h
242 ↗(On Diff #57887)

To make future changes easier, like changing the queue type.

hselasky updated this revision to Diff 58299.Jun 6 2019, 9:51 AM

Address comments from @markj .

hselasky updated this revision to Diff 58300.Jun 6 2019, 9:57 AM
hselasky marked an inline comment as done.

Fix some style nits pointed out by @markj .

hselasky added inline comments.Jun 6 2019, 10:02 AM
sys/netinet/in_var.h
248 ↗(On Diff #57887)

Can you explain a bit more? I believe I've already tried this and that is the reason for the #ifdef _NETINET_IN_VAR_H_ I've added. The structures must be defined in the correct order, because not always we use pointers. Else the order wouldn't matter.

hselasky updated this revision to Diff 58308.Jun 6 2019, 2:23 PM

Fix for buildworld.

markj added inline comments.Jun 6 2019, 2:51 PM
sys/netinet/in_var.h
242 ↗(On Diff #57887)

Just write STAILQ_HEAD(in_mfilter_head, in_mfilter). Then you get a struct in_mfilter_head and you can change the queue type easily.

hselasky updated this revision to Diff 58312.Jun 6 2019, 3:17 PM
hselasky marked an inline comment as done.

Update as per @markj 's comment.

markj added inline comments.Jun 17 2019, 5:45 PM
sys/netinet/in_var.h
248 ↗(On Diff #57887)

Doesn't this suggest that we should not expose the mfilter list accessors directly but instead define them in a C file as functions on a ip_moptions structure?

244 ↗(On Diff #58312)

Style: extra space after '*'.

245 ↗(On Diff #58312)

The naming is kind of weird IMO. Usually we would call these ip_mfilter_alloc(), ip_mfilter_free(), IP_MFILTER_FOREACH, etc..

sys/netinet/ip_carp.c
1476 ↗(On Diff #58312)

This line is too long, ditto below.

sys/netinet6/in6_var.h
649 ↗(On Diff #58312)

I think it would be a bit simpler to just store a count of the filters in the moptions structure.

sys/netpfil/pf/if_pfsync.c
2351 ↗(On Diff #58312)

This line is too long.

hselasky marked 6 inline comments as done.Jun 18 2019, 7:26 AM

Address comments from @markj .

sys/netinet/in_var.h
248 ↗(On Diff #57887)

Can we leave this AS-IS for now, hence it is most compatible with the old way of doing things. And then later we can move things around which also will produce more diffs?

sys/netinet6/in6_var.h
649 ↗(On Diff #58312)

The STAILQ() is already traversed several times, so there is no real benefit of keeping a separate counter. Can this change be made separately later on?

sys/netpfil/pf/if_pfsync.c
2351 ↗(On Diff #58312)

Fixed.

hselasky updated this revision to Diff 58759.Jun 18 2019, 7:27 AM
hselasky marked 3 inline comments as done.

Address comments from @markj .

hselasky marked an inline comment as done.Jun 18 2019, 7:30 AM
markj accepted this revision.Jun 19 2019, 6:14 PM

I have no objection to the latest diff, but IMO the conversion from arrays to an STAILQ should be done in a way that does not require conditional definition of struct ip_moptions in ip_var.h.

sys/netinet/in_var.h
248 ↗(On Diff #57887)

I have no objection to the current diff aside from the one nit, but I would have found this diff easier to review if the conversion from an array to STAILQ was done in a separate diff.

sys/netinet6/in6_var.h
645 ↗(On Diff #58759)

Should be IP6_MFILTER_FOREACH.

This revision is now accepted and ready to land.Jun 19 2019, 6:14 PM
hselasky marked 2 inline comments as done.Jun 19 2019, 8:39 PM
hselasky added inline comments.
sys/netinet/in_var.h
248 ↗(On Diff #57887)

OK

hselasky updated this revision to Diff 58806.Jun 19 2019, 8:41 PM
hselasky marked an inline comment as done.

Fix naming of multicast foreach.

This revision now requires review to proceed.Jun 19 2019, 8:41 PM

Please ACK. Patch will be submitted tomorrow after some additional build testing.

@markj: Thank you for spending time reviewing this patch.

markj added inline comments.Jun 19 2019, 9:40 PM
sys/netinet/ip_var.h
98 ↗(On Diff #58806)

I meant to note earlier that imo_epoch_ctx appears to be unused.

hselasky updated this revision to Diff 58824.Jun 20 2019, 5:16 AM

Remove unused epoch contexts as suggested by @markj .

markj accepted this revision.Jun 21 2019, 4:46 PM
This revision is now accepted and ready to land.Jun 21 2019, 4:46 PM