Page MenuHomeFreeBSD

Make sure the multicast release tasks are properly drained when destroying VNET and IFNET
ClosedPublic

Authored by hselasky on May 19 2020, 10:33 AM.

Details

Summary

Make sure the multicast release tasks are properly drained when
destroying a VNET or a network interface.

Else the inm release tasks, both IPv4 and IPv6 may cause a panic
accessing a freed VNET or network interface.

MFC after: 1 week
Sponsored by: Mellanox Technologies

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint Skipped
Unit
Unit Tests Skipped
Build Status
Buildable 31175

Event Timeline

sys/netinet/in_mcast.c
254

Is this ordered 'enough' to ensure that the drain happens after all other stuff ?

sys/netinet6/in6_mcast.c
523

This is an unrelated fix, right ?

sys/netinet/in_mcast.c
254

Testing is ongoing. From what I can see this is good enough.

sys/netinet6/in6_mcast.c
523

Yes, minor nit while at it. SYSINITs should take a pointer argument.

I had left comments on the mailing list discussions; I'll just add pointers here.

https://lists.freebsd.org/pipermail/freebsd-net/2020-July/056460.html
https://lists.freebsd.org/pipermail/freebsd-net/2020-July/056478.html

It would help a lot to understand why the teardown work has to be deferred on vnet shutdown to understand why we have to add an extra task on top to block and wait for it to finish rather than just rip it down?

@bz: I didn't see those comments yet. Thanks for the pointer.

@bz wrote:

Wow, did I miss that back then? Did I review a change and not notice?
Sorry if that was the case.

Vnet teardown is blocking and forceful.
Doing deferred cleanup work isn’t a good idea at all.
I think that is the real problem here.

I’d rather have us fix this than putting more bandaids into the code.

@bz wrote:

That’ll probably work; still, the deferred teardown work seems wrong
to me; I haven’t investigated; the patch kind-of says exactly that
as well: if “wait until deferred stuff is done” is all we are doing,
why can we not do it on the spot then?

hselasky marked an inline comment as done.

Not really; the change (once you remove all the noise of renaming things which all should have been committed separately) doesn't justify the taskqueue anywhere; the only (extra!) locks are for the free list management; apart from that there seem to be no locking implications on freeing at the end, so that could always be done after unlock in inline if I am not wrong but it is hard to tell now about 30 commits later.

Your chnage seems to be the right thing for the current state of code; the current state of code however I believe is wrong. Go ahead and fix the immediate panic but could you do me a favour and leave an "XXX-BZ FIXME, see D24914" somewhere next to the two "VNET_SYSUNINIT" introductions; or rather, let me know once you commit this and I'll add them myself; probably cleaner that way.

This revision was not accepted when it landed; it landed in state Needs Review.Aug 10 2020, 10:46 AM
This revision was automatically updated to reflect the committed changes.

Thanks for committing this. I've run the patch for a while, and it solves my panic, and didn't introduce any others.