Page MenuHomeFreeBSD

add a method to defer destruction of if_softc to if_destroy
AbandonedPublic

Authored by avg on Dec 3 2018, 1:23 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Jan 7, 3:02 AM
Unknown Object (File)
Oct 24 2024, 4:35 PM
Unknown Object (File)
Oct 3 2024, 10:50 PM
Unknown Object (File)
Sep 7 2024, 8:09 PM
Unknown Object (File)
Sep 7 2024, 4:12 PM
Unknown Object (File)
Aug 31 2024, 12:10 AM
Unknown Object (File)
Aug 18 2024, 10:33 AM
Unknown Object (File)
Jul 31 2024, 8:58 PM

Details

Summary

This change should allow to close a race between the code that frees
if_softc immediately after calling if_free(), such as
lagg_clone_destroy() for instance, and other accesses to the softc via a
reference to the interface that was obtained before the interface was
marked as dying.

The race is detailed in this bug report: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=234135

Test Plan

To do.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 21330
Build 20660: arc lint + arc unit

Event Timeline

bz requested changes to this revision.Dec 3 2018, 4:12 PM
bz added a subscriber: bz.

Sorry but my understanding is that this could possibly free the softc even before lagg_clone_destroy() has finished, couldn't it?

That main problem here is, I guess, the way the epoch(9)ification was done and not thought to the end. I'd rather go and see about the problem there in whatever calls if_destroy.

This revision now requires changes to proceed.Dec 3 2018, 4:12 PM
In D18420#392053, @bz wrote:

Sorry but my understanding is that this could possibly free the softc even before lagg_clone_destroy() has finished, couldn't it?

No, it cannot. There is always at least one interface reference held by the caller of lagg_clone_destroy().

That main problem here is, I guess, the way the epoch(9)ification was done and not thought to the end. I'd rather go and see about the problem there in whatever calls if_destroy.

I do not think that this issue is related to the epoch-ification. It certainly predates it. We ran into it using CURRENT from before the epoch-ification.
I attempted to provide a more detailed explanation of my understanding of the problem here: http://docs.freebsd.org/cgi/mid.cgi?b6d2139b-b6b1-4051-01bc-618565934556
Note that epoch is not mentioned at all there.

Can you provide an example, what another driver needs such method? Maybe it is enough to fix the problem locally in if_lagg for first time?

Looking at your description here, it seems if_bridge can be affected by this issue.

In D18420#392349, @ae wrote:

Can you provide an example, what another driver needs such method? Maybe it is enough to fix the problem locally in if_lagg for first time?

I think that if you grep for if_clone_simple in sys/net and pick any result, then most likely it would be affected.
For example, if_tap.c and SIOCGIFSTATUS. tapifioctl() accesses if_softc while tap_clone_destroy -> tap_destroy can free it.

I'm inclined to agree. For tunneling interfaces I used global per-driver sx lock to prevent concurrent ioctl invocation. I think these drivers are not affected by this problem.

In D18420#392363, @ae wrote:

I'm inclined to agree. For tunneling interfaces I used global per-driver sx lock to prevent concurrent ioctl invocation. I think these drivers are not affected by this problem.

Do you mean how, for example, gif_ioctl_sx is used in if_gif.c?
I agree that that solves the problem.
Maybe it's easier to go that way in other drivers too.

I think such method can be useful. Do you plan to merge it?

In D18420#392922, @ae wrote:

I think such method can be useful. Do you plan to merge it?

How would you merge it without breaking the ifnet KBI? We've got 3rd party network interfaces in ports, etc...

How would you merge it without breaking the ifnet KBI? We've got 3rd party network interfaces in ports, etc...

There is several spare fields, that can be used.

At work we decided to use the approach of if_gif / if_gre.
So, we don't really need this change.
But if people find it useful (e.g. @ae) and no one objects (looking at @bz), then I would like to commit this change and to merge if we have a spare field.