Page MenuHomeFreeBSD

add a method to defer destruction of if_softc to if_destroy
Needs RevisionPublic

Authored by avg on Mon, Dec 3, 1:23 PM.

Details

Summary

This change should allow to close a race between the code that frees
if_softc immediately after calling if_free(), such as
lagg_clone_destroy() for instance, and other accesses to the softc via a
reference to the interface that was obtained before the interface was
marked as dying.

Test Plan

To do.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 21330
Build 20660: arc lint + arc unit

Event Timeline

avg created this revision.Mon, Dec 3, 1:23 PM
bz requested changes to this revision.Mon, Dec 3, 4:12 PM
bz added a subscriber: bz.

Sorry but my understanding is that this could possibly free the softc even before lagg_clone_destroy() has finished, couldn't it?

That main problem here is, I guess, the way the epoch(9)ification was done and not thought to the end. I'd rather go and see about the problem there in whatever calls if_destroy.

This revision now requires changes to proceed.Mon, Dec 3, 4:12 PM
avg added a comment.Mon, Dec 3, 4:33 PM
In D18420#392053, @bz wrote:

Sorry but my understanding is that this could possibly free the softc even before lagg_clone_destroy() has finished, couldn't it?

No, it cannot. There is always at least one interface reference held by the caller of lagg_clone_destroy().

That main problem here is, I guess, the way the epoch(9)ification was done and not thought to the end. I'd rather go and see about the problem there in whatever calls if_destroy.

I do not think that this issue is related to the epoch-ification. It certainly predates it. We ran into it using CURRENT from before the epoch-ification.
I attempted to provide a more detailed explanation of my understanding of the problem here: https://docs.freebsd.org/cgi/getmsg.cgi?fetch=40076+0+current/freebsd-net
Note that epoch is not mentioned at all there.

ae added a comment.Tue, Dec 4, 10:47 AM

Can you provide an example, what another driver needs such method? Maybe it is enough to fix the problem locally in if_lagg for first time?

ae added a comment.Tue, Dec 4, 10:56 AM

Looking at your description here, it seems if_bridge can be affected by this issue.

avg added a comment.Tue, Dec 4, 11:03 AM
In D18420#392349, @ae wrote:

Can you provide an example, what another driver needs such method? Maybe it is enough to fix the problem locally in if_lagg for first time?

I think that if you grep for if_clone_simple in sys/net and pick any result, then most likely it would be affected.
For example, if_tap.c and SIOCGIFSTATUS. tapifioctl() accesses if_softc while tap_clone_destroy -> tap_destroy can free it.

ae added a comment.EditedTue, Dec 4, 11:25 AM

I'm inclined to agree. For tunneling interfaces I used global per-driver sx lock to prevent concurrent ioctl invocation. I think these drivers are not affected by this problem.

avg added a comment.Tue, Dec 4, 12:13 PM
In D18420#392363, @ae wrote:

I'm inclined to agree. For tunneling interfaces I used global per-driver sx lock to prevent concurrent ioctl invocation. I think these drivers are not affected by this problem.

Do you mean how, for example, gif_ioctl_sx is used in if_gif.c?
I agree that that solves the problem.
Maybe it's easier to go that way in other drivers too.

ae accepted this revision.Thu, Dec 6, 11:59 AM

I think such method can be useful. Do you plan to merge it?

bz added a comment.Thu, Dec 6, 4:37 PM
In D18420#392922, @ae wrote:

I think such method can be useful. Do you plan to merge it?

How would you merge it without breaking the ifnet KBI? We've got 3rd party network interfaces in ports, etc...

ae added a comment.Thu, Dec 6, 8:08 PM

How would you merge it without breaking the ifnet KBI? We've got 3rd party network interfaces in ports, etc...

There is several spare fields, that can be used.

avg added a comment.Sat, Dec 15, 1:46 PM

At work we decided to use the approach of if_gif / if_gre.
So, we don't really need this change.
But if people find it useful (e.g. @ae) and no one objects (looking at @bz), then I would like to commit this change and to merge if we have a spare field.