
if_tuntap: defer transient destroy_dev() to a taskqueue
Closed, Public

Authored by kevans on Oct 29 2025, 3:19 PM.

Details

Summary

We're in the dtor, so we can't destroy it now without deadlocking after
recent changes to make destroy_dev() provide a barrier. However, we
know there isn't any other dtor to run, so we can go ahead and clean up
our state and just prevent a use-after-free if someone races to open
the device while we're trying to destroy it.

While we're here, allow a destroy operation to proceed if we caught a
signal in cv_wait_sig() but tun_busy dropped to 0 while we were waiting
to acquire the lock.

This was more of an inherent design flaw than a bug in the commit
referenced below.

PR: 290575
Fixes: 4dbe6628179d ("devfs: make destroy_dev() a release [...]")

Diff Detail

Repository
rG FreeBSD src repository

Event Timeline

sys/net/if_tuntap.c
680
1167

What prevents a scenario where tunopen() is entered and the check for si_drv1 == dev fails because a parallel tun_destroy() has not yet set si_drv1 to dev, while tun_destroy() has meanwhile progressed to the point that si_drv1 is no longer a valid pointer to a tuntap_softc, and we read it there?

kevans marked an inline comment as done.

Dance around tunmtx to protect against concurrent tun_destroy() / tunopen()

destroy_dev_drain() after we've issued all of our tun_destroy() calls, just in
case the module unload destroyed a transient tunnel. I note that module unload
does not actually work at the moment, though, since the stub tun_mod/tap_mod
use a NULL modevent handler, which means modevent_nop returns EBUSY at
MOD_UNLOAD. This is an easy fix, but out of scope for this review.

sys/net/if_tuntap.c
1179

I understand this even less. If everything you do under the tunmtx is checking the flag, then you can check it without the lock as well.

Could it become simpler if we made two observations:

  • if you need to call destroy_dev() from cdevpriv destructor, then the destructor itself was not called in the context of destroy_dev(), but instead due to the close of a file
  • after destroy_dev_sched(), no new threads could enter any csw methods. Existing threads continue to run, of course.
sys/net/if_tuntap.c
1179

Oh, right- I missed that your point #1 is implemented by way of dev_refthread() checking CDP_SCHED_DTR... I'll chew on this a little bit more, because I don't immediately see how I drain the theoretical remaining thread in tunopen(), unless I switch it to something more like an epoch with a deferred-free.

Simplify a bit using the net epoch

Enter the net epoch in tunopen() and use that fact to defer reclamation of just
the bits needed by a concurrent open. Call NET_EPOCH_DRAIN_CALLBACKS() in our
SYSUNINIT now that the dtor might have scheduled a callback to free a softc.

sys/net/if_tuntap.c
1167

So I still do not understand why what I asked above is not possible, with the new code. You ought to explain it, perhaps as a comment to make it easily visible.

sys/net/if_tuntap.c
1167

Sure, let me explain here to be sure it's actually correct (and it may be that I'm missing a barrier around si_drv1 load/store to actually guarantee the ordering that I want): tun_destroy() sets si_drv1 = dev and subsequently schedules the softc to be freed in coordination with the net epoch, so that a parallel thread in tunopen can observe one of two states:

  • si_drv1 == dev: the softc will be freed imminently
  • si_drv1 != dev: we're in the middle of an epoch section, so even if we just missed the store to si_drv1 the free will still be deferred until sometime after we exited. The tunnel is TUN_DYING, so we'll EBUSY our way out of the epoch.

Still, add the comment please.

This revision is now accepted and ready to land. Nov 3 2025, 3:38 AM
sys/net/if_tuntap.c
1167

Don't you at least need read-once semantics for dev->si_drv1? That is, how do you know that the value tp is a valid pointer to the softc? It looks like this should be:

void *p;

p = atomic_load_ptr(&dev->si_drv1);
if (p == dev)
    return error;
...
tp = p;

Do a read-once load of si_drv1, just in case. Amend the comment to try and
describe our solution more completely

This revision now requires review to proceed. Nov 4 2025, 6:14 AM
This revision is now accepted and ready to land. Nov 4 2025, 1:26 PM

EPOCH_ENTER_PREEMPT() should be at least as strong as a compiler memory barrier, and it is indeed, due to its use of critical_enter(), so I do not think that si_drv1 can be re-loaded outside the epoch region. Of course, it does not hurt to use an explicit atomic_load_ptr().

In D53438#1223300, @kib wrote:

EPOCH_ENTER_PREEMPT() should be at least as strong as a compiler memory barrier, and it is indeed, due to its use of critical_enter(), so I do not think that si_drv1 can be re-loaded outside the epoch region. Of course, it does not hurt to use an explicit atomic_load_ptr().

I don't follow. Previously, both loads of dev->si_drv1 happened after entering an epoch read section.

The actions that are prevented by atomic_load_ptr() are basically arbitrary reloads of si_drv1 at places where the tp value is needed. Epoch, being fenced by compiler membar, already disallows that.

In D53438#1223311, @kib wrote:

The actions that are prevented by atomic_load_ptr() are basically arbitrary reloads of si_drv1 at places where the tp value is needed. Epoch, being fenced by compiler membar, already disallows that.

The flow of the code is:

epoch_enter();
load si_drv1 and check it
...
load si_drv1 and use it
...
epoch_exit();

so I don't see how epoch is relevant.


epoch_enter();
load si_drv1 into p and check it
...
tp = p;
...
epoch_exit();
use tp
In D53438#1223367, @kib wrote:

epoch_enter();
load si_drv1 into p and check it
...
tp = p;
...
epoch_exit();
use tp

The code was previously written like this:

NET_EPOCH_ENTER(et);
if (dev->si_drv1 == dev) {
        NET_EPOCH_EXIT(et);
        return (ENXIO);
}
...
tp = dev->si_drv1;
TUN_LOCK(tp);
...
NET_EPOCH_EXIT(et);

So what happens if tp == dev and we call TUN_LOCK(tp)?

sys/net/if_tuntap.c
686

I think my other concern here is: do I need to be paranoid enough to drop a release fence here to guarantee ordering between this and the later NET_EPOCH_CALL()? In particular, I'm wondering if the CPU could manage to reorder this past both that and the destroy_dev_sched() badly enough that one could enter the net epoch at just the wrong time and observe a state where the softc is being freed without this store having completed, but maybe epoch_call semantics mean that it isn't a realistic scenario.

sys/net/if_tuntap.c
686

Actually, I guess there's some locking in there that probably makes it not a realistic scenario


Ok, I agree.

sys/net/if_tuntap.c
686

Yes, you can rely on the fact that we take and drop (giving release semantics) dev_mtx in any form of destroy_dev().

sys/net/if_tuntap.c
686

I would perhaps use an atomic_store_ptr() here to complement the load and signal that the store is unsynchronized, but I agree that you don't need a fence.

sys/net/if_tuntap.c
686

OK- will turn it into an atomic_store_ptr before pushing