
if_tuntap: defer transient destroy_dev() to a taskqueue
Closed, Public

Authored by kevans on Oct 29 2025, 3:19 PM.

Details

Summary

We're in the dtor, so we can't destroy it now without deadlocking after
recent changes to make destroy_dev() provide a barrier. However, we
know there isn't any other dtor to run, so we can go ahead and clean up
our state and just prevent a use-after-free if someone races to open
the device while we're trying to destroy it.

While we're here, allow a destroy operation to proceed if we caught a
signal in cv_wait_sig() but tun_busy dropped to 0 while we were waiting
to acquire the lock.

This was more of an inherent design flaw than a bug in the commit
referenced below.

PR: 290575
Fixes: 4dbe6628179d ("devfs: make destroy_dev() a release [...]")

Diff Detail

Repository
rG FreeBSD src repository

Event Timeline

sys/net/if_tuntap.c
680
1167

What prevents a scenario where tunopen() is entered and the check for si_drv1 == dev fails because a parallel tun_destroy() has not yet set si_drv1 to dev, while tun_destroy() has meanwhile progressed to the point that si_drv1 is no longer a valid pointer to a tuntap_softc, and we read it there?

kevans marked an inline comment as done.

Dance around tunmtx to protect against concurrent tun_destroy() / tunopen()

destroy_dev_drain() after we've issued all of our tun_destroy() calls, just in
case the module unload destroyed a transient tunnel. I note that module unload
does not actually work at the moment, though, since the stub tun_mod/tap_mod
use a NULL modevent handler, which means modevent_nop returns EBUSY at
MOD_UNLOAD. This is an easy fix, but out of scope for this review.

sys/net/if_tuntap.c
1179

I understand this even less. If everything you do under the tunmtx is checking the flag, then you can check it without the lock as well.

Could it become simpler if we made two observations:

  • if you need to call destroy_dev() from cdevpriv destructor, then the destructor itself was not called in the context of destroy_dev(), but instead due to the close of a file
  • after destroy_dev_sched(), no new threads could enter any csw methods. Existing threads continue to run, of course.
sys/net/if_tuntap.c
1179

Oh, right- I missed that your point #1 is implemented by way of dev_refthread() checking CDP_SCHED_DTR... I'll chew on this a little bit more, because I don't immediately see how I drain the theoretical remaining thread in tunopen(), unless I switch it to something more like an epoch with a deferred-free.

Simplify a bit using the net epoch

Enter the net epoch in tunopen() and use that fact to defer reclamation of just
the bits needed by a concurrent open. Call NET_EPOCH_DRAIN_CALLBACKS() in our
SYSUNINIT now that the dtor might have scheduled a callback to free a softc.

sys/net/if_tuntap.c
1167

So I still do not understand why what I asked above is not possible, with the new code. You ought to explain it, perhaps as a comment to make it easily visible.

sys/net/if_tuntap.c
1167

Sure, let me explain here to be sure it's actually correct (and it may be that I'm missing a barrier around si_drv1 load/store to actually guarantee the ordering that I want): tun_destroy() sets si_drv1 = dev and subsequently schedules the softc to be freed in coordination with the net epoch, so that a parallel thread in tunopen can observe one of two states:

  • si_drv1 == dev: the softc will be freed imminently
  • si_drv1 != dev: we're in the middle of an epoch section, so even if we just missed the store to si_drv1 the free will still be deferred until sometime after we exited. The tunnel is TUN_DYING, so we'll EBUSY our way out of the epoch.

Still, add the comment please.

This revision is now accepted and ready to land. Nov 3 2025, 3:38 AM
sys/net/if_tuntap.c
1167

Don't you at least need read-once semantics for dev->si_drv1? That is, how do you know that the value tp is a valid pointer to the softc? It looks like this should be:

void *p;

p = atomic_load_ptr(&dev->si_drv1);
if (p == dev)
    return error;
...
tp = p;

Do a read-once load of si_drv1, just in case. Amend the comment to try and
describe our solution more completely

This revision now requires review to proceed. Nov 4 2025, 6:14 AM
This revision is now accepted and ready to land. Nov 4 2025, 1:26 PM

EPOCH_ENTER_PREEMPT() should be at least as strong as a compiler memory barrier, and it is indeed, due to its use of critical_enter(), so I do not think that si_drv1 can be re-loaded outside the epoch region. Of course, it does not hurt to use an explicit atomic_load_ptr().

In D53438#1223300, @kib wrote:

EPOCH_ENTER_PREEMPT() should be at least as strong as a compiler memory barrier, and it is indeed, due to its use of critical_enter(), so I do not think that si_drv1 can be re-loaded outside the epoch region. Of course, it does not hurt to use an explicit atomic_load_ptr().

I don't follow. Previously, both loads of dev->si_drv1 happened after entering an epoch read section.

The actions that are prevented by atomic_load_ptr() are basically arbitrary reloads of si_drv1 at places where the tp value is needed. Epoch, being fenced by compiler membar, already disallows that.

In D53438#1223311, @kib wrote:

The actions that are prevented by atomic_load_ptr() are basically arbitrary reloads of si_drv1 at places where the tp value is needed. Epoch, being fenced by compiler membar, already disallows that.

The flow of the code is:

epoch_enter();
load si_drv1 and check it
...
load si_drv1 and use it
...
epoch_exit();

so I don't see how epoch is relevant.


epoch_enter();
load si_drv1 into p and check it
...
tp = p;
...
epoch_exit();
use tp
In D53438#1223367, @kib wrote:

epoch_enter();
load si_drv1 into p and check it
...
tp = p;
...
epoch_exit();
use tp

The code was previously written like this:

NET_EPOCH_ENTER(et);
if (dev->si_drv1 == dev) {
        NET_EPOCH_EXIT(et);
        return (ENXIO);
}
...
tp = dev->si_drv1;
TUN_LOCK(tp);
...
NET_EPOCH_EXIT(et);

So what happens if tp == dev and we call TUN_LOCK(tp)?

sys/net/if_tuntap.c
686

I think my other concern here is: do I need to be paranoid enough to drop a release fence here to guarantee ordering between this and the later NET_EPOCH_CALL()? In particular, I'm wondering if the CPU could manage to reorder this past both that and the destroy_dev_sched() badly enough that one could enter the net epoch at just the wrong time and observe a state where the softc is being freed without this store having completed, but maybe epoch_call semantics mean that it isn't a realistic scenario.

sys/net/if_tuntap.c
686

Actually, I guess there's some locking in there that probably makes it not a realistic scenario


Ok, I agree.

sys/net/if_tuntap.c
686

Yes, you can rely on the fact that we take and drop (giving release semantics) dev_mtx in any form of destroy_dev().

sys/net/if_tuntap.c
686

I would perhaps use an atomic_store_ptr() here to complement the load and signal that the store is unsynchronized, but I agree that you don't need a fence.

sys/net/if_tuntap.c
686

OK- will turn it into an atomic_store_ptr before pushing