Page MenuHomeFreeBSD

if_tuntap: Try to fix device refcount bugs
ClosedPublic

Authored by markj on Jul 25 2025, 2:05 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Oct 11, 7:29 AM
Unknown Object (File)
Sat, Oct 11, 7:29 AM
Unknown Object (File)
Sat, Oct 11, 7:29 AM
Unknown Object (File)
Sat, Oct 11, 7:29 AM
Unknown Object (File)
Sat, Oct 11, 7:29 AM
Unknown Object (File)
Sat, Oct 11, 12:11 AM
Unknown Object (File)
Wed, Sep 24, 2:20 AM
Unknown Object (File)
Sep 18 2025, 11:04 AM

Details

Summary

There are two ways to create a tun device, via devfs cloning or with
ifconfig. The latter is implemented by tun_clone_create() and the
former by tunclone(), which invokes tun_clone_create() via
if_clone_create(). Both of these functions were invoking dev_ref()
after creating the devfs_node(), but this was extraneous. tunclone()
does need to acquire an extra reference since this is required by the
dev_clone EVENTHANDLER interface contract, but it was already doing so
by specifying MAKEDEV_REF. Fix this by removing unnecessary refcount
acquisitions.

A second problem is with teardown in a VNET jail. A tun interface
created by device cloning will hold a credential reference for the jail,
which prevents it from being destroyed and in particular prevents VNET
SYSUNINITs from running. To fix this, we need to register a
PR_METHOD_REMOVE callback for jail teardown which, in a VNET jail,
destroys cloned interfaces. This way, jail teardown can proceed.

These bugs are noticeable with something like

  1. jail -c name=test vnet command=ls /dev/tun
  2. jls -vd

Without this patch, the jail gets stuck in the DYING state.

While here, add some comments and be sure to destroy a nascent mutex and
CV in an error path.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

markj requested review of this revision.Jul 25 2025, 2:05 PM
sys/net/if_tuntap.c
830

The dev_clone() callers of make_dev() must use MAKEDEF_REF, since otherwise device might be exposed and dereferenced before the dev_clone() method has a chance to increment the refcounter.

sys/net/if_tuntap.c
610–613

I shouldn't have changed the control flow here. We should still call dev_ref() unconditionally.

830

I don't really follow. I understand it like this:

  • make_dev_s() returns a device with refcount 1, released when the tun device is destroyed
  • the dev_clone caller expects the returned device to have an extra reference, which the caller consumes

So, tunclone() should return with the cdev refcount equal to 2. Without this patch, it sets MAKEDEV_REF _and_ tunclone() calls dev_ref(), so it returns with si_refcount == 3. So, I removed the MAKEDEV_REF flag. In particular, having it be conditional on cr != NULL is confusing, it seems better to just use an explicit dev_ref() with a comment.

Unconditionally acquire a ref in tunclone().

sys/net/if_tuntap.c
830

make_dev() returning is too late. The problem is that destroy_dev() can legitimately race with creation, from the moment the pointer to the returned cdev pointer is published, and the moment make_dev() drops the dev_mtx, the only reference on the device might be gone.

I do believe that this is relevant for the tun case, because tun is registered as clone (in devfs sense), and clones might be cleaned up async. It could be even worse, because normally publishing the cdev pointer is the last action of the larger object creation, but tunclone() also does if_clone_create() which might run in parallel with destroy_dev().

  • In tunclone() only acquire an extra ref if the device already existed.
  • Re-add the MAKEDEV_REF flag.
This revision is now accepted and ready to land.Jul 28 2025, 3:48 PM
This revision was automatically updated to reflect the committed changes.