Page MenuHomeFreeBSD

tun/tap: correct ref count on cloned cdevs
ClosedPublic

Authored by kib on Sep 28 2023, 10:20 AM.
Tags
None
Referenced Files
Unknown Object (File)
Dec 11 2024, 12:48 PM
Unknown Object (File)
Nov 15 2024, 3:44 PM
Unknown Object (File)
Nov 15 2024, 2:18 PM
Unknown Object (File)
Nov 15 2024, 2:12 PM
Unknown Object (File)
Nov 15 2024, 12:14 PM
Unknown Object (File)
Oct 3 2024, 3:22 PM
Unknown Object (File)
Sep 22 2024, 7:08 AM
Unknown Object (File)
Sep 19 2024, 7:58 AM

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

kib requested review of this revision.Sep 28 2023, 10:20 AM

I've never used the clone KPIs before, so please forgive my ignorance in asking a couple of basic questions:

--If clone_create() returns an existing cdev, how do we expect the new reference we've taken to later be released? I don't see direct calls to dev_rel(), and it's unclear how clone_cleanup() vs. destroy_dev() are supposed to be used in these cases.

--If clone_create() doesn't return an existing cdev (but does allocate a new unit and a new cdev on the clone list), and we then create a new tunX node through tun_create_device(), how does that newly-added cdev on the clone list related to the cdev we just created through tun_create_device()? And why do we need an additional reference on the device created by tun_create_device()?

sys/net/if_tuntap.c
820

Should we have been unconditionally referencing it upon creation here, rather than just for the one caller where there's an associated ucred?

Devfs clones is a way to handle (reserve) unit numbers. It seems that phk decided that the least involved way to code it is to just keep whole cdev with the unit number somewhere (on the clone list). These clones are not referenced, they exist by mere fact being on the clone list. When device driver allocates clone, it must make it fully correct, including the ref count.

References on cdev protect freeing of the device memory, they do not determine the lifecycle of the device. Device is created with make_dev() and destroyed with destroy_dev(), the later does not free the memory and does not even drop a reference. Devfs nodes are managed out of the driver context, by combination of dev_clone eventhandler and devfs_populate_loop() top-level code. Eventhandler is supposed to return device with additional reference to protect against parallel populate loop, and loop is the code which usually dereferences the last ref on destroyed (in destroy_dev() sense) device.

So typical driver does not need to manage dev_ref()/dev_rel() except for initial device creation, where clones and dev_clone context add some complications.

sys/net/if_tuntap.c
820

The device is referenced there (MAKEDEV_REF) since it is created from dev_clone context. But not for device picked up from clone list.

sys/net/if_tuntap.c
820

Right, so do we actually need an extra ref in that case? It's not clear to me that the added ref in the tunclone case shouldn't be scoped to clone_create returning a pre-existing device

In D42008#958212, @kib wrote:

Devfs clones is a way to handle (reserve) unit numbers. It seems that phk decided that the least involved way to code it is to just keep whole cdev with the unit number somewhere (on the clone list). These clones are not referenced, they exist by mere fact being on the clone list. When device driver allocates clone, it must make it fully correct, including the ref count.

References on cdev protect freeing of the device memory, they do not determine the lifecycle of the device. Device is created with make_dev() and destroyed with destroy_dev(), the later does not free the memory and does not even drop a reference. Devfs nodes are managed out of the driver context, by combination of dev_clone eventhandler and devfs_populate_loop() top-level code. Eventhandler is supposed to return device with additional reference to protect against parallel populate loop, and loop is the code which usually dereferences the last ref on destroyed (in destroy_dev() sense) device.

So typical driver does not need to manage dev_ref()/dev_rel() except for initial device creation, where clones and dev_clone context add some complications.

Thanks for the explanation! The event handler and its interaction with devfs_lookup() is the part I was missing, so the need for a reference on an existing clone makes sense then. But, just as with Kyle's question below, I'm still having trouble understanding why we need the additional ref on the cdev created by tun_create_device().

From the original PR it also sounds as though this sort of refcounting issue is a common problem with drivers that use the clone facility? Could clone_create() be changed to automatically add the reference to an existing device, or perhaps a wrapper around clone_create() that does this automatically? Or would that merely create different complications elsewhere?

I propose to leave the clone_create() alone. I did the investigation, and results are:

solo% git grep -e clone_create -- sys | grep -E '[^a-zA-Z0-9_]clone_create[^a-zA-Z0-9_]' | grep -vF kern_conf.c | grep -vF sys/conf.h
sys/dev/vkbd/vkbd.c:    if (clone_create(&vkbd_dev_clones, &vkbd_dev_cdevsw, &unit, dev, 0))
sys/net/if_tuntap.c:    i = clone_create(&drv->clones, &drv->cdevsw, &unit, &dev, 0);
sys/net/if_tuntap.c:    i = clone_create(&drv->clones, &drv->cdevsw, &u, dev, 0);

So I propose to add a fix to vkbd.c and then eventually somebody would kill clone_create() for better.

This revision was not accepted when it landed; it landed in state Needs Review.Oct 9 2023, 11:38 PM
This revision was automatically updated to reflect the committed changes.
sys/net/if_tuntap.c
820

I never actually got an answer to this one, but I see now that the dev_clone caller will release the extra reference in all paths

sys/net/if_tuntap.c
820

I thought that my initial (long) response provided the answer. It may be that it was not explicit enough, sorry then.