Page MenuHomeFreeBSD

devfs: rework si_usecount to track opens
ClosedPublic

Authored by mjg on Jul 10 2020, 6:08 PM.

Details

Summary

Since open/close don't always match up in the vfs layer, add a counter to dirent. Should the vnode get doomed it provides information of how much should be subtracted from si_usecount.

It is race-compatible with previous code with one exception: transient v_usecount changes (e.g., exporting tty from sysctl) will no longer interfere. Other problems remain, for example someone else opening the tty for reading will prevent devfs_close from cleaning up in the session. I consider this beyond the scope of the patch.

This allows for much welcome removal of VCHR special casing in vnode refcount management routines.

Test Plan

will ask pho

Running various things (tmux, sshd) and having vcount report on a mismatch between new and old method, the only mismatch found was coming from calls to revoke. There the vnode is not open, but it has a usecount of 1. Consequently the new method reports 0, while the old reports 1. This is fine, we just have to act on count > 0.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

mjg created this revision.Jul 10 2020, 6:08 PM
mjg requested review of this revision.Jul 10 2020, 6:08 PM
mjg added inline comments.Jul 10 2020, 6:09 PM
sys/kern/vfs_syscalls.c
4200 ↗(On Diff #74280)

this should be > 0

mjg updated this revision to Diff 74284.Jul 10 2020, 6:22 PM
mjg edited the test plan for this revision. (Show Details)
kib added inline comments.Jul 10 2020, 7:10 PM
sys/fs/devfs/devfs.h
156 ↗(On Diff #74284)

Explain in the comment why new flags fleld is added.

And actually what protects de_flags2 ? The vnode lock of de_vnode ?

mjg updated this revision to Diff 74287.Jul 10 2020, 7:25 PM
  • comment de_flags2
kib added inline comments.Jul 10 2020, 9:07 PM
sys/fs/devfs/devfs.h
157 ↗(On Diff #74287)

By 'associated' you mean de_vnode, right ? Why not say it directly ?

sys/fs/devfs/devfs_vnops.c
669 ↗(On Diff #74287)

Why VN_IS_DOOMED() test appeared there ? Wouldn't it cause useless (or panicing, not sure) calls to devfs_usecount_dec() ?

671 ↗(On Diff #74287)

The check for v_usecount == 1 was used to try to avoid issues with parallel opens and close sometimes mismatched or loose FLASTCLOSE calls. You patch makes the race wider, by postponing usecount_inc() after d_open() method returns (d_open()/d_close() often sleep).

bde tried to handle the race in a way that would be acceptable for drivers, mostly serial hardware, but it was too complex by requiring inspecting each tty driver. I hesitate to change this aspect of devfs without having some plan how to solve that race.

mjg updated this revision to Diff 74291.Jul 11 2020, 8:35 AM
mjg retitled this revision from devfs: rework si_usecount to track opened vnodes to devfs: rework si_usecount to track opens.
mjg edited the summary of this revision. (Show Details)
mjg edited the test plan for this revision. (Show Details)
  • rework to fix a bug where close could let the flag stay due to v_usecount > 1 and failing open would never clear it, making it remain when the device is in fact closed. see the description.
mjg updated this revision to Diff 75132.Wed, Jul 29, 4:24 PM
  • rebase
mjg added a comment.Thu, Jul 30, 4:04 PM

tested by pho

kib added inline comments.Sat, Aug 1, 8:21 PM
sys/fs/devfs/devfs_vnops.c
237 ↗(On Diff #75132)

For dereferencing v_data for devfs vnode, you either need to have the vnode lock, or devfs_de_interlock. Same is true for de->de_vnode.

If you own the vnode lock always there, you should assert it, otherwise interlock help is needed.

1497 ↗(On Diff #75132)

Why is this call needed ? Wouldn't you mechanism only needed for VCHR vnodes ?

BTW, I think that you should assert the vnode type in devfs_usecount_*().

mjg updated this revision to Diff 75603.Sat, Aug 8, 6:32 AM
  • stop relying on doomed state
  • only do the counting for VCHR
  • retire vcount in favor of devfs_usecount
kib added a comment.EditedSun, Aug 9, 9:47 PM

So as I understand, we never vref() devfs chr vnodes on open after that, right ? If yes, this means that forced vs. non-forced unmounts for devfs are same, unless I miss something.

Later note: so no, usecount is incremented by namei() and there is a window were v_usecount is incremented by de_usecount is not, and now vn_open_cred()/vn_open_vnode() strongly depend on the vnode lock not being dropped between lookup an VOP_OPEN(). Otherwise, e.g. revoke(2) is broken.

sys/fs/devfs/devfs_vnops.c
232 ↗(On Diff #75603)

Why do you need VI_LOCK() in all places covered by devfs_de_interlock ?
May be, declare that device usecount is protected by the de_interlock mtx ?

734 ↗(On Diff #75603)

If devfs_usecountl() == 1, doesn't it imply that de_usecount is also 1 ? Otherwise, how did we get close on this de/vp ?

Or do you think there is a race ? If not, I think it would be an interesting experiment to commit this with assert and see what happens in wild.

mjg added a comment.Mon, Aug 10, 1:57 AM

I don't think revoke situation is much of a problem -- it's userspace racing with itself. However, if need be, this can be restored to previous state with some churn -- devfs_revoke has a loop where it walks the entire vnode list, we can check that for any non-0 v_usecount. Also note the conditional dooming is already suspicious, perhaps we can just always do it and that will also take care of it.

sys/fs/devfs/devfs_vnops.c
232 ↗(On Diff #75603)

I need stable ->v_data which right now is either dev lock or vnode interlock.

734 ↗(On Diff #75603)

There can be no race here as both counters are always updated together with devfs_de_interlock held. The separate check is a leftover from previous code which had "updated the global count on 0<->1 transitions of local count" policy. I'll remove it. Said policy is beneficial to implement as it would save on global lock acquire in the common case, but runs into lock ordering issues with vnode interlock. Iow it's just some mess which can be changed later.

mjg added a comment.Mon, Aug 10, 10:39 AM

the patch was tested by pho, no issues

mjg updated this revision to Diff 75654.Tue, Aug 11, 2:05 AM
  • drop the spurious de->de_usecount == 1 check
kib added inline comments.Tue, Aug 11, 10:25 AM
sys/fs/devfs/devfs_vnops.c
232 ↗(On Diff #75603)

v_data is under both vnode lock and de_interlock, and _not_ protected by the vnode interlock. Look at devfs_reclaim.

mjg added a comment.Tue, Aug 11, 10:33 AM

That was a brainfart, I meant v_rdev.

kib added a comment.Tue, Aug 11, 11:12 AM
In D25612#577266, @mjg wrote:

That was a brainfart, I meant v_rdev.

But do you need v_rdev ? de->de_cdp.cdp_c is it.

That said, the intent for v_rdev was always vnode lock.

sys/fs/devfs/devfs_vnops.c
295 ↗(On Diff #75654)

static ?

319 ↗(On Diff #75654)

Can this become static ?

mjg added a comment.EditedTue, Aug 11, 11:51 AM

So later on I plan to restore the "only update the inside counter on local 1<->0 transitions" so that the code mostly gets away with the vnode interlock (and not taking devfs_de_interlock), thus I don't want to make any changes which make it harder. I'm not doing it in this patch as it requires a little hairy relocking and I may end up changing the locking scheme a little bit. This change is in my opinion a good enough logical step on its own with focus on removing usecount management from bowels of the vfs layer.

sys/fs/devfs/devfs_vnops.c
295 ↗(On Diff #75654)

will make it static

319 ↗(On Diff #75654)

unref is called from places outside of the file, I think it would be weird to have a mismatch here

kib accepted this revision.Tue, Aug 11, 1:56 PM
This revision is now accepted and ready to land.Tue, Aug 11, 1:56 PM
This revision was automatically updated to reflect the committed changes.