Page MenuHomeFreeBSD

devfs: rework si_usecount to track opens
ClosedPublic

Authored by mjg on Jul 10 2020, 6:08 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mon, Jan 13, 4:07 AM
Unknown Object (File)
Fri, Jan 10, 12:21 AM
Unknown Object (File)
Thu, Jan 9, 12:30 PM
Unknown Object (File)
Tue, Jan 7, 1:03 AM
Unknown Object (File)
Sun, Jan 5, 12:16 PM
Unknown Object (File)
Sun, Jan 5, 11:23 AM
Unknown Object (File)
Sun, Jan 5, 11:04 AM
Unknown Object (File)
Fri, Jan 3, 10:57 AM
Subscribers

Details

Summary

Since open/close don't always match up in the vfs layer, add a counter to dirent. Should the vnode get doomed it provides information of how much should be subtracted from si_usecount.

It is race-compatible with previous code with one exception: transient v_usecount changes (e.g., exporting tty from sysctl) will no longer interfere. Other problems remain, for example someone else opening the tty for reading will prevent devfs_close from cleaning up in the session. I consider this beyond the scope of the patch.

This allows for much welcome removal of VCHR special casing in vnode refcount management routines.

Test Plan

will ask pho

Running various things (tmux, sshd) and having vcount report on a mismatch between new and old method, the only mismatch found was coming from calls to revoke. There the vnode is not open, but it has a usecount of 1. Consequently the new method reports 0, while the old reports 1. This is fine, we just have to act on count > 0.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

mjg requested review of this revision.Jul 10 2020, 6:08 PM
sys/kern/vfs_syscalls.c
4200

this should be > 0

mjg edited the test plan for this revision. (Show Details)
sys/fs/devfs/devfs.h
156

Explain in the comment why new flags fleld is added.

And actually what protects de_flags2 ? The vnode lock of de_vnode ?

sys/fs/devfs/devfs.h
157

By 'associated' you mean de_vnode, right ? Why not say it directly ?

sys/fs/devfs/devfs_vnops.c
669

Why VN_IS_DOOMED() test appeared there ? Wouldn't it cause useless (or panicing, not sure) calls to devfs_usecount_dec() ?

671

The check for v_usecount == 1 was used to try to avoid issues with parallel opens and close sometimes mismatched or loose FLASTCLOSE calls. You patch makes the race wider, by postponing usecount_inc() after d_open() method returns (d_open()/d_close() often sleep).

bde tried to handle the race in a way that would be acceptable for drivers, mostly serial hardware, but it was too complex by requiring inspecting each tty driver. I hesitate to change this aspect of devfs without having some plan how to solve that race.

mjg retitled this revision from devfs: rework si_usecount to track opened vnodes to devfs: rework si_usecount to track opens.
mjg edited the summary of this revision. (Show Details)
mjg edited the test plan for this revision. (Show Details)
  • rework to fix a bug where close could let the flag stay due to v_usecount > 1 and failing open would never clear it, making it remain when the device is in fact closed. see the description.
sys/fs/devfs/devfs_vnops.c
237

For dereferencing v_data for devfs vnode, you either need to have the vnode lock, or devfs_de_interlock. Same is true for de->de_vnode.

If you own the vnode lock always there, you should assert it, otherwise interlock help is needed.

1464

Why is this call needed ? Wouldn't you mechanism only needed for VCHR vnodes ?

BTW, I think that you should assert the vnode type in devfs_usecount_*().

  • stop relying on doomed state
  • only do the counting for VCHR
  • retire vcount in favor of devfs_usecount

So as I understand, we never vref() devfs chr vnodes on open after that, right ? If yes, this means that forced vs. non-forced unmounts for devfs are same, unless I miss something.

Later note: so no, usecount is incremented by namei() and there is a window were v_usecount is incremented by de_usecount is not, and now vn_open_cred()/vn_open_vnode() strongly depend on the vnode lock not being dropped between lookup an VOP_OPEN(). Otherwise, e.g. revoke(2) is broken.

sys/fs/devfs/devfs_vnops.c
232

Why do you need VI_LOCK() in all places covered by devfs_de_interlock ?
May be, declare that device usecount is protected by the de_interlock mtx ?

668–671

If devfs_usecountl() == 1, doesn't it imply that de_usecount is also 1 ? Otherwise, how did we get close on this de/vp ?

Or do you think there is a race ? If not, I think it would be an interesting experiment to commit this with assert and see what happens in wild.

I don't think revoke situation is much of a problem -- it's userspace racing with itself. However, if need be, this can be restored to previous state with some churn -- devfs_revoke has a loop where it walks the entire vnode list, we can check that for any non-0 v_usecount. Also note the conditional dooming is already suspicious, perhaps we can just always do it and that will also take care of it.

sys/fs/devfs/devfs_vnops.c
232

I need stable ->v_data which right now is either dev lock or vnode interlock.

668–671

There can be no race here as both counters are always updated together with devfs_de_interlock held. The separate check is a leftover from previous code which had "updated the global count on 0<->1 transitions of local count" policy. I'll remove it. Said policy is beneficial to implement as it would save on global lock acquire in the common case, but runs into lock ordering issues with vnode interlock. Iow it's just some mess which can be changed later.

the patch was tested by pho, no issues

  • drop the spurious de->de_usecount == 1 check
sys/fs/devfs/devfs_vnops.c
232

v_data is under both vnode lock and de_interlock, and _not_ protected by the vnode interlock. Look at devfs_reclaim.

That was a brainfart, I meant v_rdev.

In D25612#577266, @mjg wrote:

That was a brainfart, I meant v_rdev.

But do you need v_rdev ? de->de_cdp.cdp_c is it.

That said, the intent for v_rdev was always vnode lock.

sys/fs/devfs/devfs_vnops.c
295

static ?

319

Can this become static ?

So later on I plan to restore the "only update the inside counter on local 1<->0 transitions" so that the code mostly gets away with the vnode interlock (and not taking devfs_de_interlock), thus I don't want to make any changes which make it harder. I'm not doing it in this patch as it requires a little hairy relocking and I may end up changing the locking scheme a little bit. This change is in my opinion a good enough logical step on its own with focus on removing usecount management from bowels of the vfs layer.

sys/fs/devfs/devfs_vnops.c
295

will make it static

319

unref is called from places outside of the file, I think it would be weird to have a mismatch here

This revision is now accepted and ready to land.Aug 11 2020, 1:56 PM
This revision was automatically updated to reflect the committed changes.