Details

Reviewers

Summary

This is a follow up to the e-mail thread about vtryrecycle.

It incorrectly holds a vnode and allows the usecount to grow and if it happens it aborts recycle leaving the other thread with an incorrectly held vnode. The fix is to do everything with the vnode interlock held. For this to happen the following provisions were added:

VOP_LOCK can now ignore the interlock
vn_start_write_nb is added. it guarantees no locks will be taken. Note this effectively depends on the switch to atomics (or per-cpu) as trylocking on the mnt interlock would be very likely to fail due to contention. This also changes semantics a little bit in that non-suspendable mount points get returned with a reference held. This can be augmented later to avoid refing them in the first place OR dropping the opt out altogether, now that the bottleneck is fixed. Semantics change is needed since by the time we determine the mount point is not suspendable per-cpu operation can be disabled and dropping the ref would require taking the lock.

The same treatment can be made in vlrureclaim to avoid the need to re-check anything.

Also note the treatment in unionfs already looks incorrect due to the missing interlock in the common case. For nonblocking case we go in with the interlock (can add pre/post assert routines for it).

Test Plan

will ask pho, passes suj11 for me

Diff Detail

Lint

Lint Skipped

Unit

Tests Skipped

Event Timeline

mjg created this revision.Sep 15 2019, 8:02 PM

Herald added a subscriber: imp. · View Herald TranscriptSep 15 2019, 8:02 PM

Harbormaster completed remote builds in B26510: Diff 62131.Sep 15 2019, 8:02 PM

mjg edited the summary of this revision. (Show Details)Sep 15 2019, 8:03 PM

mjg edited the summary of this revision. (Show Details)

vdropl before vn_finished_write

Why do you need vn_start_write_nb() when vn_start_write() already takes the flag argument ?

rebase

plug a blatant mp ref leak

I think it's cleaner to handle this separately mostly because this function cannot do any clean up so it behaves quite differently compared to vn_start_write. If you insist I can fold it in.

In D21667#472964, @mjg wrote:

I think it's cleaner to handle this separately mostly because this function cannot do any clean up so it behaves quite differently compared to vn_start_write. If you insist I can fold it in.

I highly dislike growing the VFS interface in ah-hoc manner with the only cause of fixing some self-inflicted bug. Please use a flag to select this behavior, the flag might as well just call into the static vn_start_write_nb().

make vn_start_write_nb static
add a comment explaining error handling
assert LK_NOWAIT passed if LK_IGNORE_INTERLOCK is used

kib accepted this revision.Sep 17 2019, 8:19 PM

kib added inline comments.

sys/kern/vfs_subr.c
1441	There should be no space before label.
sys/kern/vfs_vnops.c
1672	However
1678	!= 0 there and line below.

This revision is now accepted and ready to land.Sep 17 2019, 8:19 PM

Turns out there is a bug with in handling of nullfs and unionfs. Their routines may end up taking the interlock. Taking care of nullfs is easy, but unionfs requires a little more work. Found by pho.

Fixed as a side effect of D22997

vfs: fix incorrect hold in vtryrecycle
AbandonedPublic
Actions

Details

Diff Detail

Event Timeline

Revision Contents
Changeset List

Diff 62214

sys/fs/nullfs/null_vnops.c

sys/fs/unionfs/union_vnops.c

sys/kern/vfs_default.c

sys/kern/vfs_subr.c

sys/kern/vfs_vnops.c

sys/kern/vnode_if.src

sys/sys/lockmgr.h

sys/sys/vnode.h

sys/ufs/ffs/ffs_snapshot.c

sys/ufs/ffs/ffs_vnops.c

vfs: fix incorrect hold in vtryrecycleAbandonedPublicActions

Details

Diff Detail

Event Timeline

Revision ContentsChangeset List

Diff 62214

sys/fs/nullfs/null_vnops.c

sys/fs/unionfs/union_vnops.c

sys/kern/vfs_default.c

sys/kern/vfs_subr.c

sys/kern/vfs_vnops.c

sys/kern/vnode_if.src

sys/sys/lockmgr.h

sys/sys/vnode.h

sys/ufs/ffs/ffs_snapshot.c

sys/ufs/ffs/ffs_vnops.c

vfs: fix incorrect hold in vtryrecycle
AbandonedPublic
Actions

Revision Contents
Changeset List