Page MenuHomeFreeBSD

lockf: ensure atomicity of lockf for open(O_CREAT|O_EXCL|O_EXLOCK)
ClosedPublic

Authored by kib on Tue, Feb 16, 1:37 AM.

Details

Summary

or EX_SHLOCK. Do it by setting a vnode iflag indicating that the locking exclusive open is in progress, and not allowing F_LOCK request to make a progress until the first open finishes.

NB. Perhaps this should be enabled only for local filesystems.

Test Plan

Peter has a test written to expose the problem.

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

kib requested review of this revision.Tue, Feb 16, 1:37 AM
kib created this revision.
sys/kern/vfs_vnops.c
287

So VOP_VPUT_PAIR may update vp, so that ndp->ni_vp != vp. Is it intentional?

330

Why do you use both ndp->ni_vp and vp here?

sys/sys/vnode.h
256
kib marked 3 inline comments as done.Tue, Feb 16, 2:45 PM
kib added inline comments.
sys/kern/vfs_vnops.c
287

Except for sillyness in clearing VI_FOPENING that you pointed out, I do not think that code uses ndp->ni_vp at all in this path after assignment to vp.

Other pathes do use ni_vp, but not the path where VOP_VPUT_PAIR() is executed.

Am I missing something?

330

It was thinko from copy/paste.

kib marked 2 inline comments as done.

Fix use of nd_vp instead of vp in clearing of the first_open flag.
Grammar fix in comment.

sys/kern/vfs_default.c
435

So what about filesystems that do not use the default implementation? Should this be done in a vop_advlock_pre() routine?

sys/kern/vfs_vnops.c
287

I think it is ok now, I was just trying to see if VI_FOPENING may be leaked somehow.

kib marked an inline comment as done.Tue, Feb 16, 4:02 PM
kib added inline comments.
sys/kern/vfs_default.c
435

No I do not think so. The idea of atomicity for O_EXCL|O_EXLOCK does not make sense for remote filesystems, because we cannot enforce anything on protocol, and they are exactly the filesystems that use non-default vop_advlock implementation.

This is why the comment about local filesystems in the commit message.
I.e. I would rather add the check for MNT_LOCAL and do nothing if not set, but so far it seems to be not needed.

markj added inline comments.
sys/kern/vfs_default.c
435

There is also FUSE. The latter falls back to stdadvlock in some cases, not sure if it coincides exactly with remote vs. local.

This revision is now accepted and ready to land.Tue, Feb 16, 4:15 PM

Overall looks good.

sys/kern/vfs_default.c
435

I would suggest either putting a KASSERT that MNT_LOCAL is set or add the MNT_LOCAL test if FUSE may use it on a non-local filesystem.

kib marked an inline comment as done.

Only wait for VI_FOPENING for local filesystems.

This revision now requires review to proceed.Wed, Feb 17, 8:29 AM
sys/kern/vfs_default.c
435

Is this an unlocked load?

kib marked an inline comment as done.Wed, Feb 17, 3:42 PM
kib added inline comments.
sys/kern/vfs_default.c
435

Yes, it is fine. The flag is never changed after mount.

markj added inline comments.
sys/kern/vfs_default.c
435

I meant, is it guaranteed that the reference to mp itself is safe? Mount points are type-stable at least, but the flags value may be stale, unless I misunderstood. It looks harmless but might deserve a comment, again assuming I did not misunderstand.

This revision is now accepted and ready to land.Wed, Feb 17, 4:13 PM
kib marked an inline comment as done.Wed, Feb 17, 4:44 PM
kib added inline comments.
sys/kern/vfs_default.c
435

In the worst case of unmount/momental mp reuse, we might see undeserved MNT_LOCAL and wait for the passing open when we should not.

Or we might not see MNT_LOCAL and then at unmount it would happen that lock is obtained while open is still processing. But in this case, open(2) would return a file descriptor referencing reclaimed vnode, otherwise mp cannot be reused. So no future operations with it is possible.