Page MenuHomeFreeBSD

getfhat, fhlink, fhlinkat, fhreadlink: new file handle system calls
ClosedPublic

Authored by jack_gandi.net on Nov 27 2018, 4:17 PM.

Details

Summary

motivation: for a NFS userspace server (nfs-ganesha) we need to manipulate file handles, thus far we have interacted with the kernel through a custom kernel module. Now we are upstreaming these syscalls for better integration for everyone.

manpages: the new manpages for fhreadlink(2) and fhlink(2) we copied from readlink(2) and link(2) respectively, with some additional errno values.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
sys/kern/vfs_syscalls.c
4130 ↗(On Diff #51220)

Blank line after nonexistent local variables.

4174 ↗(On Diff #51220)

This function can be static.

4192 ↗(On Diff #51220)

This indentation doesn't look like it follows style(9)

4199 ↗(On Diff #51220)

No line break

4223 ↗(On Diff #51220)

Newline before function name.

4244 ↗(On Diff #51220)

This can be static.

(I did not repeated the style notes, please look at the style violations mentioned through the whole patch).

lib/libc/sys/Symbol.map
90 ↗(On Diff #51220)

Add context to the diff. Even without the context, I am sure that you put the symbols into the wrong namespace.

sys/kern/vfs_syscalls.c
4145 ↗(On Diff #51220)

Do not put initialization into local vars declaration.

That said, error var is pointless, just do return (kern_getfhat());.

4168 ↗(On Diff #51220)

There must be a blank line after '{' if there is no locals.

4171 ↗(On Diff #51220)

Continuation line should use +4 spaces.

4188 ↗(On Diff #51220)

if (path != NULL)

4271 ↗(On Diff #51220)

This comment is pointless. Either create a helper function used by both, or remove it.

Here's some more context to the diff. I wasn't aware of style(9), will check that before submitting in the future.

lib/libc/sys/Symbol.map
90 ↗(On Diff #51220)

Should I add a FBSD_1.6 section then?

jack_gandi.net added inline comments.
lib/libc/sys/Symbol.map
90 ↗(On Diff #51220)

nevermind, I just realized https://wiki.freebsd.org/AddingSyscalls says to put them in the most recent namespace (FBSD_1.5)

jack_gandi.net marked 17 inline comments as done.
jack_gandi.net marked 6 inline comments as done.
lib/libc/sys/Symbol.map
395 ↗(On Diff #51282)

No, FBSD_1.5 is wrong again. New 13-CURRENT symbols must go into FBSD_1.6.

sys/kern/vfs_syscalls.c
4271 ↗(On Diff #51220)

So I actually mean that duplication of the code is wrong, and I really do not see why do we need two copies. kern_linkat() after the first namei() is indeed exact copy, so why not add the helper ?

4148 ↗(On Diff #51282)

Blank line is needed after '{' since there are no local vars.

4174 ↗(On Diff #51282)

Wrong indent. Please review all the code for style issues that were already pointed out.

4193 ↗(On Diff #51282)

Put the binary op designator before the split.

flags & AT_BENEATH is not boolean, use (flags & AT_BENEATH) != 0. Same for AT_SYMLINK_NOFOLLOW.

4204 ↗(On Diff #51282)

LR_RETRY is allowed to return doomed vnode. You cannot access v_mount or call any VOP method on it.

That said, I find having the path == NULL case allowed for userspace somewhat strange. We do not allow NULL path for open(2), and we have getfh(2). I understand that this case allows to implement sys_gethf(), I am more about checking for path == NULL in getfhat().

4226 ↗(On Diff #51282)

Again blank line after '{'.

4377 ↗(On Diff #51282)

Same question there, IMO it is worth introducing the helper and avoid copy/paste.

In this diff there is no more code duplication. also I forgot to include the two new manpages in the previous versions (fhlink.2 and fhreadlink.2).

jack_gandi.net added inline comments.
sys/kern/vfs_syscalls.c
4271 ↗(On Diff #51220)

I've added a helper function, the code was not _exactly_ the same so I had to change the tag/goto into a do while/EAGAIN, also has to add a locked variable. If these changes are ok by you I think it's much cleaner now.

4204 ↗(On Diff #51282)

I agree the path = NULL case shouldn't be allowed for getfh, so I moved the logic into sys_getfhat.

sys/kern/capabilities.conf
472 ↗(On Diff #51359)

getfh is not enabled in cap mode, why do you enable getfhat ?

sys/kern/vfs_syscalls.c
113 ↗(On Diff #51359)

Wrong indent for continuation line.

115 ↗(On Diff #51359)

There too.

1513 ↗(On Diff #51359)

Blank line is not needed.

1518 ↗(On Diff #51359)

Blank line is not needed.

1524 ↗(On Diff #51359)

See the comment about vn_XXX namespace.

1525 ↗(On Diff #51359)

Wrong indentation for the continuation line

1538 ↗(On Diff #51359)

This call locks vnodes internally. Your use of vn_linkat() while the vp is locked can cause lock recursion at least. Unfortunately WITNESS cannot detect the issues because vnode locks have same names.

2533 ↗(On Diff #51359)

vn_XXX namespace is for functions in vfs_vnops.c. It is somewhat strange to use it for vfs_syscalls.c

2534 ↗(On Diff #51359)

Again indent.

2540 ↗(On Diff #51359)

I recomment to put an assert that vp is locked there.

It seems that you moved vput(vp) out of the helper's code to both callers. Is this done to make the vn_readlink() interface consistent (locked vnode in/locked vnode out) ?

2542 ↗(On Diff #51359)

No need to put {} around single line.

4283 ↗(On Diff #51359)

Stray blank line, more such lines below.

4316 ↗(On Diff #51359)

Remove all blank lines from the function starting with this one, they are not needed/allowed by style.

lib/libc/sys/fhlink.2
42 ↗(On Diff #51359)

s/link/fhlink/ ?

lib/libc/sys/getfh.2
190 ↗(On Diff #51359)

I'd use "points to an invalid address." here.

jack_gandi.net added inline comments.
sys/kern/vfs_syscalls.c
2533 ↗(On Diff #51359)

I wasn't sure about the naming, what would you recommend?

jack_gandi.net added inline comments.
sys/kern/vfs_syscalls.c
2540 ↗(On Diff #51359)

yes, I thought it would be cleaner to let the caller of vn_readlink manage the ressource

jack_gandi.net added inline comments.
sys/kern/vfs_syscalls.c
2540 ↗(On Diff #51359)

It seems that mac_vnode_check_readlink() already performs an ASSERT_VOP_LOCKED() within.

sys/kern/vfs_syscalls.c
2533 ↗(On Diff #51359)

kern_readlink_vp() for instance

2540 ↗(On Diff #51359)

mac_ stuff is conditional. Adding asserts at the start of the functions document their pre-conditions. We do not rely on asserts in the VOP_READLINK() wrapper, for similar reasons.

jack_gandi.net added inline comments.
sys/kern/vfs_syscalls.c
1538 ↗(On Diff #51359)

we perform a vput() before the first EAGAIN, the second EAGAIN will never be reached because the locked case will match the previous else is. so all shold be fine? maybe i could add a comment

jack_gandi.net edited the test plan for this revision. (Show Details)

temporary pathc, I'm still looking into the lock recursion that @kib has mentionned.

I was able to reproduce the lock recursion in the case where the source fhp points to the same object as the destination link. It turns ou the link(2) doesn't lock the source node which seems acceptable. So I changed fhlinkat_kern to only LK_SHARED on the source node, therefore the kern_linkat_vp doesn't need the bool locked anymore because all the callers have the same input locking state.

I was able to reproduce the lock recursion in the case where the source fhp points to the same object as the destination link. It turns ou the link(2) doesn't lock the source node which seems acceptable. So I changed fhlinkat_kern to only LK_SHARED on the source node, therefore the kern_linkat_vp doesn't need the bool locked anymore because all the callers have the same input locking state.

LK_SHARED is still locking the vnode, although in the shared mode. So from the brief look at the updated patch, you are leaking the vnode locks.

Do you run the tests with WITNESS + DEBUG_VFS_LOCKS kernel options ?

In D18359#391303, @kib wrote:

I was able to reproduce the lock recursion in the case where the source fhp points to the same object as the destination link. It turns ou the link(2) doesn't lock the source node which seems acceptable. So I changed fhlinkat_kern to only LK_SHARED on the source node, therefore the kern_linkat_vp doesn't need the bool locked anymore because all the callers have the same input locking state.

LK_SHARED is still locking the vnode, although in the shared mode. So from the brief look at the updated patch, you are leaking the vnode locks.

Do you run the tests with WITNESS + DEBUG_VFS_LOCKS kernel options ?

I wasn't aware of these options nor of potential leaks, mea culpa. I'll look into all this and upload a new patch, but maybe after the weekend.

No more locks, no more leaks as far as I can tell

sys/kern/vfs_syscalls.c
1536 ↗(On Diff #51522)

I do not see how can you claim that there is no longer the lock order/lock recursion issue. namei() locks the ni_vp vnode internally, and if LOCK_LEAF flag is not provided, it unlocks the vnode before returning. You simply cannot call namei() while any vnode is locked, it is very basic rule of our VFS.

4317 ↗(On Diff #51522)

Why exclusive lock ?

jack_gandi.net added inline comments.
sys/kern/vfs_syscalls.c
1536 ↗(On Diff #51522)

linkat doesn't LOCK_LEAF on the source node either so I've just added a VOP_UNLOCK as I haven't found a way to get VFS_FHTOVP to unlock before returning, the documenation is outdated and doesn't even show the flags argument to the macro.

4317 ↗(On Diff #51522)

you're right, readlinkat(2) does LOCKSHARED . behaviour should be the same with`fhreadlinkat`

This revision is now accepted and ready to land.Dec 4 2018, 3:40 PM

Peter. could you please take the diff and run it through stress2 ?

(You need to rebuild syscall tables in sys/kern and sys/compat/freebsd32 after applying the patch: make sysent).

lib/libc/sys/getfh.2
149 ↗(On Diff #51581)

This looks incomplete.

198 ↗(On Diff #51581)

Missing a period.

This revision now requires review to proceed.Dec 5 2018, 9:30 AM

I have run all of the stress2 tests with D18359.id51596.diff without seeing any problems.

lib/libc/sys/fhlink.2
2 ↗(On Diff #51596)

I am sure that UCB has nothing to do with the new file. Please update copyright with the proper reference (to your personally or to Gandi). Same for fhreadlink.

BTW, preferred license for the new files is two clause BSD, see share/examples/etc/bsd-style-copyright. Please consider using it.

jack_gandi.net marked an inline comment as done.
lib/libc/sys/getfh.2
3 ↗(On Diff #51701)

You cannot remove other people license. You can add your copyright notice since you added substantial new content, but since the the original content is there, you must not touch the UCB statement.

Look at other files how multiple copyright notices are handled.

This revision was not accepted when it landed; it landed in state Needs Review.Dec 7 2018, 3:18 PM
This revision was automatically updated to reflect the committed changes.