Page MenuHomeFreeBSD

getfhat, fhlink, fhlinkat, fhreadlink: new file handle system calls
ClosedPublic

Authored by jack_gandi.net on Nov 27 2018, 4:17 PM.
Referenced Files
Unknown Object (File)
Mon, Oct 28, 4:50 PM
Unknown Object (File)
Sun, Oct 27, 10:53 AM
Unknown Object (File)
Fri, Oct 25, 10:06 AM
Unknown Object (File)
Sun, Oct 20, 9:33 PM
Unknown Object (File)
Tue, Oct 15, 12:55 AM
Unknown Object (File)
Mon, Oct 14, 7:35 AM
Unknown Object (File)
Mon, Oct 14, 7:32 AM
Unknown Object (File)
Fri, Oct 11, 2:47 AM
Subscribers

Details

Summary

motivation: for a NFS userspace server (nfs-ganesha) we need to manipulate file handles, thus far we have interacted with the kernel through a custom kernel module. Now we are upstreaming these syscalls for better integration for everyone.

manpages: the new manpages for fhreadlink(2) and fhlink(2) we copied from readlink(2) and link(2) respectively, with some additional errno values.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
sys/kern/vfs_syscalls.c
4130

Blank line after nonexistent local variables.

4174

This function can be static.

4192

This indentation doesn't look like it follows style(9)

4199

No line break

4223

Newline before function name.

4244

This can be static.

(I did not repeated the style notes, please look at the style violations mentioned through the whole patch).

lib/libc/sys/Symbol.map
90

Add context to the diff. Even without the context, I am sure that you put the symbols into the wrong namespace.

sys/kern/vfs_syscalls.c
4145

Do not put initialization into local vars declaration.

That said, error var is pointless, just do return (kern_getfhat());.

4168

There must be a blank line after '{' if there is no locals.

4171

Continuation line should use +4 spaces.

4188

if (path != NULL)

4271

This comment is pointless. Either create a helper function used by both, or remove it.

Here's some more context to the diff. I wasn't aware of style(9), will check that before submitting in the future.

lib/libc/sys/Symbol.map
90

Should I add a FBSD_1.6 section then?

jack_gandi.net added inline comments.
lib/libc/sys/Symbol.map
90

nevermind, I just realized https://wiki.freebsd.org/AddingSyscalls says to put them in the most recent namespace (FBSD_1.5)

jack_gandi.net marked 17 inline comments as done.
jack_gandi.net marked 6 inline comments as done.
lib/libc/sys/Symbol.map
395

No, FBSD_1.5 is wrong again. New 13-CURRENT symbols must go into FBSD_1.6.

sys/kern/vfs_syscalls.c
4148

Blank line is needed after '{' since there are no local vars.

4174

Wrong indent. Please review all the code for style issues that were already pointed out.

4193

Put the binary op designator before the split.

flags & AT_BENEATH is not boolean, use (flags & AT_BENEATH) != 0. Same for AT_SYMLINK_NOFOLLOW.

4204

LR_RETRY is allowed to return doomed vnode. You cannot access v_mount or call any VOP method on it.

That said, I find having the path == NULL case allowed for userspace somewhat strange. We do not allow NULL path for open(2), and we have getfh(2). I understand that this case allows to implement sys_gethf(), I am more about checking for path == NULL in getfhat().

4226

Again blank line after '{'.

4271

So I actually mean that duplication of the code is wrong, and I really do not see why do we need two copies. kern_linkat() after the first namei() is indeed exact copy, so why not add the helper ?

4377

Same question there, IMO it is worth introducing the helper and avoid copy/paste.

In this diff there is no more code duplication. also I forgot to include the two new manpages in the previous versions (fhlink.2 and fhreadlink.2).

jack_gandi.net added inline comments.
sys/kern/vfs_syscalls.c
4204

I agree the path = NULL case shouldn't be allowed for getfh, so I moved the logic into sys_getfhat.

4271

I've added a helper function, the code was not _exactly_ the same so I had to change the tag/goto into a do while/EAGAIN, also has to add a locked variable. If these changes are ok by you I think it's much cleaner now.

sys/kern/capabilities.conf
472

getfh is not enabled in cap mode, why do you enable getfhat ?

sys/kern/vfs_syscalls.c
113

Wrong indent for continuation line.

115

There too.

1513

Blank line is not needed.

1518

Blank line is not needed.

1524

See the comment about vn_XXX namespace.

1525

Wrong indentation for the continuation line

1538

This call locks vnodes internally. Your use of vn_linkat() while the vp is locked can cause lock recursion at least. Unfortunately WITNESS cannot detect the issues because vnode locks have same names.

2533

vn_XXX namespace is for functions in vfs_vnops.c. It is somewhat strange to use it for vfs_syscalls.c

2534

Again indent.

2540

I recomment to put an assert that vp is locked there.

It seems that you moved vput(vp) out of the helper's code to both callers. Is this done to make the vn_readlink() interface consistent (locked vnode in/locked vnode out) ?

2542

No need to put {} around single line.

4283

Stray blank line, more such lines below.

4316

Remove all blank lines from the function starting with this one, they are not needed/allowed by style.

lib/libc/sys/fhlink.2
42 ↗(On Diff #51359)

s/link/fhlink/ ?

lib/libc/sys/getfh.2
190

I'd use "points to an invalid address." here.

jack_gandi.net added inline comments.
sys/kern/vfs_syscalls.c
2533

I wasn't sure about the naming, what would you recommend?

jack_gandi.net added inline comments.
sys/kern/vfs_syscalls.c
2540

yes, I thought it would be cleaner to let the caller of vn_readlink manage the ressource

jack_gandi.net added inline comments.
sys/kern/vfs_syscalls.c
2540

It seems that mac_vnode_check_readlink() already performs an ASSERT_VOP_LOCKED() within.

sys/kern/vfs_syscalls.c
2533

kern_readlink_vp() for instance

2540

mac_ stuff is conditional. Adding asserts at the start of the functions document their pre-conditions. We do not rely on asserts in the VOP_READLINK() wrapper, for similar reasons.

jack_gandi.net added inline comments.
sys/kern/vfs_syscalls.c
1538

we perform a vput() before the first EAGAIN, the second EAGAIN will never be reached because the locked case will match the previous else is. so all shold be fine? maybe i could add a comment

jack_gandi.net edited the test plan for this revision. (Show Details)

temporary pathc, I'm still looking into the lock recursion that @kib has mentionned.

I was able to reproduce the lock recursion in the case where the source fhp points to the same object as the destination link. It turns ou the link(2) doesn't lock the source node which seems acceptable. So I changed fhlinkat_kern to only LK_SHARED on the source node, therefore the kern_linkat_vp doesn't need the bool locked anymore because all the callers have the same input locking state.

I was able to reproduce the lock recursion in the case where the source fhp points to the same object as the destination link. It turns ou the link(2) doesn't lock the source node which seems acceptable. So I changed fhlinkat_kern to only LK_SHARED on the source node, therefore the kern_linkat_vp doesn't need the bool locked anymore because all the callers have the same input locking state.

LK_SHARED is still locking the vnode, although in the shared mode. So from the brief look at the updated patch, you are leaking the vnode locks.

Do you run the tests with WITNESS + DEBUG_VFS_LOCKS kernel options ?

In D18359#391303, @kib wrote:

I was able to reproduce the lock recursion in the case where the source fhp points to the same object as the destination link. It turns ou the link(2) doesn't lock the source node which seems acceptable. So I changed fhlinkat_kern to only LK_SHARED on the source node, therefore the kern_linkat_vp doesn't need the bool locked anymore because all the callers have the same input locking state.

LK_SHARED is still locking the vnode, although in the shared mode. So from the brief look at the updated patch, you are leaking the vnode locks.

Do you run the tests with WITNESS + DEBUG_VFS_LOCKS kernel options ?

I wasn't aware of these options nor of potential leaks, mea culpa. I'll look into all this and upload a new patch, but maybe after the weekend.

No more locks, no more leaks as far as I can tell

sys/kern/vfs_syscalls.c
1536

I do not see how can you claim that there is no longer the lock order/lock recursion issue. namei() locks the ni_vp vnode internally, and if LOCK_LEAF flag is not provided, it unlocks the vnode before returning. You simply cannot call namei() while any vnode is locked, it is very basic rule of our VFS.

4317

Why exclusive lock ?

jack_gandi.net added inline comments.
sys/kern/vfs_syscalls.c
1536

linkat doesn't LOCK_LEAF on the source node either so I've just added a VOP_UNLOCK as I haven't found a way to get VFS_FHTOVP to unlock before returning, the documenation is outdated and doesn't even show the flags argument to the macro.

4317

you're right, readlinkat(2) does LOCKSHARED . behaviour should be the same with`fhreadlinkat`

This revision is now accepted and ready to land.Dec 4 2018, 3:40 PM

Peter. could you please take the diff and run it through stress2 ?

(You need to rebuild syscall tables in sys/kern and sys/compat/freebsd32 after applying the patch: make sysent).

lib/libc/sys/getfh.2
149

This looks incomplete.

198

Missing a period.

This revision now requires review to proceed.Dec 5 2018, 9:30 AM

I have run all of the stress2 tests with D18359.id51596.diff without seeing any problems.

lib/libc/sys/fhlink.2
2 ↗(On Diff #51596)

I am sure that UCB has nothing to do with the new file. Please update copyright with the proper reference (to your personally or to Gandi). Same for fhreadlink.

BTW, preferred license for the new files is two clause BSD, see share/examples/etc/bsd-style-copyright. Please consider using it.

jack_gandi.net marked an inline comment as done.
lib/libc/sys/getfh.2
3

You cannot remove other people license. You can add your copyright notice since you added substantial new content, but since the the original content is there, you must not touch the UCB statement.

Look at other files how multiple copyright notices are handled.

This revision was not accepted when it landed; it landed in state Needs Review.Dec 7 2018, 3:18 PM
This revision was automatically updated to reflect the committed changes.