Page MenuHomeFreeBSD

Implement fdescfs for Linuxulator
ClosedPublic

Authored by dchagin on Jul 2 2017, 7:22 PM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Oct 22, 3:19 AM
Unknown Object (File)
Sun, Oct 19, 7:49 AM
Unknown Object (File)
Sat, Oct 18, 7:17 AM
Unknown Object (File)
Mon, Oct 13, 4:52 PM
Unknown Object (File)
Thu, Oct 9, 4:06 PM
Unknown Object (File)
Sun, Oct 5, 11:07 PM
Unknown Object (File)
Sun, Oct 5, 9:15 PM
Unknown Object (File)
Thu, Oct 2, 10:55 AM
Subscribers
None

Details

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 10740
Build 11136: arc lint + arc unit

Event Timeline

The main question is perhaps asked in the UCB copyright comment.

sys/compat/linfdescfs/linfdesc.h
6 ↗(On Diff #30327)

Was it ? Also te UCB copyright assignment looks doubtful.

So globally this looks like a copy/paste of fdescfs(5). Why it is needed ? Please explain where the behavior of the new fs differs from native fdescfs. Then I will ask another question.

37 ↗(On Diff #30327)

Why this protection is needed ?

Also, there is no double-include protection.

sys/compat/linfdescfs/linfdesc_vfsops.c
72 ↗(On Diff #30327)

Blank line after { is needed.

return (kernel_mount());
101 ↗(On Diff #30327)

Why this cast is needed ?

125 ↗(On Diff #30327)

Remove init from declaration, collapse two int vars declarations into single.

151 ↗(On Diff #30327)

What is the purpose of taking the lock there ?

239 ↗(On Diff #30327)

Why this dependency is needed ?

sys/compat/linfdescfs/linfdesc_vnops.c
374 ↗(On Diff #30327)

Can you explain this ?

Fixed most of the comments.

Reply to the main question. Yes, this code copied from FreeBSD fdescfs code.
Unlike FreeBSD, the Linux fdesc fs is a directory containing a symbolic links to the actual files, which the process has open.
A readlink(2) call on this file returns a full path in case of regular file or a string in a special format (type:[inode], anon_inode:<file-type>, etc..)
As well as in a FreeBSD, opening the file in the Linux fdescfs directory is equivalent to duplicating the corresponding file descriptor.

Here we have mutually exclusive requirement:
in case of readlink(2) Linux fdescfsw lookup() method should return VLNK vnode otherwise our kern_readlink() fail with EINVAL error.
In other calls Linux fdescfsw lookup() method should return non VLNK vnode.

For that I added new Linux thread flag LINUX_TD_READLINK, set it before calling kern_readlinkat() and in lookup() mthod return VLNK vnode if this flag is set.

dchagin added inline comments.
sys/compat/linfdescfs/linfdesc.h
6 ↗(On Diff #30327)

Replaced by my copyright, is this good?

Fixed most of the comments.

Reply to the main question. Yes, this code copied from FreeBSD fdescfs code.
Unlike FreeBSD, the Linux fdesc fs is a directory containing a symbolic links to the actual files, which the process has open.
A readlink(2) call on this file returns a full path in case of regular file or a string in a special format (type:[inode], anon_inode:<file-type>, etc..)
As well as in a FreeBSD, opening the file in the Linux fdescfs directory is equivalent to duplicating the corresponding file descriptor.

Here we have mutually exclusive requirement:
in case of readlink(2) Linux fdescfsw lookup() method should return VLNK vnode otherwise our kern_readlink() fail with EINVAL error.
In other calls Linux fdescfsw lookup() method should return non VLNK vnode.

For that I added new Linux thread flag LINUX_TD_READLINK, set it before calling kern_readlinkat() and in lookup() mthod return VLNK vnode if this flag is set.

Why don't you add a mount option to fdescfs to cause its VOP_READLINK() to always return what linux ABI expects ? Such fdescfs would be mounted under /compat/linux/dev/fd.

That means that LINUX_TD_READLINK is not needed at all.

sys/compat/linfdescfs/linfdesc.h
6 ↗(On Diff #30327)

Probably no, you should added your copyright to the existing statement, but I think that this is moot.

Modify native fdescfs to emulate Linux(4) ABI compatibility instead of writing separate Linux fdescfs

Use v_vflag instead of v_vtype to store READLINK flag to avoid switch(v->v_vtype)

sys/fs/fdescfs/fdesc_vnops.c
371

fdesc_allocvp() is much more suitable place for the flag assignment.

You need to filter out Froot fntype when setting the flag, otherwise panic() in readlink would be called.

431

Why don't you use vnode v_vflag there ? It would avoid two pointers indirections. I.e. I would write this fragment as

vap->va_type = (vp->v_vflag & VV_LINRDLNK) != 0 ? VLNK : VCHR;
543

Same there.

587

Do not put initialization into declaration block.

596

fdesc_readlink

600

At this moment, the vn vnode might be reclaimed after which vn->v_data points to freed memory.

622

Use strlen().

sys/sys/vnode.h
253

I suggest the VV_READLINK name.

dchagin added inline comments.
sys/fs/fdescfs/fdesc_vnops.c
543

VV_READLINK only for Fdesc node

sys/fs/fdescfs/fdesc_vnops.c
543

I would prefer this assignment written with the ?:.

602

This cannot work as well. The process filedesc table is locked inside fget_cap(), but the thread already owns the vnode lock. So you get LOR there. Did you run this under WITNESS ?

Arguably, you need to read fd->fd_fd into a local variable and _then_ drop the vnode lock.

609

This vref() is useless, while fp is refed, vnode cannot be freed since fp also vrefed the vnode.

610

Again, vn_fullpath() cannot be called with a vnode locked, since it might need to lock some vnodes during the patch reconstruction, and this would break the vnode locks order.

ugh, drop vnode lock before fget_cap() and vn_fullpath()

sys/fs/fdescfs/fdesc_vnops.c
263

I still do not understand why don't you move this check and flag set to fdesc_allocvp().

Move VV_READLINK flag set to the fdesc_allocvp()

dchagin added inline comments.
sys/fs/fdescfs/fdesc_vnops.c
263

Mostly due to two ways of success out from fdesc_allocvp(), one of them I do not understand how to test

sys/fs/fdescfs/fdesc_vnops.c
245

I strongly prefer to have the vnode fully initialized before pushing it into the place where it can be found by other threads, even while it is locked.

There is clearly preferrable place where fs-private data for the vnode is constructed.

Fix second return from fdesc_allovp()

sys/fs/fdescfs/fdesc_vnops.c
245

No, please re-read what I wrote.

Move the flag setting to line 200 (patched), where all other initialization of the fs-specific data is performed.

ah, sorry, done. also fdesc_readdir() changed

This revision is now accepted and ready to land.Jul 30 2017, 5:20 PM