Starting with Linux kernel v5.10 we have added support to mark a range of file descriptors as close-on-exec without actually closing them. This is useful in general and has been on the wishlist of some people for quite some time.
I'd like to keep the close_range() implementation on FreeBSD and Linux as close together as possible so the name of the new flag CLOSE_RANGE_CLOEXEC and it's value (1U << 2) is identical to that used on Linux.
Details
I probably need some initial guidance on how to add tests where.
Diff Detail
- Repository
- rS FreeBSD src repository - subversion
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
sys/sys/unistd.h | ||
---|---|---|
164 | I'm not sure where this flag should best be defined. :) |
sys/sys/unistd.h | ||
---|---|---|
164 | Somehow me adding this comment has caused the revision to be abandoned which is a bit surprising. Just note that I didn't intend this. :) |
The test situation here is kinda ~meh; we have a blend of ATF/TAP tests. This in particular should be added to the closefrom TAP test @ ^/tests/sys/file/closefrom_test.c.
sys/kern/kern_descrip.c | ||
---|---|---|
1361 | It's tempting to say we should go ahead and just inline the above kern_cloexec() into kern_close_range; this SLOCK and the paired SUNLOCK upon exit can be x{,un}lock instead if (flags & CLOSE_RANGE_CLOEXEC) != 0, then the loop just needs minor restructuring: fdp->fd_ofiles[fd] is already the filedescent you need to operate on and you already have the xlock for it to boot. | |
1410 | Style nit: this and the above conditional in kern_close_range() should be explicit comparisons against 0 | |
sys/sys/unistd.h | ||
164 | Odd! This seems to be the ideal spot for the definition to go. |
sys/kern/kern_descrip.c | ||
---|---|---|
1342 | kern_* namespace is used for syscall helpers, namely to the functions that implement syscalls after ABI details are handled. But I am not sure why do you need this function at all, I agree with kevans. | |
sys/sys/unistd.h | ||
164 | It is weird to place CLOSE_RANGE_CLOEXEC define in the middle of _PC_ series. I would place it right before #endif /* __BSD_VISIBLE */ line at the end of the file. |
The current close range code needs some cleanup regardless of this change. I think it would be most time efficient if exact semantics got documented and then someone from FreeBSD side implemented it (I'm happy to do it). Unless you really want to spend time on this, in which case I'm happy to review.
I'm happy to do it. I'm likely to contribute patches in the future so it'd be pretty helpful to get some experience writing patches for FreeBSD.
Glad to hear it. In that case I'll comment on the patch.
The current close_range suffers from a lot of avoidable filedesc relocking, I have a patch to fix it. Even then actual close drops the filedesc lock, so some degree of relock dance will still be there.
In contrast the feature you are implementing here does not need any of it. It can just take the filedesc lock upfront and have a for loop over the content. You can branch on it after the range got determined. iow in my opinion this just have a for loop doing fdeget_locked. I don't know the Linux code, it may be divorcing the 2 loops is very different from what was done there, but will very likely be the easiest way forward including any other future ops.
FWIW, I think I found the first candidate we should smack with this flag:
49858 tshark 2.915694 CALL fcntl(0x3526,F_SETFD,FD_CLOEXEC) 49858 tshark 2.915727 RET fcntl -1 errno 9 Bad file descriptor 49858 tshark 2.915760 CALL fcntl(0x3527,F_SETFD,FD_CLOEXEC) 49858 tshark 2.915794 RET fcntl -1 errno 9 Bad file descriptor 49858 tshark 2.915828 CALL fcntl(0x3528,F_SETFD,FD_CLOEXEC) 49858 tshark 2.915861 RET fcntl -1 errno 9 Bad file descriptor 49858 tshark 2.915894 CALL fcntl(0x3529,F_SETFD,FD_CLOEXEC) 49858 tshark 2.915928 RET fcntl -1 errno 9 Bad file descriptor 49858 tshark 2.915962 CALL fcntl(0x352a,F_SETFD,FD_CLOEXEC) 49858 tshark 2.915995 RET fcntl -1 errno 9 Bad file descriptor 49859 tshark 2.914993 CALL fcntl(0x3476,F_SETFD,FD_CLOEXEC) 49859 tshark 2.916066 RET fcntl -1 errno 9 Bad file descriptor 49858 tshark 2.916029 CALL fcntl(0x352b,F_SETFD,FD_CLOEXEC) 49858 tshark 2.916134 RET fcntl -1 errno 9 Bad file descriptor 49858 tshark 2.916167 CALL fcntl(0x352c,F_SETFD,FD_CLOEXEC) 49858 tshark 2.916201 RET fcntl -1 errno 9 Bad file descriptor