To ease portability with macOS and NetBSD, add F_GETPATH, which uses the newly introduced F_KINFO code path but returns only one field.
lib/libc/sys/fcntl.2 | ||
---|---|---|
31 | Bump date | |
215 | I think it is better to phrase it along the lines of 'Buffer size must be at least PATH_MAX bytes'. Also need to say that arg is char *. | |
lib/libc/sys/fcntl.c | ||
53 | What is the point of this rearrangement? | |
73 | This is rather pointless, but see below. | |
sys/kern/kern_descrip.c | ||
860 ↗ | (On Diff #100276) | I do not think that we need to support this in kernel. Make libc wrapper handle it, by calling F_KINFO and then copying kf_path to the user buffer. |
sys/sys/fcntl.h | ||
274 | <TAB> after #define |
As I explained elsewhere, F_GETPATH is a questionable idea: it can give unexpected results in the presence of hardlinks and may fail to produce anything at all.
However, if it is to be added, it should look roughly like what I implemented years ago, rebased now: https://people.freebsd.org/~mjg/F_GETPATH.diff
F_KINFO performs tons of work not needed here, for example VOP_GETATTR.
Really, no. It would only work for vnodes, and I suspect that if this functionality is useful, one of its 'big' uses would be non-vnode file descriptors, like pts and shm.
F_KINFO performs tons of work not needed here, for example VOP_GETATTR.
Which does not matter, arguably, because it is:
- compat with other BSDs (but this should be articulated in the man page, indeed)
- pointless to use on the fast path, at least I cannot imagine how it could be used except for some management or user presentation parts.
lib/libc/sys/fcntl.c | ||
---|---|---|
52 | Still, why reordering? | |
58 | char *buf = | |
61 | We usually call this variable error. Also, it might be cleaner to just call fcntl(fd, F_KINFO, &kif). I am not sure. | |
66 | error = fcntl(fd, F_KINFO, &kif); if (error == 0) strcpy(); return (error); |
lib/libc/sys/fcntl.2 | ||
---|---|---|
31 | Still not done. | |
219 | I believe that points that mjg made should be handled somehow. In particular, man page should note that:
The second point perhaps could be made in F_KINFO description. |
See below. I also have to note both Darwin and NetBSD explicitly only support vnodes.
F_KINFO performs tons of work not needed here, for example VOP_GETATTR.
Which does not matter, arguably, because it is:
- compat with other BSDs (but this should be articulated in the man page, indeed)
- pointless to use on the fast path, at least I cannot imagine how it could be used except for some management or user presentation parts.
As I pointed out last time F_GETPATH came up, it is used by clang if available instead of realpath (and it is being used a lot). Usage was implemented with Darwin in mind:
#if defined(F_GETPATH)
  // When F_GETPATH is available, it is the quickest way to get
  // the real path name.
  char Buffer[PATH_MAX];
  if (::fcntl(ResultFD, F_GETPATH, Buffer) != -1)
    RealPath->append(Buffer, Buffer + strlen(Buffer));
Tracing this with:
dtrace -n 'fbt::sys___realpathat:entry { self->buf = args[1]->buf; }
    fbt::sys___realpathat:return /self->buf/ {
        @[copyinstr((uintptr_t)self->buf)] = count(); self->buf = 0; }'
while running buildkernel produces:
[snip]
/usr/src/sys/sys/_timespec.h 5927
/usr/src/sys/sys/_timeval.h 5927
/usr/src/sys/sys/_types.h 5927
/usr/src/sys/sys/select.h 5927
/usr/src/sys/sys/timespec.h 5927
/usr/src/sys/sys/types.h 5927
/usr/src/sys/x86/include/_limits.h 5927
/usr/src/sys/x86/include/_types.h 5927
/usr/src/sys/x86/include/endian.h 5927
/usr/src/sys/sys/cdefs.h 6030
/usr/obj/usr/src/amd64.amd64/sys/GENERIC/opt_global.h 6047
/usr/bin/cc 12126
This scales reasonably well with realpath (and can be further improved to scale perfectly). F_KINFO has similar work to do *and* adds more atomics, including locking/unlocking the target because of VOP_GETATTR. In other words, it is going to regress performance to some extent. In contrast, just resolving the path name with F_GETPATH is going to work without any scalability hindrance as long as the passed fd is not shared.
Ultimately this is not BSD compat, but a feature provided by Darwin and probably used in more cases than just clang as a cheap alternative to realpath (except with the proposed patch it would be more expensive).
So I had another look and I'm confident the hardlink situation would pose a security problem: by default unprivileged users can hardlink to anything they want. If a privileged program uses the F_GETPATH result, it may end up falling victim to TOCTOU, as the attacker may be able to get their path returned and then replace it with something else before it is accessed.
That said, until someone implements tracking of the path used to open the file, this looks like a can of worms.
So how is this relevant? You are objecting to buggy use of an API that, by design, has such a problem. It does not matter whether we are ever consistent with hardlinks or not.
Also, I do not understand what you mean by saying that an unprivileged user can hardlink anything he wants. The rules are known, and a user can hardlink anything anywhere when the rules allow it, i.e. at least the target directory must be writable. If this is broken, it is often trivial play with libraries that already owns everything. Basically this is why hardlink_check_{u/g}id are not enabled by default: they add nothing.
I also looked at NetBSD and tried to understand what MacOSX does. For NetBSD, the text in the man page https://man.netbsd.org/fcntl.2 definitely sounds like an excuse and not like a specification. They would implement it better if they could. Also they do not consider 'security' with hardlink non-canonical.
For MacOSX, I do not see any special note about F_GETPATH, they just claim that it returns the path. There is no note about hardlinks (not sure whether this is an omission, a consequence of the different namecache design, or the fact that HFS/AFS really do not support hardlinks properly). I do not want to dig into Darwin sources.
So the only objection you have that I have some agreement with is that llvm openFileForRead() _might_ be slowed down. I just looked at the code and it does an unconditional access(2) on /proc/self/fd/%d before doing realpath, so I am even skeptical that F_KINFO would be problematic. But if it is, why did you not bother clearing out the /proc use for FreeBSD and unconditionally falling back to realpath?
That said, until someone implements tracking of the path used to open the file, this looks like a can of worms.
As explained above, I do not agree (and NetBSD does not agree either).
It is not by design, it is an artifact of how name caching has been originally implemented in BSDs. Linux has dentry cache which tracks what you opened and can always find the right link, even across renames making /proc/pid/fd/$fd reliable. DragonflyBSD is doing a similar thing and their F_GETPATH also does not suffer the problem.
Also, I do not understand what you mean by saying that an unprivileged user can hardlink anything he wants. The rules are known, and a user can hardlink anything anywhere when the rules allow it, i.e. at least the target directory must be writable. If this is broken, it is often trivial play with libraries that already owns everything. Basically this is why hardlink_check_{u/g}id are not enabled by default: they add nothing.
The question is why would you want to call F_GETPATH in the first place and if that is safe to do given the hardlink problem.
So here is an example scenario:
$ ln $ROOT_OWNED_755_DIR/$ROOT_OWNED_600_FILE $BADUSER/foo
Now a privileged program opens $ROOT_OWNED_755_DIR/$ROOT_OWNED_600_FILE, tinkers with it, calls F_GETPATH and gets $BADUSER/foo. This won't be an expected result for people using the feature, and there is nothing said program can do to prevent the problem. What safe things can they possibly do in the face of such a result?
People who hear about the feature think it is an equivalent to readlink on /proc/self/fd/ on Linux, but given the above, it clearly is not.
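To make the scenario concrete: after the ln above, both names refer to the same inode, so nothing in the file itself records which name was used to open it. The sketch below (helper name fd_has_other_names is hypothetical) only reports whether the file behind a descriptor currently has more than one link. It is an illustration for regular files, not a defence, since a link can be added right after the check.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper: 1 if the file behind fd currently has more than
 * one name, 0 if it has at most one, -1 if fstat(2) fails. The answer is
 * a snapshot and may be stale as soon as it is returned.
 */
int
fd_has_other_names(int fd)
{
	struct stat sb;

	if (fstat(fd, &sb) == -1)
		return (-1);
	return (sb.st_nlink > 1);
}
```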
I also looked at NetBSD and tried to understand what MacOSX does. For NetBSD, the text in the man page https://man.netbsd.org/fcntl.2 definitely sounds like an excuse and not like a specification. They would implement it better if they could. Also they do not consider 'security' with hardlink non-canonical.
For MacOSX, I do not see any special note about F_GETPATH, they just claim that it returns the path. There is no note about hardlinks (not sure whether this is an omission, a consequence of the different namecache design, or the fact that HFS/AFS really do not support hardlinks properly). I do not want to dig into Darwin sources.
I don't know what they are doing there, I strongly suspect either hardlinking of the sort is disallowed or the issue was not considered to begin with.
So the only objection you have that I have some agreement with is that llvm openFileForRead() _might_ be slowed down. I just looked at the code and it does an unconditional access(2) on /proc/self/fd/%d before doing realpath, so I am even skeptical that F_KINFO would be problematic. But if it is, why did you not bother clearing out the /proc use for FreeBSD and unconditionally falling back to realpath?
Clang is doing tons of Linux-specific lookups on FreeBSD and cleaning that up is on my TODO list. It's also not an argument here: should this get cleaned up, the extra overhead from an F_KINFO-based implementation will remain.
That said, until someone implements tracking of the path used to open the file, this looks like a can of worms.
As explained above, I do not agree (and NetBSD does not agree either).
NetBSD not agreeing is not an argument.
Of course it is by design. The answer provided by any such call, be it F_GETPATH, /proc/self/fd/N, or F_KINFO, is outdated right at the moment it is calculated. The wrong link is only a small detail in this whole picture.
I am aware of only one case where the right hard link is indeed important, and there it is somewhat unrelated to the discussion. It is AT_EXECPATH/kern.proc.pathname, where the link is in fact used to determine the behavior of a multi-named binary. This case certainly has nothing to do with F_GETPATH.
Also, I do not understand what you mean by saying that an unprivileged user can hardlink anything he wants. The rules are known, and a user can hardlink anything anywhere when the rules allow it, i.e. at least the target directory must be writable. If this is broken, it is often trivial play with libraries that already owns everything. Basically this is why hardlink_check_{u/g}id are not enabled by default: they add nothing.
The question is why would you want to call F_GETPATH in the first place and if that is safe to do given the hardlink problem.
So here is an example scenario:
$ ln $ROOT_OWNED_755_DIR/$ROOT_OWNED_600_FILE $BADUSER/foo
Now a privileged program opens $ROOT_OWNED_755_DIR/$ROOT_OWNED_600_FILE, tinkers with it, calls F_GETPATH and gets $BADUSER/foo. This won't be an expected result for people using the feature, and there is nothing said program can do to prevent the problem. What safe things can they possibly do in the face of such a result?
What do they intend to do at all that needs F_GETPATH? And how do they intend to validate the F_GETPATH result anyway?
For instance, rtld working with DT_NEEDED/DT_SONAME rechecks for (dev_t, ino_t) of any opened dso. Even if it (tried to) use F_GETPATH to re-resolve dso full path from opened file descriptor, e.g. for fdlopen(3), it has to validate the answer somehow.
NetBSD not agreeing is not an argument.
This discussion contains no technical arguments, only opinions, and NetBSD's opinion carries a lot of weight in it.
With that kind of approach any path-based argument is already useless.
If the user does not mess with their directory tree (nor does root), it stands to reason the path names are stable. A user who can't modify the path name and yet can influence the outcome of F_GETPATH *is* a security threat.
I am aware of only one case where the right hard link is indeed important, and there it is somewhat unrelated to the discussion. It is AT_EXECPATH/kern.proc.pathname, where the link is in fact used to determine the behavior of a multi-named binary. This case certainly has nothing to do with F_GETPATH.
Also, I do not understand what you mean by saying that an unprivileged user can hardlink anything he wants. The rules are known, and a user can hardlink anything anywhere when the rules allow it, i.e. at least the target directory must be writable. If this is broken, it is often trivial play with libraries that already owns everything. Basically this is why hardlink_check_{u/g}id are not enabled by default: they add nothing.
So for example clang is doing F_GETPATH on various files as it is considered to be faster than realpath; most notably it is resolving headers from /usr/include. If an unprivileged user is allowed to hardlink these, clang can get bogus results here. I don't know if this on its own already causes a significant problem, but it does illustrate what I mean.
I had someone using OS X check and that system prevents hardlinking to files owned by other people. I don't know how that came to be, but the end result is that the attack vector I mentioned here is not a problem on their platform.