Page MenuHomeFreeBSD

vfs: add kernel-side realpathat
ClosedPublic

Authored by mjg on Feb 7 2020, 7:05 PM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Jan 3, 3:06 PM
Unknown Object (File)
Sun, Dec 29, 11:47 PM
Unknown Object (File)
Wed, Dec 18, 2:44 PM
Unknown Object (File)
Sat, Dec 14, 2:12 AM
Unknown Object (File)
Tue, Dec 10, 6:28 PM
Unknown Object (File)
Dec 4 2024, 7:42 PM
Unknown Object (File)
Nov 23 2024, 9:44 PM
Unknown Object (File)
Nov 20 2024, 4:38 AM

Details

Summary

realpath(3) is a major source of fstatat calls e.g., during buildkernel (used heavily by clang).

Moving realpath into the kernel removes most stats and getcwd calls (see below for example test result).

This works by performing a regular lookup while saving the name and found parent directory. If the terminal vnode is a directory we can resolve it using usual means. Otherwise we can use the name saved by lookup and resolve the parent.

The patch is almost complete (may use some comments). truss correctly dumps data, but ktrace only dumps one buffer and not the other one. I have not checked yet how to sort that out, but I don't think it should be a blocker.

before:

__getcwd                                                      66065
mprotect                                                      72233
open                                                          94032
sigaction                                                    111967
write                                                        133762
pread                                                        195676
openat                                                       281137
fstat                                                        318231
close                                                        325406
sigprocmask                                                  718695
fstatat                                                     1423441
mmap                                                        3259492

after:

mprotect                                                      72233
open                                                          94032
sigaction                                                    111967
write                                                        133761
fstatat                                                      148846
pread                                                        195676
__realpath                                                   225900
openat                                                       281137
fstat                                                        318231
close                                                        325404
sigprocmask                                                  718699
mmap                                                        3260705

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

lib/libc/sys/Symbol.map
405 ↗(On Diff #67952)

Does this actually need to be exported?

sys/kern/syscalls.master
3218 ↗(On Diff #67952)
_Out_writes_z_(size) char *buf,
  • fix syscalls.master entry
  • few nits
lib/libc/sys/Symbol.map
405 ↗(On Diff #67952)

I'm just following __getcwd. I have no opinion otherwise.

lib/libc/sys/Symbol.map
405 ↗(On Diff #67952)

The __getcwd export was probably a mistake (we exported WAY too much stuff in the first cut and that wasn't fixed...)

lib/libc/sys/Symbol.map
405 ↗(On Diff #67952)

I support this, sys__realpath() should be enough for libc use.

sys/kern/vfs_cache.c
2230 ↗(On Diff #67956)

This blank line is excessive.

2241 ↗(On Diff #67956)

And this too.

2375 ↗(On Diff #67956)

The function lacks a herald comment explaining it. Asserts are good but not enough.

2475 ↗(On Diff #67956)

Again no herald comment.

2479 ↗(On Diff #67956)

Make slash_prefixed bool, in all places ?

2485 ↗(On Diff #67956)

Don't you need to check that there is no underflow ? If you rely on the caller providing enough initial length, then this should be asserted instead.

  • unexport __realpath
  • convert slash_prexied to bool
  • add more comments
  • validate length in vn_fullpath_any
  • slash_prefixed and whitespace cosmetics
  • remove wrong Symbol.map entry

For some reason libc started to fail to compile for me (regardless of the state of Symbol.map):

ld: error: can't create dynamic relocation R_X86_64_32 against symbol: SYS___realpath in reado
nly segment; recompile object files with -fPIC or pass '-Wl,-z,notext' to allow text relocatio
ns in the output
>>> defined in __realpath.pico
>>> referenced by __realpath.S:4
>>>               __realpath.pico:(___realpath)
cc: error: linker command failed with exit code 1 (use -v to see invocation)
*** [libc.so.7.full] Error code 1

any idea what's going on here? I compared the entries with older syscalls and I don't see any differences of note. i did make sysent on top of /usr/src.

  • only set freebuf after success to match other routines
  • handle rtld-elf
  • add __getosreldate check as a temporary aid for upgrades. the value in there is to be changed

this builds and works fine for me

Can I get further review on this?

I verified that former vn_fullpath1 consumers which now call vn_fullpath_any/dir get the same results (both the content of the buffer and length returned).

realpath tests in kyua trivialy pass. one concerns the userspace implementation and is partially moot, the other a case of an empty symlink which is handled in the expected manner by lookup itself.

What I verified by hand and could be added to kyua is that hardlinks are properly handled (as in, realpath to foo is not the same as realpath to bar, even if they are the same vnode).

How jails and chroots are handled ?

lib/libc/stdlib/realpath.c
229 ↗(On Diff #68375)

You do remember about this place. Not sure if at_bsdflags would be easier.

lib/libc/sys/Symbol.map
831 ↗(On Diff #68375)

so you are still exporting this

Same way as in the code prior to the patch -- routines resolve up to a point of reaching the passed 'rdir' vnode and I'm passing the same stuff getcwd does.

lib/libc/stdlib/realpath.c
229 ↗(On Diff #68375)

This is at temporary aid for upgrading which I'm going to remove in few weeks. I had no difficulty remembering to remove equivalent compat for F_ISUNIONSTACK so I think we are fine here..

lib/libc/sys/Symbol.map
831 ↗(On Diff #68375)

I moved it to FBSDprivate_1.0 just like __sys_sigfastblock, I thought this counts as not exporting.

lib/libc/sys/Symbol.map
831 ↗(On Diff #68375)

if this is invalid what should be done here? I want to move forward with this review as it impacts some other pending work

lib/libc/sys/Symbol.map
831 ↗(On Diff #68375)

Just don't mention in in Symbol.map. Nothing outside libc needs to know it exists.

lib/libc/sys/Symbol.map
831 ↗(On Diff #68375)

I noted rtld-elf slurps in realpath(3) and consequently the new syscall

831 ↗(On Diff #68375)

... which given the sigfastblock entry I presume makes it mandatory to note here

lib/libc/sys/Symbol.map
831 ↗(On Diff #68375)

I don't think so. Rtld-elf is a "static" link. I believe __sys_sigfastblock is exported because libthr uses it.

lib/libc/sys/Symbol.map
831 ↗(On Diff #68375)

This is right, libthr uses the syscall.

sys/kern/vfs_cache.c
2231 ↗(On Diff #68375)

I do not see a reason to not make the syscall*at()-aware from the start.

2567 ↗(On Diff #68375)

So if the last component is symlink, this code just leaves the symlink in the result ? I do not think it is a right behavior for realpath(2).

You might want to add AT_SYMLINK_NOFOLLOW flag, because I can see that both cases make sense for syscall.

mjg retitled this revision from vfs: add realpath(2) to vfs: add kernel-side realpathat.
  • rename to realpathat
mjg marked an inline comment as done.Feb 20 2020, 3:09 AM
mjg added inline comments.
sys/kern/vfs_cache.c
2567 ↗(On Diff #68375)

The last component can't be a symlink because they are always resolved and if resolving fails, lookup also fails. I don't think adding the specific flag is of much use at this stage, there is the flags argument hardwired to 0 so this can always be extended later.

Ok, lets postpone AT_NOFOLLOW.

I think this is fine, and I remembered something that might make this even more useful. It may be, if AT_BENEATH plus allowing this syscall under cap mode with relative paths (with relative against dirfd result). I added Ryan who could comment more.

This revision is now accepted and ready to land.Feb 20 2020, 1:30 PM
This revision was automatically updated to reflect the committed changes.

The flags argument is there specifically so these kind of extensions are possible without adding new syscalls. The only consumer does not use them and I think adding capability-aware realpath is a separate effort. Since this partially blocks other work I just went ahead with the commit.