Page MenuHomeFreeBSD

rtld: add rtld_{get,set}_var
ClosedPublic

Authored by kib on Oct 31 2024, 6:07 AM.
Tags
None
Referenced Files
F106609765: D47351.diff
Thu, Jan 2, 4:45 PM
Unknown Object (File)
Thu, Dec 26, 8:11 PM
Unknown Object (File)
Wed, Dec 25, 8:08 PM
Unknown Object (File)
Tue, Dec 24, 7:47 PM
Unknown Object (File)
Thu, Dec 19, 7:06 AM
Unknown Object (File)
Sat, Dec 14, 1:28 PM
Unknown Object (File)
Mon, Dec 9, 4:28 AM
Unknown Object (File)
Sat, Dec 7, 12:14 AM

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

kib requested review of this revision.Oct 31 2024, 6:07 AM
kib edited the summary of this revision. (Show Details)

Fix build.
Fix libdl.

brooks added inline comments.
lib/libdl/Symbol.map
25

Missing newline should be fixed.

libexec/rtld-elf/rtld.c
356

Given most of the second and third values are false, I wonder if it makes sense to do something like this and only spell out the exceptions

https://github.com/CTSRD-CHERI/cheribsd/blob/b2ad856aac657e26a9117ea6662289575e6eb514/bin/cheribsdtest/cheribsdtest.h#L171

You could then have the first two entries look like:

	LD_ENV_DESC(BIND_NOW),
	LD_ENV_DESC(PRELOAD, .unsecure = true),

It's a bit less function-like so maybe too ugly.

6364

Given that repeated updated would leak val allocators, do we want to prevent that from happening or just let it happen?

kib marked 3 inline comments as done.Oct 31 2024, 7:09 PM
kib added inline comments.
libexec/rtld-elf/rtld.c
356

IMO it is not worth the trickery there.
Might be we should switch to flags.

Or better, to 1-bit bools, as is used elsewhere in rtld already.

6364

I intended to leak.

But now I think that rtld needs to own the value, to allow the caller to be e.g. unloaded or make the passed value disappear in some way.

kib marked 2 inline comments as done.

Take ownership of the newly assigned ld control envs.

1 bit is enough for bools

This revision is now accepted and ready to land.Oct 31 2024, 7:20 PM
libexec/rtld-elf/rtld.c
353

This is fine, but using a bitfield doesn't actually change the size of the structure and makes manipulation more expensive.

kib marked an inline comment as done.Oct 31 2024, 8:17 PM
kib added inline comments.
libexec/rtld-elf/rtld.c
353

It should be same as flags.

kib marked an inline comment as done.

Add man page

This revision now requires review to proceed.Oct 31 2024, 8:17 PM

Cross-reference in rtld.1

Are there any security implications? Should we see if there's an opportunity in glibc/musl to provide the same interface?

Also check the URLs in your commits - I looked your repo and one referenced D4735 (missing final digit).

lib/libc/gen/rtld_get_effective_env_var.3
52 ↗(On Diff #145811)

Minor grammar edits. Also maybe cannot change them rather than access.

64 ↗(On Diff #145811)
89 ↗(On Diff #145811)
90 ↗(On Diff #145811)
98 ↗(On Diff #145811)

extraneous , ?

Just a few minor nits on the roff syntax reported by mandoc -Tlint. Thanks for tagging @manpages!

lib/libc/gen/rtld_get_effective_env_var.3
29 ↗(On Diff #145811)

Dt's are always all caps and if not produces a linter error.

32 ↗(On Diff #145811)

This is the spec, although I kinda don't like doing this because the page gets really cluttered, and apropos results become nonintuitive after a while.

45 ↗(On Diff #145811)

Although, you could also say:
Xr ld-elf.so.1 1
But I think the suggested is a little cleaner. Xrefs are broken if they dont have a space and a section number after them.

67 ↗(On Diff #145811)

Or Pq see Xr ld-elf.so.1 1

But since the page is called rtld(1), I think it reads cleaner to use the suggestion.

The Pq creates parenthesis. You can use that to write it really cleanly if the xref comes last in the parenthetical.

libexec/rtld-elf/rtld.1
534 ↗(On Diff #145811)

Since these both go to the same page, I'd recomend using only one of them.

This revision now requires changes to proceed.Nov 1 2024, 1:28 PM
kib marked 10 inline comments as done.Nov 1 2024, 3:45 PM

Are there any security implications?

I do not believe so. You need to have a code in the process to be able to call the functions, and then you might as well open-code what rtld does.
'It rather involved being on the other side of the airtight hatchway'.

Also, there is a check for setuid/setgid and then rtld refuses to change even white-listed variables. This might be an overkill.

Should we see if there's an opportunity in glibc/musl to provide the same interface?

I am not sure how to answer this.

Also check the URLs in your commits - I looked your repo and one referenced D4735 (missing final digit).

Fixed.

lib/libc/gen/rtld_get_effective_env_var.3
52 ↗(On Diff #145811)

Linker does not need to change env vars, it is code that is served by rtld which changes them in hope for rtld to access changed values and modify the behavior accordingly.

kib marked an inline comment as done.

Man page edits

Should we see if there's an opportunity in glibc/musl to provide the same interface?

Put another way, should we encourage other rtld implementations to adopt this interface (or simply make them aware of it)?

lib/libc/gen/rtld_get_effective_env_var.3
52 ↗(On Diff #145811)

It's the connection between "process environment variables" and "access" that gives me pause -- as written it seems to suggest that it is not possible to access environment variables after a process starts.

Possibly something like runtime linker settings are initialized based on environment variables at process start?

Manpage syntax LGTM. I'll leave everything else to SMEs.

This revision is now accepted and ready to land.Nov 1 2024, 4:29 PM

Should we see if there's an opportunity in glibc/musl to provide the same interface?

Put another way, should we encourage other rtld implementations to adopt this interface (or simply make them aware of it)?

Perhaps make them aware part, but I do not want to post to libc list with this.

lib/libc/gen/rtld_get_effective_env_var.3
52 ↗(On Diff #145811)

It's the connection between "process environment variables" and "access" that gives me pause -- as written it seems to suggest that it is not possible to access environment variables after a process starts.

Exactly. It is up to the userspace code to maintain the environment, and to select a way to do that. For kernel, environment exists only in the moment of execve(2). For rtld, it can get a guaranteed access to the environment (as passed by kernel) by looking at the psargs structure on stack.

But after runtime took over, environment is no longer in a place known to rtld. E.g. libc does not put modified environment back into the psarg AFAIR. Other language runtimes might do something very different, e.g. use native string representation. The only requirement for them is to supply proper env for execve(2).

Possibly something like runtime linker settings are initialized based on environment variables at process start?

Looks good. One final thought on the interface names - how about rtld_get_var and rtld_set_var? My thought is they're not quite env vars based on the discussion we had above and also the fact that the LD_ prefix is not part of the names. So what if we just consider them rtld variables initialized from the environment?

lib/libc/gen/rtld_get_effective_env_var.3
52 ↗(On Diff #145811)

I guess I read "process environment variables" as "variables that a process can read and write using getenv or setenv" and conflating that with psargs used to initialize the process' view of the environment variables. Writing that out I see how your text is correct.

This text solves the confusion I had, if you think something like this is useful:

The run-time linker is able to access the environment provided at process startup. After startup, environment variables are maintained by higher-level libraries and are not accessible by the run-time linker. Thus it is not possible to change linker behavior by setting environment variables.

(Perhaps omitting the final sentence.)

Use vararg trick for filling env var description array.
Allow dynamically tweaking LD_DEBUG.
Allow to unset a variable.

This revision now requires review to proceed.Nov 1 2024, 7:28 PM
kib retitled this revision from rtld: add rtld_{get,set}_effective_env_var to rtld: add rtld_{get,set}_var.

Add explanatory text about env vars into rtld.1.
Rename rtld_get_effective env_var to rtld_get_var, same with set.

vini.ipsmaker_gmail.com added inline comments.
libexec/rtld-elf/rtld.c
6364

FWIW, glibc's putenv() does not copy the string for the following reasons:

Since glibc 2.1.2, the glibc implementation conforms to SUSv2: the pointer string given to putenv() is used. In particular, this string becomes part of the environment; changing it later will change the environment. (Thus, it is an error to call putenv() with an automatic variable as the argument, then return from the calling function while string is still part of the environment.)

In a way, that is the current behavior for rtld already. In my project, I fork() and then clear the whole environment from the child by calling explicit_bzero() for all env vars. As a side-effect, the previous value for LD_LIBRARY_PATH_FDS no longer works.

That's just a little remark though. rtld_set_var() is a new API and I can make it work with my project for any choice here.