Allow rtld direct-exec to take a file descriptor.
ClosedPublic

Authored by jonathan on May 15 2017, 11:31 PM.

Details

Summary

When executing rtld directly, allow the value of argv[0] to specify a
file descriptor of the binary to execute. If this fails, fall back to
the previous behaviour of treating argv[0] as a path.

Given the newfound ability of rtld to be directly executed, this approach can be
used as an alternative to creating an ffexecve(2) system call.

This diff has two commits in it: one to rename parse_libdir->parse_integer
and the other to actually do the exec-an-fd work.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.
jonathan created this revision.May 15 2017, 11:31 PM
kib added a comment.May 16 2017, 8:08 AM

I do not like these argv0 tricks, I intend to implement normal options parsing for the direct mode. One of the options would take the file descriptor number and do what your trick does. I think that this is better than the trick since it allows to easily invoke and test the functionality from the shell.

That said, please commit the rename of parse_libdir now. This is the change which is definitely needed for the options parser.

libexec/rtld-elf/rtld.c
5258 ↗(On Diff #28388)

This message is lost. Wouldn't it make sense to put it in the dirfd < 0 action above ?

In D10751#222729, @kib wrote:

I do not like these argv0 tricks, I intend to implement normal options parsing for the direct mode. One of the options would take the file descriptor number and do what your trick does. I think that this is better than the trick since it allows to easily invoke and test the functionality from the shell.

You're right, that would be better. Were you thinking of getopt-style parsing (but without pulling in getopt) or environment variables (e.g., LD_BINARY_FD=xx)? The environment variable approach has the benefit of not requiring any new parsing code, but an explicit --fd xx will look more familiar. Then again, if we provide options people might come to expect --help and --usage options. :)

That said, please commit the rename of parse_libdir now. This is the change which is definitely needed for the options parser.

Ok, I'll do that this morning (currently building from a clean checkout, then I'll run the LD_LIBRARY_PATH_FDS test again).

libexec/rtld-elf/rtld.c
5258 ↗(On Diff #28388)

Yes, that would make sense... I'll update the review once I've committed the parse_integer change.

kib added a comment.May 16 2017, 12:49 PM

Were you thinking of getopt-style parsing (but without pulling in getopt) or environment variables (e.g., LD_BINARY_FD=xx)? The environment variable approach has the benefit of not requiring any new parsing code, but an explicit --fd xx will look more familiar. Then again, if we provide options people might come to expect --help and --usage options. :)

I intended to provide syntax like

ld-elf.so.1 [-ph] [-f fd] [--] binary args

-h outputs short usage summary, -p means search the $PATH paths for binary, -f fd starts binary pointed to by the filedescriptor fd, -- ends the options list.

Yes, I do not want to reference getopt(3) routine from libc, it would not work anyway, but very similar code is due.

For environment variables I agree that this is not very convenient, but my other concern is that somebody must clean the environment for the started image, since LD_BINARY_FD does not make a sense after ld.so acted, and even is detrimental. ld.so can do the censorship, but I prefer to keep the environment intact.

jonathan updated this revision to Diff 28416.May 16 2017, 1:46 PM
  • Add more explicit argument parsing to rtld.
  • Merge remote-tracking branch 'origin/master' into arcpatch-D10751
jonathan updated this revision to Diff 28417.May 16 2017, 1:47 PM
  • Remove a stray (debug) utrace(2) invocation.
jonathan updated this revision to Diff 28418.May 16 2017, 1:49 PM
jonathan marked 2 inline comments as done.
  • Apply a style(9) fix.
kib added a comment.May 16 2017, 2:23 PM

Two small notes, this is not a review. Mostly to inform you about D10750.

libexec/rtld-elf/rtld.c
429 ↗(On Diff #28418)

{} are not needed

5295 ↗(On Diff #28418)

For this, see D10750.

kib added a reviewer: emaste.May 16 2017, 2:24 PM
jonathan updated this revision to Diff 28438.May 16 2017, 10:55 PM
  • Merge 'origin/master' into arcpatch-D10751
  • Merge 'origin/master' into arcpatch-D10751

This merge pulls in the changes from D10750, dropping the now-redundant checks on executable permission bits.

jonathan updated this revision to Diff 28439.May 16 2017, 10:58 PM
  • Drop unnecessary defensive braces.
jonathan updated this revision to Diff 28441.May 17 2017, 12:02 AM
  • Fix argv/environ/aux shift calculations.
jonathan marked 2 inline comments as done.May 17 2017, 12:07 AM
kib added inline comments.May 17 2017, 1:29 PM
libexec/rtld-elf/rtld.c
5300 ↗(On Diff #28441)

This is bug, --help is matched as well.

5313 ↗(On Diff #28441)

This is bug, -fdull <n> is matched as well.

And, why do you want -fd be multi-char option, while other options are single-char ? -f <n> works fine, IMO.

5325 ↗(On Diff #28441)

I do not see why this check is needed. I understand why do you want to make sure that fd is a file descriptor at all, ie. the fstat(2) call above makes sense. But properties of the fd are not important, as far at it quacks like a suitable duck.

5337 ↗(On Diff #28441)

Why do you use strn* functions ? Their use for anything but d_name in V7 struct dirent is plain bug, like all uses in the patch.

kib added inline comments.May 17 2017, 3:36 PM
libexec/rtld-elf/rtld.c
5300 ↗(On Diff #28441)

Ok, I missed that the length is 3, so the nul symbol participates. But then use of strncmp is even less sensible.

jonathan updated this revision to Diff 28472.May 17 2017, 4:32 PM
jonathan marked 4 inline comments as done.
  • Address comments in kib's review.
jonathan marked an inline comment as done.EditedMay 17 2017, 4:36 PM

Ok, I think I've addressed all of these points now. I suppose that my use of strncmp comes from... my fingers just don't want to type functions that start with strc? :) Anyhow, I've changed all of the strncmp to strcmp and strnlen to strlen as you've asked.

kib added a comment.May 17 2017, 5:08 PM

Overall this looks good.

libexec/rtld-elf/rtld.c
5325 ↗(On Diff #28472)

Excessive () around arglen - 1 expression.

5340 ↗(On Diff #28472)

In fact this stat is not needed, because caller does stat anyway, to check permissions. But the error message at that stat call failure probably needs adjustments when we stating not opened argv0 but fd.

jonathan updated this revision to Diff 28484.May 17 2017, 9:02 PM
jonathan marked 2 inline comments as done.
  • Remove some superfluous parentheses.
  • Remove redundant fstat(2) call.
libexec/rtld-elf/rtld.c
5340 ↗(On Diff #28472)

Good point... how does the new message ("failed to FD <fd> for <path>") look? It can either mean "failed to stat this FD that I opened for <path>" or "failed to stat this FD that you gave me for <path>"... I think maybe it works either way?

kib added inline comments.May 17 2017, 9:10 PM
libexec/rtld-elf/rtld.c
5340 ↗(On Diff #28472)

It might be less efforts to formulate something sound, and less confusing to user, if you create a boolean to track the source of fd, was it opened or obtained as the argument. Set it together with open() call.

Then you can put specific message about fstat. Arguably strerror(errno) should be included into the message.

jonathan added inline comments.May 17 2017, 9:23 PM
libexec/rtld-elf/rtld.c
5340 ↗(On Diff #28472)

Sure, I can do that.

Also, it looks like I've been using tabs for all levels of indent rather than four spaces for second-level indents... I'll need to fix that too.

jonathan updated this revision to Diff 28486.May 17 2017, 9:28 PM
  • Add a missing rtld_strerror()
  • Convert some _rtld_error() calls to rtld_printf().
  • Use four spaces for second-level indent.
  • Be clear about user- vs path-derived descriptor.
jonathan marked 3 inline comments as done.May 17 2017, 9:30 PM
kib accepted this revision.May 17 2017, 9:37 PM

For new functions in rtld, I use normal style(9), so rtld slowly migrates to proper indentation. For smaller changes to existing functions I do follow existing style of 4 spaces indent/2 spaces continuation.

This revision is now accepted and ready to land.May 17 2017, 9:37 PM
In D10751#223329, @kib wrote:

For new functions in rtld, I use normal style(9), so rtld slowly migrates to proper indentation. For smaller changes to existing functions I do follow existing style of 4 spaces indent/2 spaces continuation.

Ah, I see... then I suppose I should put the tabs back before committing.

jonathan updated this revision to Diff 28490.May 17 2017, 10:35 PM
  • Restore tabs to print_args().
This revision now requires review to proceed.May 17 2017, 10:35 PM
This revision was automatically updated to reflect the committed changes.