Changeset View
Standalone View
lib/libc/gen/posix_spawn.c
Show First 20 Lines • Show All 197 Lines • ▼ Show 20 Lines | |||||
do_posix_spawn(pid_t *pid, const char *path, | do_posix_spawn(pid_t *pid, const char *path, | ||||
const posix_spawn_file_actions_t *fa, | const posix_spawn_file_actions_t *fa, | ||||
const posix_spawnattr_t *sa, | const posix_spawnattr_t *sa, | ||||
char * const argv[], char * const envp[], int use_env_path) | char * const argv[], char * const envp[], int use_env_path) | ||||
{ | { | ||||
pid_t p; | pid_t p; | ||||
volatile int error = 0; | volatile int error = 0; | ||||
/* | |||||
* Passing RFSPAWN to rfork(2) gives us effectively a vfork that drops | |||||
* non-ignored signal handlers. We'll fall back to the slightly less | |||||
* ideal vfork(2) if we get an EINVAL from rfork. | |||||
kib: Perhaps explain that this happens on old kernels. | |||||
Done Inline ActionsYou do not need both defines. On amd64, there is so called red zone, an area in the stack below %rsp that compiler is free to use. Red zone size is 128 bytes, which leaves 192 - 128 = 64 bytes to use until formal overflow. IMO it is too optimistic. kib: You do not need both defines.
On amd64, there is so called red zone, an area in the stack… | |||||
Done Inline ActionsAh, that's good to know- 192 was a rough approximation based on measured-once... do_posix_spawn -> process_spawnattr -> (syscalls)/sigismember do_posix_spawn -> process_file_actions -> process_file_actions_entry -> (syscalls) So accounting for 3/4-deep, perhaps 768 as a conservative estimate? Even 1k wouldn't exactly hurt anything, that's 'small potatoes' as far as posix_spawn memory requirements go, I would think. kevans: Ah, that's good to know- 192 was a rough approximation based on measured-once... | |||||
Not Done Inline ActionsVarious exported functions such as _execve() are called, which might cause rtld and/or libthr to be involved and stack usage to go up greatly. In principle, it's possible to do the necessary work with only unexported aliases and/or do preparations like getenv("PATH") in the parent process. jilles: Various exported functions such as _execve() are called, which might cause rtld and/or libthr… | |||||
Done Inline ActionsHmm... at least most of this should be successfully avoiding libthr, as it's "in an undefined state after the vfork" according to it. What specifically would you propose here? Switching _execve over to the unwrapped version and using an even more conservative 1k/2k/4k for the stack size to catch any edge cases? kevans: Hmm... at least most of this should be successfully avoiding libthr, as it's "in an undefined… | |||||
Done Inline ActionsAlternatively, how bad of an idea would it be to fetch the stack pointer, subtract past the red zone, and use that for the rfork_thread stack? This is only x86 that has this problem, so it's really tempting. kevans: Alternatively, how bad of an idea would it be to fetch the stack pointer, subtract past the red… | |||||
Not Done Inline ActionsI would go forward with 2K or even 4K stack and claim that the issue is resolved. kib: I would go forward with 2K or even 4K stack and claim that the issue is resolved. | |||||
Not Done Inline ActionsWhat could be a problem is if the child is killed forcibly while holding a lock, since this would likely cause the parent to deadlock later. This could happen a signal is sent to the process group for which the parent has a handler that does little and the default action is to terminate the process. (Getting this to happen with SIGKILL would be very unlikely since it is hard to send a signal to the child process only.) With rtld, this can happen only with the first posix_spawn call using particular functionality in the process, and libthr seems to be successfully avoided. This seems rare enough not to block this review on it. jilles: What could be a problem is if the child is killed forcibly while holding a lock, since this… | |||||
*/ | |||||
p = rfork(RFSPAWN); | |||||
jillesUnsubmitted Done Inline ActionsI'm surprised that this works at all on (among others) amd64: the child pops the return address and overwrites it with new return addresses for calls it makes; how can the parent's return (the ret instruction in rfork.S from SYS.h) go to the right place? Note that vfork has a hand-written stub that deals with this problem. On architectures such as sparc64, arm and mips where subroutine return addresses are stored in a register and system call stubs do not use the stack, this should not be an issue. jilles: I'm surprised that this works at all on (among others) amd64: the child pops the return address… | |||||
kibUnsubmitted Done Inline ActionsIndeed, perhaps rfork_thread(3) should be used. I hope that it still works. kib: Indeed, perhaps rfork_thread(3) should be used. I hope that it still works. | |||||
kevansAuthorUnsubmitted Done Inline ActionsI'll double check my test system and rework this kevans: I'll double check my test system and rework this | |||||
jillesUnsubmitted Done Inline ActionsUsing rfork_thread(3) to implement something vfork-like does not seem a violation of its deprecation. The deprecation is about using rfork_thread(3) to create threads. rfork_thread(3)'s API is deficient in specifying the stack: it only takes an initial stack pointer, where I would expect a base and a size. Perhaps because of that or because it is deprecated, it is only available on i386 and amd64. Using a separate stack for the child process does avoid GCC bugs like https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21161. jilles: Using rfork_thread(3) to implement something vfork-like does not seem a violation of its… | |||||
kibUnsubmitted Done Inline ActionsBy 'hope' I mean that we do not corrupt userspace registers in the syscall for rfork_thread() to work. Note that fast syscall on amd64 does not restore all registers as an optimization. Fortunately, child return from fork sets PCB_FULL_IRET in cpu_fork(), so all registers are restored. From my reading of the rfork_thread() code, it does not matter for parent. But I did not verified it by a test. kib: By 'hope' I mean that we do not corrupt userspace registers in the syscall for rfork_thread()… | |||||
jillesUnsubmitted Done Inline Actionsrfork_thread(RFPROC | RFFDG | RFMEM, ...) appears to work still (on stable/11 amd64 and head amd64). rfork_thread.S only depends on %rsi being preserved in the child, but vfork.S depends on %rsi being preserved in both parent and child. The fast return path of fast_syscall additionally preserves %rdi and %rsi, calling that a bonus in comments. I don't think it is a bonus -- vfork.S depends on it and has no good alternative location to store the parent's return address (thread local storage? yuck). jilles: rfork_thread(RFPROC | RFFDG | RFMEM, ...) appears to work still (on stable/11 amd64 and head… | |||||
kibUnsubmitted Done Inline ActionsFrom my reading, the child dependencies are %rsi, %r12, and %rbx. But they are handler by iret path. kib: From my reading, the child dependencies are %rsi, %r12, and %rbx. But they are handler by iret… | |||||
if (p == -1 && errno == EINVAL) | |||||
p = vfork(); | p = vfork(); | ||||
Not Done Inline ActionsIn a threaded process, this will call libthr's __thr_sigprocmask which leaves SIGCANCEL/SIGTHR unmasked. Perhaps this should be __sigprocmask which is an unexported weak alias for __sys_sigprocmask. jilles: In a threaded process, this will call libthr's `__thr_sigprocmask` which leaves… | |||||
Done Inline ActionsIt might be wise to fall back to vfork() on EINVAL. kib: It might be wise to fall back to vfork() on EINVAL. | |||||
switch (p) { | switch (p) { | ||||
case -1: | case -1: | ||||
return (errno); | return (errno); | ||||
case 0: | case 0: | ||||
if (sa != NULL) { | if (sa != NULL) { | ||||
error = process_spawnattr(*sa); | error = process_spawnattr(*sa); | ||||
if (error) | if (error) | ||||
_exit(127); | _exit(127); | ||||
} | } | ||||
if (fa != NULL) { | if (fa != NULL) { | ||||
error = process_file_actions(*fa); | error = process_file_actions(*fa); | ||||
if (error) | if (error) | ||||
_exit(127); | _exit(127); | ||||
} | } | ||||
if (use_env_path) | if (use_env_path) | ||||
_execvpe(path, argv, envp != NULL ? envp : environ); | _execvpe(path, argv, envp != NULL ? envp : environ); | ||||
else | else | ||||
_execve(path, argv, envp != NULL ? envp : environ); | _execve(path, argv, envp != NULL ? envp : environ); | ||||
error = errno; | error = errno; | ||||
_exit(127); | _exit(127); | ||||
default: | default: | ||||
if (error != 0) | if (error != 0) | ||||
Done Inline ActionsDo you mean "must not return"? jilles: Do you mean "must not return"? | |||||
Done Inline ActionsHmmm... I caught that once, but it looks like I had only fixed it locally in the diff that I uploaded (61840)... fixed for next update. kevans: Hmmm... I caught that once, but it looks like I had only fixed it locally in the diff that I… | |||||
_waitpid(p, NULL, WNOHANG); | _waitpid(p, NULL, WNOHANG); | ||||
else if (pid != NULL) | else if (pid != NULL) | ||||
*pid = p; | *pid = p; | ||||
return (error); | return (error); | ||||
} | } | ||||
} | } | ||||
int | int | ||||
▲ Show 20 Lines • Show All 247 Lines • Show Last 20 Lines |
Perhaps explain that this happens on old kernels.