Page MenuHomeFreeBSD

Add pdrfork(2) and pdwait(2)
AcceptedPublic

Authored by kib on Thu, Jan 8, 2:17 AM.
Tags
None
Referenced Files
F142649664: D54592.id170123.diff
Wed, Jan 21, 8:55 PM
F142635391: D54592.id170193.diff
Wed, Jan 21, 5:24 PM
F142616713: D54592.id170089.diff
Wed, Jan 21, 1:13 PM
F142577535: D54592.id170167.diff
Wed, Jan 21, 5:15 AM
F142576937: D54592.id170128.diff
Wed, Jan 21, 5:06 AM
F142568824: D54592.id170167.diff
Wed, Jan 21, 2:58 AM
F142565350: D54592.diff
Wed, Jan 21, 1:56 AM
F142551182: D54592.diff
Tue, Jan 20, 9:54 PM

Details

Reviewers
asomers
brooks
emaste
markj
Group Reviewers
capsicum
Summary
kern/kern_exit.c: some style


kern/kern_fork/exit.c: organize includes

Remove sys/cdefs.h.
Remove sys/param.h.
Order the sys/*.h includes alphabetically.


kern/kern_fork.c: define the exterror category for fork

Convert EINVALs in kern_fork.c into EXTERRORs.


sys: add AUE_PDRFORK


contrib/openbsm: add AUE_PDRFORK


sys: add pdrfork(2)


lib/libsys, lib/libc: export pdrfork(2)


kern/kern_exit.c: define the exterror category for exit/wait

Convert EINVALs in kern_exit.c into EXTERRORs.


kern/kern_exit.c: extract some helpers from proc_to_reap()


kern/kern_exit.c: extract wait6_check_alive() helper


Add pdwait(2)


lib/libsys, lib/libc: export pdwait


Regen for the fork and exit/wait exterror category addition


Regen syscall tables after pdfork(2) and pdwait(2) additions

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Add cap_pdwait_rights.
Make WNOWAIT functional for pdwait(), or rather make defaults to reap the process on pdwait().

Suggested by: asomers

I had to build an aarch64 VM just to test the RFSPAWN change, but it works now. It probably works on amd64 too, but I can't test it without writing assembly code.

As for the pdwait() reaping behavior, I don't have a strong opinion either way. I suggested that pdwait(WEXITED) should reap the process just like wait4() for consistency's sake.

As for the RFPROCDESC behavior, I think that your proposal of using pdrfork(RFPROCESC) to modify the current process is very unlikely to be useful. I would prefer that to return EINVAL. OTOH, anybody who misuses pdrfork(RFPROCDESC) will currently get EBADF, which isn't so wrong. So you're way should be ok.

And I've found another problem. Currently, combining pdrfork() with waitpid(WEXITED) will reap the process, and make the pid available for reuse. I thought otherwise last Thursday only because my testing was incomplete. This limitation makes the Rust feature impossible to implement bug-free, at least until pdwait() is available. Would it be possible to reserve the pid even after the process has been reaped, until the process descriptor is closed? That needn't be part of this review, of course, but I mention it in case that affects the requirements of pdwait().

sys/kern/kern_fork.c
222

How does one even get a process descriptor for the current process? The only way I know of is to use pdfork(), then have the parent send the process descriptor to the child via sendmsg. All that just so the child can modify its own process, using pdrfork instead of rfork? That doesn't seem useful to me.

sys/sys/procdesc.h
132
kib marked 2 inline comments as done.Thu, Jan 15, 5:30 AM

I had to build an aarch64 VM just to test the RFSPAWN change, but it works now. It probably works on amd64 too, but I can't test it without writing assembly code.

I planned to work on posix_spawn() after this series is landed. Thank you for the testing.

As for the pdwait() reaping behavior, I don't have a strong opinion either way. I suggested that pdwait(WEXITED) should reap the process just like wait4() for consistency's sake.

I already changed the behavior as you suggested.

As for the RFPROCDESC behavior, I think that your proposal of using pdrfork(RFPROCESC) to modify the current process is very unlikely to be useful. I would prefer that to return EINVAL. OTOH, anybody who misuses pdrfork(RFPROCDESC) will currently get EBADF, which isn't so wrong. So you're way should be ok.

I agree, I dropped this chunk.

And I've found another problem. Currently, combining pdrfork() with waitpid(WEXITED) will reap the process, and make the pid available for reuse. I thought otherwise last Thursday only because my testing was incomplete. This limitation makes the Rust feature impossible to implement bug-free, at least until pdwait() is available. Would it be possible to reserve the pid even after the process has been reaped, until the process descriptor is closed? That needn't be part of this review, of course, but I mention it in case that affects the requirements of pdwait().

So right now pdwait() reaps and this frees the pid. I wrote (untested) change to only free pid on free of procdesc (i.e. the last close, after reap).

sys/kern/kern_fork.c
222

Ok, I dropped this stuff.

kib marked an inline comment as done.

Fix pdwait() userspace prototype.
Drop not useful operations on curproc procdesc.
Postpone freeing the zombie' pid until procdesc is freed.

Make free_pid automatic, remote the proc_reap(free_pid) arg

kib edited the summary of this revision. (Show Details)

Drop the patch to sort exterr_cat_filenames.h

I'm afraid that with this latest patch, I get an instapanic on boot in pid 1.

panic: bit 1442 not set in 0

cpuid = 0
time = 1768845658
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00d8008920
vpanic() at vpanic+0x136/frame 0xfffffe00d8008a50
panic() at panic+0x43/frame 0xfffffe00d8008ab0
proc_id_clear() at proc_id_clear+0xa6/frame 0xfffffe00d8008ae0
proc_reap() at proc_reap+0x445/frame 0xfffffe00d8008b20
proc_to_reap() at proc_to_reap+0x363/frame 0xfffffe00d8008b70
kern_wait6() at kern_wait6+0x1df/frame 0xfffffe00d8008c10
sys_wait4() at sys_wait4+0x6b/frame 0xfffffe00d8008e00
amd64_syscall() at amd64_syscall+0x169/frame 0xfffffe00d8008f30
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfffffe00d8008f30
--- syscall (7, FreeBSD ELF64, wait4), rip = 0x2d804a, rsp = 0x820548518, rbp = 0x820548580 ---
KDB: enter: panic
sys/bsm/audit_kevents.h
667

This conflicts with AUE_SPECIALFD .

kib marked an inline comment as done.

Renumber AUE_PDRFORK.
Drop contrib/openbsm patch, similar to how other new AUE_ constants are not imported there.

Do not release procdesc->pd_pid if the descriptor was already detached from the struct proc. In this case, the pid is freed by reaping.

This looks great, @kib! If you commit it, I'll follow up with the test suite. A few questions:

  • Are you planning to write the man page yourself, or would you like help with that?
  • Even if you don't MFC the new syscalls, would it be possible to MFC the pid recycling change?
This revision is now accepted and ready to land.Tue, Jan 20, 1:12 AM
kib added reviewers: emaste, markj.

Add documentation.
Fix pdwait() prototype.

This revision now requires review to proceed.Tue, Jan 20, 3:41 AM

This looks great, @kib! If you commit it, I'll follow up with the test suite. A few questions:

  • Are you planning to write the man page yourself, or would you like help with that?

I prototyped the initial version of the docs. Feel free to suggest improvements.

  • Even if you don't MFC the new syscalls, would it be possible to MFC the pid recycling change?

I plan to MFC everything. It depends on how the live testing in HEAD goes, of course.

Add note about the pid reuse.

asomers requested changes to this revision.Tue, Jan 20, 1:21 PM

Thanks for writing the man page too. I've made a few comments, mostly to make the English more grammatical and idiomatic.

lib/libsys/pdfork.2
33

Don't forget to adjust .Dd too

104

More idiomatic English.

112–114

number agreement

117
118
136–137
140
179–180
239

Or perhaps sooner, if MFCd? I'll set a reminder for myself to update the man page before 15.1, if you've MFCd by then.

This revision now requires changes to proceed.Tue, Jan 20, 1:21 PM
lib/libsys/pdfork.2
80
lib/libsys/pdfork.2
102

I think this sentence was OK before, but was a bit long/wordy, tried to tighten it up slightly here. @asomers does this change sound fine?

112–114

Mine would be "for a description of the possible rfflag flags."

lib/libsys/pdfork.2
102

Sure, except you're missing a space in "betweenthe".

kib marked 13 inline comments as done.Tue, Jan 20, 2:33 PM
kib added inline comments.
lib/libsys/pdfork.2
239

Yes, it is only a safest estimation.

kib marked an inline comment as done.

Grammar and wording of the man page update.

asomers requested changes to this revision.Tue, Jan 20, 2:48 PM
asomers added inline comments.
lib/libsys/pdfork.2
182–183

Missing macro dots

This revision now requires changes to proceed.Tue, Jan 20, 2:48 PM
kib marked an inline comment as done.

Correct macro names.

This revision is now accepted and ready to land.Tue, Jan 20, 3:16 PM
sys/kern/kern_exit.c
1322

Why "%jx"? options is an int.

1452

Is it formally correct to dereference p here? AFAICS there is nothing preventing it from being recycled after the proc lock is dropped.

1559

Why do we check p_state == PRS_ZOMBIE twice? I do not see how it can change: if wait6_check_alive() returns false, the proc lock will not have been dropped.

1573

I think this path is missing fdrop()?

kib marked 4 inline comments as done.Wed, Jan 21, 12:31 AM
kib added inline comments.
sys/kern/kern_exit.c
1322

The format is utilized by usermode, which always casts the arguments to uintmax_t: we do not know the real types of the values passed by the kernel, we only know that they are integer.

I noted it in the exterror(9) man page that the 'j' modifier must always be applied.

1452

Formally yes because struct proc are type-stable. Of course there is a possible race with the 'p' reuse, whatever unlikely it is. I changed this to use the local variable pid that already caches the pid, same as in line 1432.

1559

Yes, I think this if(){} block can be removed. This is a remnant of earlier, different, structure of the pdwait() loop.

kib marked 3 inline comments as done.

Avoid racy access to the p_pid member of possibly reclaimed struct proc.
Other review' comments are handled.

This revision now requires review to proceed.Wed, Jan 21, 12:32 AM
markj added inline comments.
lib/libsys/pdfork.2
115

Either the "the" should be removed, or you should write "The .Fn pdrfork system call".

136
sys/compat/freebsd32/freebsd32_misc.c
303

BTW, I suspect freebsd32_rusage_out() should do some zeroing. struct timeval32 will have a hole on non-amd64 platforms, so we are copying out uninitialized memory there.

sys/kern/kern_exit.c
1452

Right, I was referring to the race.

sys/kern/subr_capability.c
93

rights.4 should be updated to mention the new CAP_PDWAIT.

sys/kern/sys_procdesc.c
317

Do we need to wake up the kern_pdwait() loop here too? I guess not, since the pdwait() caller must be holding a reference on the procdesc, so PDF_CLOSED cannot be set?

This revision is now accepted and ready to land.Wed, Jan 21, 1:54 PM
kib marked 6 inline comments as done.Wed, Jan 21, 2:20 PM
kib added inline comments.
lib/libsys/pdfork.2
136

It is function as well, and might become a wrapper some day.
Then we need to edit the man page, this is why I slightly prefer the function term.
But ok.

sys/kern/sys_procdesc.c
317

If PDF_CLOSED set, then indeed no process might have the file installed into the descriptors table.

lib/libsys/pdfork.2
136

Ok, it is just a suggestion for consistency, e.g., above you wrote "The pdrfork() system call".

sys/kern/sys_procdesc.c
317

I would maybe add a KASSERT that PDF_CLOSED is not set in the kern_pdwait() loop, but it is just a suggestion.

kib marked 4 inline comments as done.

Assert that PDF_CLOSED is not set for fd operated by pdwait().
Make pdwait() cancellable.

This revision now requires review to proceed.Wed, Jan 21, 3:34 PM
This revision is now accepted and ready to land.Wed, Jan 21, 4:37 PM