Details

Reviewers

des
markj

Commits

rG0e67c3f675ea: reap_kill_subtree_once(): reap_kill_proc_work() might drop proctree_lock
rGbe140717a0a4: procctl(PROC_REAP_KILL): use pgrp pg_killsx sx to sync with fork

Summary

PROC_REAP_KILL must guarantee that all reaper descendants are signalled.
In particular, it must ensure that forked but not yet fully linked
descendants cannot escape killing. Currently, proc_reap() fullfils the
guarantee by single-threading stopping the target process, which moves
the target to the userspace boundary, so the target cannot fork while
the signal is sent.

Single-threading has undesirable effect of sometimes terminating sleeps
with EINTR.

Since the time that the bug with PROC_REAP_KILL was fixed, we grow
the pg_killsx mechanism that is similarly used by the process group
signalling to ensure that no member of the process group escapes.
Reuse pg_killsx for PROC_REAP_KILL as well.

Besides the functional change of no longer causing spurious EINTR, not
single-threading the target means that we no longer need to delegate the
work to the taskqueue.

PR:     290844


reap_kill_subtree_once(): reap_kill_proc_work() might drop proctree_lock

Due to this, restart the iteration over the p_reapsiblings if the lock
was dropped.

Diff Detail

Repository

rG FreeBSD src repository

Lint

Lint Not Applicable

Unit

Tests Not Applicable

Event Timeline

kib created this revision.Sun, Feb 15, 11:16 AM

Herald added subscribers: olce, imp. · View Herald TranscriptSun, Feb 15, 11:16 AM

kib requested review of this revision.Sun, Feb 15, 11:16 AM

kib added a subscriber: pho.Sun, Feb 15, 12:28 PM

Simplify more: no need to single-thread itself, pg_killsx should be enough there as well.

I have tested both versions of the patch and am no longer able to provoke a spurious EINTR.

Without the patch, I could provoke one almost instantly by just holding down Ctrl-T while running a program that does nothing but wait for input on stdin:

% time timeout 60 ./r
load: 72.65  cmd: r 90089 [ttyin] 0.26r 0.00u 0.00s 0% 920k
load: 72.65  cmd: r 90089 [ttyin] 0.75r 0.00u 0.00s 0% 920k
load: 72.65  cmd: r 90089 [ttyin] 0.77r 0.00u 0.00s 0% 920k
load: 72.65  cmd: timeout 90088 [runnable] 0.77r 0.00u 0.00s 0% 2412k
load: 72.65  cmd: r 90089 [ttyin] 0.79r 0.00u 0.00s 0% 920k
load: 72.65  cmd: timeout 90088 [runnable] 0.79r 0.00u 0.00s 0% 2412k
r: stdin: Interrupted system call
timeout 60 ./r  0.00s user 0.00s system 0% cpu 0.794 total

This revision is now accepted and ready to land.Mon, Feb 16, 1:32 PM

kib added a reviewer: markj.Mon, Feb 16, 1:44 PM

pg_killsx is before proctree_lock, handle this.
Add a fix for a pre-existing problem of unsafe iteration over p_reapsiblings list.

This revision now requires review to proceed.Mon, Feb 16, 3:24 PM

Restore useful comment.

markj added inline comments.Tue, Feb 17, 2:38 PM

sys/kern/kern_procctl.c
262–263	Why is it called `error1` instead of `error`?
271	The procctl manual says, "The rk_fpid field is set to the pid of the first process for which signal delivery failed, e.g., due to permission problems", but here we only set it upon ESRCH. Shouldn't it be set for any error from cr_cansignal()?
283	Can we call it `reap_kill_proc()` instead? The `_work` suffix made more sense when this was a deferred task.
287	If I understand correctly, we check for P2_WEXIT because it's not possible to single-thread an exiting process. But now that we're not single-threading the target anymore, does it matter if P2_WEXIT is set?

kib marked 4 inline comments as done.Tue, Feb 17, 2:51 PM

kib added inline comments.

sys/kern/kern_procctl.c
271	I believe it is intentional, For instance, the visibility of the process from other uid, or another jail, would be compromised if we returned an error from cr_cansignal (which is different from ESRCH in these cases).
287	It gives a nice optimization of not waiting for the exiting process to single-thread, and avoids taking the process group killpg lock. Also it potentially reduces the scope for the proctree_lock. I left it there because of that.

s/error1/error/
Drop _work() suffix.

markj added inline comments.Tue, Feb 17, 2:54 PM

sys/kern/kern_procctl.c
287	The assertions in reap_kill_proc() and reap_kill_proc_locked() make it look like this is required for correctness, but if it's just an optimization then IMO the assertions should be removed and comment should explain why we check for P2_WEXIT.

Add comments about P_WEXIT2 and ESRCH.

markj added inline comments.Tue, Feb 17, 3:25 PM

sys/kern/kern_procctl.c
271	I misread before, here we are testing `w->error == ESRCH`, not `error == ESRCH`. `w->error` is initialized to ESRCH in reap_kill(), so here we are indeed returning the PID of the first process that is not visible. However, as soon as some process is successfully signaled, we set `*w->error = 0`, so we report `rk_fpid` only if no process was successfully signaled so far.
290

kib marked 2 inline comments as done.Tue, Feb 17, 3:58 PM

kib added inline comments.

sys/kern/kern_procctl.c
271	I think that the reason I mentioned in the (already dropped) comment is worth considering.

Grammar.
Drop ESRCH comment.

markj accepted this revision.Tue, Feb 17, 4:17 PM

markj added inline comments.

sys/kern/kern_procctl.c
271	So, cr_bsd_visible() returns ESRCH if the target process is not visible. That way, plain `kill(<random PID>, 0)` cannot be used to distinguish between unused and "invisible" PIDs. Here, is it sufficient to ignore ESRCH from cr_cansignal()? That is, only update `rk_fpid` if 1) no error was reported already, i.e., `*w->error == ESRCH`, and 2) `error != ESRCH`.

This revision is now accepted and ready to land.Tue, Feb 17, 4:17 PM

kib marked an inline comment as done.Wed, Feb 18, 8:27 AM

kib added inline comments.

sys/kern/kern_procctl.c
271	D55341

Closed by commit rGbe140717a0a4: procctl(PROC_REAP_KILL): use pgrp pg_killsx sx to sync with fork (authored by kib). · Explain WhySat, Feb 21, 10:08 AM

This revision was automatically updated to reflect the committed changes.

kib marked an inline comment as done.

kib added a commit: rGbe140717a0a4: procctl(PROC_REAP_KILL): use pgrp pg_killsx sx to sync with fork.

kib added a commit: rG0e67c3f675ea: reap_kill_subtree_once(): reap_kill_proc_work() might drop proctree_lock.

procctl(PROC_REAP_KILL): use pgrp pg_killsx sx to sync with fork
ClosedPublic
Actions

Details

Diff Detail

Event Timeline

Revision Contents
Changeset List

Diff 172372

sys/kern/kern_procctl.c

procctl(PROC_REAP_KILL): use pgrp pg_killsx sx to sync with forkClosedPublicActions

Details

Diff Detail

Event Timeline

Revision ContentsChangeset List

Diff 172372

sys/kern/kern_procctl.c

procctl(PROC_REAP_KILL): use pgrp pg_killsx sx to sync with fork
ClosedPublic
Actions

Revision Contents
Changeset List