Fix several issues with process group orphanage.


An attempt to add assertions that the pgrp->pg_jobc counters do not
underflow, made in r361967 and reverted in r362910, pointed out bugs
in the handling of job control. Peter Holm was able to narrow the
problem down to a very easy reproduction with timeout(1), which uses
reaping.

The following problems were identified in the calculation of pg_jobc,
which directs SIGHUP/SIGCONT delivery for orphaned process groups:
  • Re-calculation of the orphaned status for children of an exiting parent was wrong, but went mostly unnoticed while all children were reparented to init(8). When a child can be reparented to a different process that could affect the child's job control state, this was not properly accounted for in pg_jobc.
  • The lockless check of the exiting process's parent process group is racy, because nothing prevents the parent from changing its group membership.
  • An exited process is left in its process group until it is waited for. This affects other calculations of pg_jobc.
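
To illustrate the first problem, here is a minimal userland sketch, not
kernel code. It assumes the usual rule that a process counts toward its
group's job control count when its parent is in a different process
group of the same session. All toy_* names and the numeric ids are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not kernel code: all names here are hypothetical. */
struct toy_proc {
	int pgid;			/* process group id */
	int sid;			/* session id */
	const struct toy_proc *parent;	/* NULL for the root process */
};

/*
 * A process counts toward its group's job control count when its
 * parent is in a different process group of the same session.
 */
static int
toy_counts(const struct toy_proc *p)
{
	const struct toy_proc *pp = p->parent;

	return (pp != NULL && pp->pgid != p->pgid && pp->sid == p->sid);
}

/* Recount the qualifying members of process group pgid. */
static int
toy_jobc(const struct toy_proc *const *procs, size_t n, int pgid)
{
	size_t i;
	int count = 0;

	assert(procs != NULL);
	for (i = 0; i < n; i++)
		if (procs[i]->pgid == pgid && toy_counts(procs[i]))
			count++;
	return (count);
}

/*
 * Scenario: a child in group 3, session 7, has lost its parent and is
 * handed to new_parent. Returns the resulting count for group 3.
 */
static int
toy_jobc_after_reparent(const struct toy_proc *new_parent)
{
	struct toy_proc child = { 3, 7, new_parent };
	const struct toy_proc *procs[] = { &child };

	return (toy_jobc(procs, 1, 3));
}
```

In this model, handing the child to an init-like process in another
session leaves the count for group 3 at zero, while handing it to a
reaper in group 2 of session 7 makes the child count, so the group is
no longer orphaned.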

Split the handling of job control status into two cases: a process
changing its process group, and a process exiting. Calculate
increments and decrements for pg_jobc by checking the orphanage
exactly, instead of assuming process group membership for children
and parent. Move the call to killjobc() later, under the
proctree_lock. Mark the exiting process in killjobc() with a new flag,
P_TREE_GRPEXITED, and skip it in all pg_jobc calculations after the
flag is set.
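
The exit-side accounting described above can be sketched in userland
as follows. This is an illustrative model under simplifying
assumptions (no locking, children passed in explicitly), not the code
from the change. TOY_GRPEXITED stands in for P_TREE_GRPEXITED, and
every toy_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define	TOY_GRPEXITED	0x1	/* stands in for P_TREE_GRPEXITED */

struct toy_pgrp {
	int session;		/* session id */
	int jobc;		/* memoized count, models pg_jobc */
};

struct toy_proc {
	struct toy_pgrp *pgrp;
	struct toy_proc *parent;	/* NULL for the root process */
	int flags;
};

/* Does p, with its current parent, count toward p->pgrp->jobc? */
static int
toy_counts(const struct toy_proc *p)
{
	const struct toy_proc *pp = p->parent;

	/* A process marked as exited is skipped in all calculations. */
	if ((p->flags & TOY_GRPEXITED) != 0 || pp == NULL)
		return (0);
	return (pp->pgrp != p->pgrp &&
	    pp->pgrp->session == p->pgrp->session);
}

/*
 * Model of exit: p leaves the accounting and its children move to
 * reaper. Every adjustment follows from an explicit toy_counts()
 * check rather than from an assumption about group membership.
 */
static void
toy_exit(struct toy_proc *p, struct toy_proc **children, size_t nchildren,
    struct toy_proc *reaper)
{
	size_t i;

	if (toy_counts(p))
		p->pgrp->jobc--;
	for (i = 0; i < nchildren; i++) {
		struct toy_proc *c = children[i];

		assert(c->parent == p);
		if (toy_counts(c))	/* contribution under old parent */
			c->pgrp->jobc--;
		c->parent = reaper;
		if (toy_counts(c))	/* contribution under new parent */
			c->pgrp->jobc++;
	}
	p->flags |= TOY_GRPEXITED;
}
```

For example, when a group leader whose parent is a reaper in another
group exits, its child in the same group starts to count once it is
handed to that reaper, so the group's counter ends up unchanged instead
of dropping to zero.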

Add a checker that independently recalculates the pg_jobc value and
compares it with the memoized process group state. It is enabled under
INVARIANTS.
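
The idea of such a checker can be sketched in userland as below. This
is only a model: the kernel version would assert under INVARIANTS,
while this sketch returns a boolean. The toy_* names and the `exited`
field (a stand-in for the P_TREE_GRPEXITED mark) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct toy_pgrp {
	int session;		/* session id */
	int jobc;		/* memoized count, models pg_jobc */
};

struct toy_proc {
	struct toy_pgrp *pgrp;
	struct toy_proc *parent;	/* NULL for the root process */
	int exited;			/* nonzero once marked as exited */
};

/* Recompute the count for pg from scratch by walking all processes. */
static int
toy_recalc_jobc(struct toy_proc *const *all, size_t n,
    const struct toy_pgrp *pg)
{
	size_t i;
	int count = 0;

	for (i = 0; i < n; i++) {
		const struct toy_proc *p = all[i];
		const struct toy_proc *pp = p->parent;

		if (p->pgrp != pg || p->exited || pp == NULL)
			continue;
		if (pp->pgrp != pg && pp->pgrp->session == pg->session)
			count++;
	}
	return (count);
}

/* Returns nonzero when the memoized counter matches the recount. */
static int
toy_jobc_consistent(struct toy_proc *const *all, size_t n,
    const struct toy_pgrp *pg)
{
	assert(all != NULL && pg != NULL);
	return (toy_recalc_jobc(all, n, pg) == pg->jobc);
}
```

Running such a check after every state change is expensive, which is
one reason to confine it to debugging builds.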

Reviewed by: jilles
Discussed with: kevans
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 2 weeks
Differential revision: https://reviews.freebsd.org/D26116


Authored by: kib