Page MenuHomeFreeBSD

proc0_post: Fix some locking issues
ClosedPublic

Authored by bdrewery on Jun 14 2018, 4:50 PM.
Tags
None
Referenced Files
Unknown Object (File)
Feb 9 2024, 6:05 PM
Unknown Object (File)
Feb 2 2024, 11:05 AM
Unknown Object (File)
Jan 21 2024, 4:25 AM
Unknown Object (File)
Jan 17 2024, 10:39 PM
Unknown Object (File)
Dec 22 2023, 11:08 PM
Unknown Object (File)
Nov 14 2023, 7:54 PM
Unknown Object (File)
Nov 5 2023, 4:23 PM
Unknown Object (File)
Oct 28 2023, 9:11 PM
Subscribers

Details

Summary
  • Filter out PRS_NEW procs as rufetch() tries taking the thread lock which may not yet be initialized.
  • Hold PROC_LOCK to ensure stability of iterating the threads.
  • p_rux fields are protected by the process statlock as well.

Sponsored by: Dell EMC

More details

This bug

proc0_post iterates FOREACH_PROC_IN_SYSTEM and then calls rufetch(p) which does FOREACH_THREAD_IN_PROC before taking thread lock.

This page faults of the thread lock is not yet initialized for PRS_NEW procs that are in dofork(). In the case I hit it was calling fdcopy() long before the sched_fork() call to initialize the thread ptrs.

None of this code is holding the PROC_LOCK or PROC_SLOCK.

The typical pattern I've seen for dealing with this is to simply filter out PRS_NEW procs but my proposed patch feels very incomplete as rufetch() still has no care about whether the process is PRS_NEW.

rS275121 changed from using the proc slock around the rufetch() call in proc0_post to the proc statlock. It may be enough to use the slock again in rufetch and filter out PRS_NEW procs there but I haven't analyzed it deeply yet.

more proc0_post issues?

The code is still racy in that microuptime(&p2->p_stats->p_start); is called for new forking processes but proc0_post may come along and trash that with a new value, as well as clearing all of its other stats.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

bdrewery added reviewers: kib, jhb, markj.
sys/kern/init_main.c
626 ↗(On Diff #43771)

I think the process lock scope must be extended to the end of the loop iteration. This would guarantee stability of the iteration on threads.

630 ↗(On Diff #43771)

p_rux fields which are cleared below are annotated as protected by the proc stat lock, so it makes sense to move the unlock right before FOREACH_THREAD... . This is not to protect updating each rux_ field, but to make the whole change atomic WRT other p_rux observers.

636 ↗(On Diff #43771)

Thread lock is useless there, so not taking it is fine.

What is the purpose of this code to begin with? It looks like it should just be removed. If it is needed (what for?), it probably has to run after all initial forking is finished.

The thread list traversal is not protected with any relevant locks.

bdrewery marked 3 inline comments as done.
  • Filter out PRS_NEW procs as rufetch() tries taking the thread lock which may not yet be initialized.
  • Hold PROC_LOCK to ensure stability of iterating the threads.
  • p_rux fields are protected by the process statlock as well.
bdrewery retitled this revision from proc0_post: Filter out new forking procs. to proc0_post: Fix some locking issues.Jun 14 2018, 6:35 PM
bdrewery edited the summary of this revision. (Show Details)
In D15809#334292, @mjg wrote:

What is the purpose of this code to begin with? It looks like it should just be removed. If it is needed (what for?), it probably has to run after all initial forking is finished.

Code makes consistent early processes start time vs rusage.

This revision is now accepted and ready to land.Jun 14 2018, 6:38 PM
This revision was automatically updated to reflect the committed changes.