Page MenuHomeFreeBSD

Set of fixes for REAP_KILL_SUBTREE
ClosedPublic

Authored by kib on May 24 2022, 9:14 PM.
Tags
None
Referenced Files
F103634376: D35310.id106405.diff
Wed, Nov 27, 10:43 AM
Unknown Object (File)
Sun, Nov 24, 12:52 PM
Unknown Object (File)
Mon, Nov 18, 8:30 PM
Unknown Object (File)
Mon, Nov 18, 8:21 PM
Unknown Object (File)
Mon, Nov 18, 6:47 PM
Unknown Object (File)
Mon, Nov 4, 4:32 PM
Unknown Object (File)
Mon, Nov 4, 12:30 PM
Unknown Object (File)
Oct 27 2024, 6:50 PM
Subscribers

Details

Summary

which were found by more Peter Holm' tests and syzcaller. Around half of it is really fixes for thread_single(MODE_ALLPROC) revealed after use in PROC_REAP_KILL.

Per-commit view is available at https://kib.kiev.ua/git/gitweb.cgi?p=deviant3.git;a=shortlog;h=refs/heads/reap

Below is the git log

commit c4866e78d24571fc056e7244865ff245eea0ecc0
Author: Konstantin Belousov <kib@FreeBSD.org>
Date:   Tue May 17 02:47:20 2022 +0300

    reap_kill_proc(): avoid singlethreading any other process if we are exiting
    
    This is racy because curproc process lock is not used, but allows the
    process to exit faster.  It is userspace issue to create such race
    anyway, and not fullfilling the guarantee that all reaper descendants
    are signalled should be fine.
    
    In collaboration with:  pho
    Sponsored by:   The FreeBSD Foundation
    MFC after:      2 weeks

commit f28453bc15d24777ad98b3971cb504596b4a9958
Author: Konstantin Belousov <kib@FreeBSD.org>
Date:   Mon May 16 01:52:09 2022 +0300

    reap_kill_subtree(): hold the reaper when entering it into the queue to handle later
    
    We drop proctree_lock, which allows the process to exit while memoized
    in the list to proceed.
    
    Reported by:    pho
    Sponsored by:   The FreeBSD Foundation
    MFC after:      2 weeks

commit d4b4dad7a0d54747eecc0754aa1c8482f2b49471
Author: Konstantin Belousov <kib@FreeBSD.org>
Date:   Tue May 10 00:41:23 2022 +0300

    reap_kill_subtree_once(): handle proctree_lock unlock in reap_kill_proc()
    
    Recorded reaper might loose its reaper status, so we should not assert
    it, but check and avoid signalling if this happens.
    
    Reported by:    pho
    Sponsored by:   The FreeBSD Foundation
    MFC after:      2 week

commit 21beeb0b991cd0d4df74911d3d5e9795fd21446f
Author: Konstantin Belousov <kib@FreeBSD.org>
Date:   Tue May 3 22:49:37 2022 +0300

    reap_kill_proc: do not retry on thread_single() failure
    
    The failure means that the process does single-threading itself, which
    makes our action not needed.
    
    Reported by:    pho
    Sponsored by:   The FreeBSD Foundation
    MFC after:      2 weeks

commit cb736e0ab0226ec30667f1d722385c7210a46267
Author: Konstantin Belousov <kib@FreeBSD.org>
Date:   Sun May 1 02:29:25 2022 +0300

    Make stop_all_proc_block interruptible to avoid deadlock with parallel suspension
    
    If we try to single-thread a process which thread entered
    procctl(REAP_KILL_SUBTREE), and sleeping waiting for us unlocking
    stop_all_proc_blocker, we must be able to finish single-threading.  This
    requires the sleep to be interruptible.
    
    Reported by:    pho
    Sponsored by:   The FreeBSD Foundation
    MFC after:      2 weeks

commit be3ba48d127222080da8e8e2538936ce61e7e695
Author: Konstantin Belousov <kib@FreeBSD.org>
Date:   Thu May 12 14:47:40 2022 +0300

    thread_single(): remove already checked conditional expression
    
    Tested by:      pho
    Sponsored by:   The FreeBSD Foundation
    MFC after:      2 weeks

commit 33c8f199c340c30e194d20a28885bb446fc91917
Author: Konstantin Belousov <kib@FreeBSD.org>
Date:   Mon May 16 00:55:32 2022 +0300

    Do not single-thread itself when the process single-threaded some another process
    
    Since both self single-threading and remote single-threading rely on
    suspending the thread doing thread_single(), it cannot be mixed: thread
    doing thread_suspend_switch() might be subject to thread_suspend_one()
    and vice versa.
    
    In collaboration with:  pho
    Sponsored by:   The FreeBSD Foundation
    MFC after:      2 weeks

commit ce9be75f8c31136d93784faa8b81754596e7c966
Author: Konstantin Belousov <kib@FreeBSD.org>
Date:   Wed May 11 02:37:58 2022 +0300

    weed_inhib(): correct the condition to re-suspend a thread
    
    suspended for SINGLE_ALLPROC mode.  There is no need to check for
    boundary state.  It is only required to see that the suspension comes
    from the ALLPROC mode.
    
    In collaboration with:  pho
    Sponsored by:   The FreeBSD Foundation
    MFC after:      2 weeks

commit ff86aefa6bf49b8958707aa209746dbdf00bb4a6
Author: Konstantin Belousov <kib@FreeBSD.org>
Date:   Sun May 1 02:30:13 2022 +0300

    weed_inhib(): do not double-suspend already suspended thread if the loop reiterates
    
    In collaboration with:  pho
    Sponsored by:   The FreeBSD Foundation
    MFC after:      2 weeks

commit f34b993c950e824ff1c7872a4174c60c625a01b8
Author: Konstantin Belousov <kib@FreeBSD.org>
Date:   Tue May 3 22:48:50 2022 +0300

    thread_single: wait for P_STOPPED_SINGLE to pass
    
    to avoid ALLPROC mode to try to race with any other single-threading
    mode.
    
    In collaboration with:  pho
    Sponsored by:   The FreeBSD Foundation
    MFC after:      2 weeks

commit ab92a6f353fdca4741d67ac12baa29e7a2876464
Author: Konstantin Belousov <kib@FreeBSD.org>
Date:   Thu May 5 02:57:26 2022 +0300

    P2_WEXIT: avoid thread_single() for exiting process earlier
    
    before the process itself does thread_single(SINGLE_EXIT).  We cannot
    single-thread such process in ALLPROC (external) mode, and properly
    detect and report the failure to do so due to the process becoming
    zombie is easier to prevent than handle.
    
    In collaboration with:  pho
    Sponsored by:   The FreeBSD Foundation
    MFC after:      2 weeks

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

kib requested review of this revision.May 24 2022, 9:14 PM
sys/kern/kern_exit.c
222

Is PCATCH correct here? Upon receiving a signal we'll spin in the loop until p_singlethr goes to zero.

sys/kern/kern_procctl.c
295

P_STOPPED contains P_STOPPED_SINGLE. So if the target proc is already stopped by, e.g., thread_single(SINGLE_BOUNDARY) in fork(), then here we will not stop the proc, and will deliver a signal. This may unsuspend the target and violate assumptions of the stopping thread.

Hmm, but this problem is not specific to reap_kill. Any kill(2) call could have the same effect...

At least, I suspect that need_stop should be true if P_STOPPED_SINGLE is set in the target.

sys/kern/kern_thread.c
1147–1149

The comment still references TDF_BOUNDARY.

1214

It's not obvious to me that you intend to catch signals here either.

sys/sys/proc.h
694

Looks like it should be (c).

853

?

sys/kern/kern_exit.c
222

We need to be able to suspend there. Would we have something PSUSP instead of PCATCH, it be better indeed.

kib marked 6 inline comments as done.May 26 2022, 7:41 PM
kib added inline comments.
sys/kern/kern_exit.c
222

I surprised myself by being able to generate an another approach. If P2_WEXIT is set, then there is no sense trying to deliver a signal, since it cannot reach userspace. Then issignal() should start ignoring queued signals after P2_WEXIT.

Not sure about other side effects, but lets try?

sys/kern/kern_procctl.c
295

I do not see how would we unsuspend the target by delivering a signal. In fact, if we do, this is bug, as you noted, unrelated to this fragment. Imagine that the signal was delivered while forking thread is already in kernel but before thread_single(). Signal must not reach userspace then.

That said, the main point there is if P_STOPPED_SINGLE is set, we *do not* need to single-thread because the process lock is not dropped and process is in the right state.

sys/kern/kern_thread.c
1214

See above.

kib marked 2 inline comments as done.

Added

issignal(): ignore signals when process is single-threading for exit
sys/kern/kern_procctl.c
295

tdsendsignal() may unsuspend a target thread. I think there is at least one bug there: suppose thread T is suspended in thread_single(SINGLE_ALLPROC) when trying to kill another process with REAP_KILL. Suppose a different thread sends SIGKILL to T->td_proc. Then, tdsendsignal() calls thread_unsuspend(T, T->td_proc). thread_unsuspend() incorrectly decrements T->td_proc->p_suspcount to -1.

Later, when T->td_proc exits, it will wait forever in thread_single(SINGLE_EXIT) since T->td_proc->p_suspcount never reaches 1.

392
sys/kern/kern_sig.c
2942

syzkaller finally (after several days) found a useful reproducer which seems to trigger a bug related to the tdsendsignal() behaviour I mentioned above. Here it is: https://reviews.freebsd.org/P542

kib marked 2 inline comments as done.Jun 8 2022, 12:54 AM
kib marked an inline comment as done.Jun 8 2022, 1:32 AM
kib added inline comments.
sys/kern/kern_procctl.c
295

I agree. And I think that thread_unsuspend_one() can be skipped for T then.

kib marked an inline comment as done.

thread_unsuspend(): do not unuspend the suspended leader thread doing SINGLE_ALLPROC

syzkaller finally (after several days) found a useful reproducer which seems to trigger a bug related to the tdsendsignal() behaviour I mentioned above. Here it is: https://reviews.freebsd.org/P542

The problem there appears to be related to inconsistent manipulation of p_boundary_count. thread_suspend_switch() will generally increment p_boundary_count and set TDF_BOUNDRY, even if the stop reason is not SINGLE_BOUNDARY. So if REAP_KILL stops the target proc with SINGLE_ALLPROC, the target proc's p_boundary_count can end up being non-zero, but then thread_single_end() only decrements p_boundary_count if mode == SINGLE_BOUNDARY. This leaves the target process in an inconsistent state.

This change fixes it for me:

39 @@ -1677,7 +1679,7 @@ thread_single_end(struct proc *p, int mode)  
40                         thread_lock(td);                       
41                         if (TD_IS_SUSPENDED(td)) {              
42                                 wakeup_swapper |= thread_unsuspend_one(td, p,
43 -                                   mode == SINGLE_BOUNDARY);
44 +                                   true);
45                         } else
46                                 thread_unlock(td);
47                 }

syzkaller finally (after several days) found a useful reproducer which seems to trigger a bug related to the tdsendsignal() behaviour I mentioned above. Here it is: https://reviews.freebsd.org/P542

The problem there appears to be related to inconsistent manipulation of p_boundary_count. thread_suspend_switch() will generally increment p_boundary_count and set TDF_BOUNDRY, even if the stop reason is not SINGLE_BOUNDARY. So if REAP_KILL stops the target proc with SINGLE_ALLPROC, the target proc's p_boundary_count can end up being non-zero, but then thread_single_end() only decrements p_boundary_count if mode == SINGLE_BOUNDARY. This leaves the target process in an inconsistent state.

This change fixes it for me:

39 @@ -1677,7 +1679,7 @@ thread_single_end(struct proc *p, int mode)  
40                         thread_lock(td);                       
41                         if (TD_IS_SUSPENDED(td)) {              
42                                 wakeup_swapper |= thread_unsuspend_one(td, p,
43 -                                   mode == SINGLE_BOUNDARY);
44 +                                   true);
45                         } else
46                                 thread_unlock(td);
47                 }

You are right, of course. There is even an assert that single_end() ends with the right boundary count.

But I do not see much sense in trying to maintain correct boundary count for ALLPROC, so I force it to zero in single_end() for now.

In D35310#803739, @kib wrote:

syzkaller finally (after several days) found a useful reproducer which seems to trigger a bug related to the tdsendsignal() behaviour I mentioned above. Here it is: https://reviews.freebsd.org/P542

The problem there appears to be related to inconsistent manipulation of p_boundary_count. thread_suspend_switch() will generally increment p_boundary_count and set TDF_BOUNDRY, even if the stop reason is not SINGLE_BOUNDARY. So if REAP_KILL stops the target proc with SINGLE_ALLPROC, the target proc's p_boundary_count can end up being non-zero, but then thread_single_end() only decrements p_boundary_count if mode == SINGLE_BOUNDARY. This leaves the target process in an inconsistent state.

This change fixes it for me:

39 @@ -1677,7 +1679,7 @@ thread_single_end(struct proc *p, int mode)  
40                         thread_lock(td);                       
41                         if (TD_IS_SUSPENDED(td)) {              
42                                 wakeup_swapper |= thread_unsuspend_one(td, p,
43 -                                   mode == SINGLE_BOUNDARY);
44 +                                   true);
45                         } else
46                                 thread_unlock(td);
47                 }

You are right, of course. There is even an assert that single_end() ends with the right boundary count.

But I do not see much sense in trying to maintain correct boundary count for ALLPROC, so I force it to zero in single_end() for now.

So I changed my view, and just added your patch to the queue with the author set to markj. Thank you.

Consistently maintain p_boundary_count for ALLPROC mode.
Fix thread_single() in rfork, which also needs to wait for ALLPROC singlethreading counter to pass.

This revision is now accepted and ready to land.Jun 9 2022, 12:40 PM