Page MenuHomeFreeBSD

jail: add process iterator
ClosedPublic

Authored by mjg on Mar 10 2022, 8:34 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Apr 20, 9:04 AM
Unknown Object (File)
Sat, Apr 20, 9:03 AM
Unknown Object (File)
Sat, Apr 20, 9:03 AM
Unknown Object (File)
Sat, Apr 20, 9:03 AM
Unknown Object (File)
Jan 30 2024, 8:28 PM
Unknown Object (File)
Jan 30 2024, 8:28 PM
Unknown Object (File)
Jan 30 2024, 8:28 PM
Unknown Object (File)
Jan 30 2024, 8:28 PM
Subscribers

Details

Summary

Linkage is added for jails for the iterator to use, but it is only utilized when there are no children jails at given level, for simplicity.

This largely sorts out excessive process traversals when running poudriere, which does "kill -9 -1" in jails for each package.

commit 08d130e9cc370c06faec62505cc7bfdd8bfca90b (HEAD -> jaillink1-1)
Author: Mateusz Guzik <mjg@FreeBSD.org>
Date:   Thu Mar 10 19:58:12 2022 +0100

    signal: use proc_iterate to save on work
    
    Most notably poudriere performs kill -9 -1 in jails for each port
    being built. This reduces the scan from hundrends of processes to
    literally 1.
    
    Reviewed by:
    Differential Revision:

commit 21fe78c1b0216820142388f1c85eee6d4a9f8e6f
Author: Mateusz Guzik <mjg@FreeBSD.org>
Date:   Thu Mar 10 19:42:37 2022 +0100

    jail: add process linkage
    
    It allows iteration over processes belonging to given jail instead of
    having to walk the entire allproc list.
    
    Note the iteration can miss processes which remains bug-compatible
    with previous code.
    
    Reviewed by:
    Differential Revision:

Sample stats below:

processes iterated over:

  value  ------------- Distribution ------------- count
    512 |                                         0 
    520 |                                         1 
    528 |                                         5 
    536 |                                         14
    544 |@                                        26 
    552 |@                                        39
    560 |@                                        31 
    568 |@                                        39
    576 |@                                        41 
    584 |@                                        39
    592 |@                                        37
    600 |@                                        30       
    608 |                                         14
    616 |                                         12
    624 |                                         8        
    632 |                                         9
    640 |                                         17
    648 |                                         21
    656 |@                                        36
    664 |@                                        59   
    672 |@                                        52
    680 |@@                                       82
    688 |@                                        69
    696 |@@                                       96
    704 |@@                                       103
    712 |@@                                       89
    720 |@@                                       111
    728 |@@                                       93
    736 |@@                                       105
    744 |@@                                       90
    752 |@@                                       90
    760 |@@                                       94
    768 |@                                        50
    776 |@                                        56
    784 |@                                        45
    792 |@                                        33
    800 |@                                        29
    808 |                                         22
    816 |                                         17
    824 |                                         17
    832 |                                         18
    840 |                                         12
    848 |                                         13
    856 |                                         10
    864 |                                         20 
    872 |                                         14
    880 |                                         24 
    888 |                                         12
    896 |                                         5  
    904 |                                         6 
    912 |                                         4 
    920 |                                         2 
    928 |                                         3 
    936 |                                         1 
    944 |                                         2 
    952 |                                         2 
    960 |                                         0 
    968 |                                         3 
    976 |                                         2 
    984 |                                         3 
    992 |                                         2 
   1000 |                                         0 
   1008 |                                         3 
   1016 |                                         1 
>= 1024 |                                         0

processes signalled:

value  ------------- Distribution ------------- count
   -1 |                                         0
    0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1983
    1 |                                         0

with the patch:

value  ------------- Distribution ------------- count    
    0 |                                         0        
    1 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1454     
    2 |                                         0        


value  ------------- Distribution ------------- count    
   -1 |                                         0        
    0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1454     
    1 |                                         0

Note this patch does not fully solve the problem outlined by me in D28566 -- should something want to kill everything in the jail without paying for entering it, there is still no way to do it.

Test Plan

poudriere

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

mjg requested review of this revision.Mar 10 2022, 8:34 PM
mjg edited the summary of this revision. (Show Details)
sys/kern/kern_jail.c
2989

Since allproc_lock isn't locked here, there's a possibility of a race that would let a newly created process slip through.

sys/sys/jail.h
180

(a) is allprison_lock, but this is protected by allproc_lock.

sys/kern/kern_jail.c
2989

they can already slip through due to PRS_NEW.

I wrote this with the assumption that the fork vs jail destruction races I mentioned elsewhere remain unfixed, but now that you mentioned it I found cc7b73065302005ebc4a19503188c8d6d5eb923d . I find that solution questionable and it runs into extra difficulties here.

I think the right way to do it is to implement a per-cpu lock a'la rm lock for forking in the jail. Fork would do:

rms_rlock(prison->fork);
if (sigispending()) { /* check if we got killed */
    rms_runlock(prison->fork);
    return error;
}
rms_runlock(prison->fork);

this would acct as a reliable barrier where no new processes get created while something wlocks the lock and any in-flight process creation will get aborted if the creating process got killed, which in particular would happen with the above loop if it starts taking the prison fork lock.

sys/kern/kern_jail.c
2989

i mean runlock would happen after the child transitions out of PRS_NEW

sys/kern/kern_jail.c
2986

Why bother asserting this?

2988

Does anything prevent a child jail from being created concurrently? If so then the expected state should be asserted.

sys/sys/jail.h
365

This macro seems a bit dubious since it's not clear how it handles processes belonging to child jails. I wouldn't trust arbitrary consumers to use it correctly and it's currently unused outside of kern_jail.c - can it be private to kern_jail.c?

mjg updated this revision to Diff 109828.
mjg edited the summary of this revision. (Show Details)
mjg added inline comments.
sys/kern/kern_jail.c
2989

So I had a look again, I'm confident this is bug-compatible with existing code, except has less work to do when running from within a jail.

Properly fixing all fork races turns out to be kind of cumbersome and I may get to do it some time later.

mjg marked 2 inline comments as done.Aug 25 2022, 1:18 PM
mjg marked an inline comment as done.
This revision is now accepted and ready to land.Aug 31 2022, 7:43 PM
markj added inline comments.
sys/kern/kern_jail.c
3003

"fall through to a system-wide search"

3010

IMO it would be better to check p_state under the proc. I don't think this is avoiding many lock/unlock operations, and some of the removed code checked p_state without this race. The fewer data races we have, the easier it is to reason about and modify the code.

This revision was automatically updated to reflect the committed changes.