
libc: report _SC_NPROCESSORS_ONLN more accurately in cpu-limited jails
AcceptedPublic

Authored by kevans on Aug 30 2025, 6:54 PM.
Tags
None
Details

Reviewers
kib
Group Reviewers
Jails
manpages
Summary

We don't support CPU hotplug, but we do support cpuset(8) restrictions
on jails (including prison0, which uses cpuset 1). The process cannot
widen its cpuset beyond its root set, so it makes sense to instead
report the number of cpus enabled there rather than the total number
in the system.
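
As a rough illustration of the idea above (not the code in this diff), a process can count the CPUs in its root cpuset with cpuset_getaffinity(2). The helper name root_cpuset_ncpus is hypothetical, and the non-FreeBSD fallback is only there so the sketch builds elsewhere.

```c
#include <unistd.h>
#ifdef __FreeBSD__
#include <sys/param.h>
#include <sys/cpuset.h>
#endif

/* Hypothetical helper: count the CPUs enabled in our root cpuset. */
static long
root_cpuset_ncpus(void)
{
#ifdef __FreeBSD__
	cpuset_t mask;

	/* CPU_LEVEL_ROOT selects the root set; id -1 is the current process. */
	if (cpuset_getaffinity(CPU_LEVEL_ROOT, CPU_WHICH_PID, -1,
	    sizeof(mask), &mask) == 0)
		return ((long)CPU_COUNT(&mask));
#endif
	/* Not FreeBSD, or the query failed: fall back to sysconf(3). */
	return (sysconf(_SC_NPROCESSORS_ONLN));
}
```

The root level matters here: the process can narrow its own affinity at any time, but it cannot widen it past the root set, so the root set is the stable upper bound.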

This change is effectively a nop for the majority of systems and jails
in the wild, though it does reduce the performance of this query now
that we can't take advantage of AT_NCPUS being provided in the auxinfo.
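
For context on the performance note, here is a minimal sketch of the cheap path that the whole-system count can still use: reading AT_NCPUS from the ELF aux vector with elf_aux_info(3), which needs no system call. The helper name configured_ncpus is hypothetical, and the fallback to sysconf(3) is an assumption made so the sketch builds on other systems.

```c
#include <unistd.h>
#ifdef __FreeBSD__
#include <sys/auxv.h>
#endif

/* Hypothetical helper: whole-system CPU count, from auxinfo when possible. */
static long
configured_ncpus(void)
{
#if defined(__FreeBSD__) && defined(AT_NCPUS)
	int value;

	/* The kernel supplies this at exec time; no syscall is made here. */
	if (elf_aux_info(AT_NCPUS, &value, sizeof(value)) == 0)
		return ((long)value);
#endif
	/* No aux vector entry available: ask sysconf(3) instead. */
	return (sysconf(_SC_NPROCESSORS_CONF));
}
```

A count taken from the root cpuset cannot be served this way, because the set can change after exec while the aux vector is fixed.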

Add a _SC_NPROCESSORS_ONLN_GLOBAL_NP for applications that actually do
want whole-system information instead of their own constraints.

The implementation here is notably different from Linux, which does not
take cgroups into account. Linux does, however, take CPU hotplug into
account, so its online count can already diverge from (and be lower than)
the configured count. Reporting what the process can actually be
scheduled on is therefore not a real departure in semantics.
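
From the application side, the practical consequence is that the online count may be lower than the configured count and should be the one used for sizing. A minimal sketch, with the hypothetical helper pick_worker_count:

```c
#include <unistd.h>

/* Hypothetical helper: size a worker pool from the CPUs we may run on. */
static long
pick_worker_count(void)
{
	long onln = sysconf(_SC_NPROCESSORS_ONLN);

	/* If the online count is unavailable, try the configured count. */
	if (onln < 1)
		onln = sysconf(_SC_NPROCESSORS_CONF);
	/* Always return at least one worker. */
	if (onln < 1)
		onln = 1;
	return (onln);
}
```

Code written this way needs no changes to benefit from the new behavior; it simply gets a smaller number inside a cpu-limited jail.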

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 66728
Build 63611: arc lint + arc unit

Event Timeline

include/unistd.h
293 ↗(On Diff #161273)

It might be smart to rename this one to _SC_NPROCESSORS_ONLN_GLOBAL_NP and allocate a new number for the new semantics of _SC_NPROCESSORS_ONLN, but it might be against your intentions.

lib/libc/gen/sysconf.c
600

Hm since we have _CONF, I now tend to think that GLOBAL_NP is not needed at all. Sorry for making you do unneeded work.

include/unistd.h
293 ↗(On Diff #161273)

I don't think I care about forcing the behavior on applications that haven't rebuilt, this was just an oversight. I'll flip them as suggested.

lib/libc/gen/sysconf.c
600

After you proposed it, I was thinking that you'd ultimately end up with the following distinction:

_SC_NPROCESSORS_CONF -> AT_NCPUS / hw.ncpu, # cpus on the system
_SC_NPROCESSORS_ONLN_GLOBAL_NP -> Enabled in prison0's cpuset
_SC_NPROCESSORS_ONLN -> Current jail's cpuset

GLOBAL_NP is notably not implemented that way in this patch because you'd need to expose that through another sysctl, which is more than I'd like to do this close to branching (and it will almost always be the same value for most systems deployed today, except for one that I'm aware of where I had broken their setup in the past with cpuset changes).

CONF as implemented covers those CPUs also available for kernel threads, while the two ONLN would specifically only include those cores accessible to userland (modulo system root adjusting prison0 cpuset back up to include all cores). I think arbitrary processes really just want the new definition of ONLN, but I could see some uses of ONLN_GLOBAL_NP for more vendor-y applications that may integrate across jails if they want to treat it as a soft limit / hard limit kind of thing.

lib/libc/gen/sysconf.c
600

Ok.

Then probably remove _NP from the new name; we do not do it for other BSD-specific confs.

BTW, this probably adds one more syscall to the libthr initialization path. Eventually we would want to tame it back by adding new AT_ auxv.

lib/libc/gen/sysconf.c
600

WRT the last paragraph: no, libthr uses _SC_NPROCESSORS_CONF to initialize _thr_is_smp. Perhaps this should be changed to _SC_NPROCESSORS_ONLN?

lib/libc/gen/sysconf.c
600

Looking at its usage there, I think we want libthr to continue using _SC_NPROCESSORS_CONF, since process affinity can change after initialization. The usage of _thr_is_smp looks like it's just an optimization for UP systems, so I think we want it to lean towards a super conservative estimate of whether we *could* run SMP to avoid running into correctness issues if the situation changes (however unlikely that may be).
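
The conservative check described above could be sketched as follows. This is not libthr's actual code; could_run_smp is a hypothetical name, and treating a failed query as "SMP" is an assumption chosen to stay on the safe side.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper: could this process ever run on more than one CPU? */
static bool
could_run_smp(void)
{
	long ncpus = sysconf(_SC_NPROCESSORS_CONF);

	/*
	 * Use the configured count, not the online one: affinity can
	 * change after initialization. On failure, assume SMP, since the
	 * SMP path is slower on UP but always correct.
	 */
	return (ncpus < 0 || ncpus > 1);
}
```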

Revise based on feedback and further consideration

GLOBAL_NP is removed for the time being since CONF already represents all of the
CPUs in a system. I've been considering a world where one can create a jail and
specify a cpuset.parent that is visible to the parent to inherit for the jail's
root. This would be practically useful for cases like PR 253724, where the
reporter jumps through a hoop with an init_script to get all of prison0 onto a
separate cpuset from child jails -- this would be easier if they could instead
create a jail with cpuset.parent=0 and restrict it down from the system set
after limiting prison0 down to the set they have in their init_script. Doing so
shouldn't raise any security issues, as the parent still has to have access to
the new root's parent.

In a world like that, there's no defined relationship between prison0 and any
other jail, so it would be quite arbitrary.

This revision is now accepted and ready to land. Sep 1 2025, 4:29 PM