
kern: disallow user scheduling/debugging/signalling of jailed procs
ClosedPublic

Authored by kevans on Thu, Jul 31, 2:16 AM.
Referenced Files
F125570263: D51645.id.diff
Sat, Aug 9, 9:27 AM

Details

Summary

Currently, jails are generally ignored when determining whether the
current process/thread can take action upon another, except to determine
if the target's jail is somewhere in the source's hierarchy. Notably,
uid 1001 in a jail (including prison0) can take action upon a process
run by uid 1001 inside of a subordinate jail by default.

While this could be considered a feature at times, it is a scenario
that really should be deliberately crafted; there is no guarantee that
uid 1001 in the parent jail is at all related to uid 1001 in a
subordinate.

This change introduces three new privileges that grant a process
this kind of insight into other jails:

  • PRIV_DEBUG_DIFFJAIL
  • PRIV_SCHED_DIFFJAIL
  • PRIV_SIGNAL_DIFFJAIL

These can be granted independently or in conjunction with the
accompanying *_DIFFCRED privileges, i.e.:

  • PRIV_DEBUG_DIFFCRED alone will let uid 1001 debug uid 1002, but PRIV_DEBUG_DIFFJAIL is additionally needed to let it debug uid 1002 in a jail.
  • PRIV_DEBUG_DIFFJAIL alone will let uid 1001 debug uid 1001 in a jail, but will not allow it to debug uid 1002 in a jail.

Note that security.bsd.see_jail_proc can be used for similar effects,
but does not prevent a user from learning the pid of a jailed process
with matching creds and signalling it or rescheduling it (e.g., cpuset).
Debugging is restricted by visibility in all cases, so that one is less
of a concern.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 65963
Build 62846: arc lint + arc unit

Event Timeline

Note that this has been accidentally chilling in a local branch for about four years, but at a brief glance over more recent changes I didn't see anything that might remedy this.

If privileges would actually work, I have no objections. But right now this change makes a very useful feature (at least for me) available only to root. I often use 'jail -u <me> / something 127.0.0.1 /bin/sh' and have the jailed processes bound only to localhost; otherwise they are normal (can be debugged, etc.).

It would be a pity to lose the ability. Can we have at least a knob to re-enable the current behavior?

In D51645#1179733, @kib wrote:

If privileges would actually work, I have no objections. But right now this change makes a very useful feature (at least for me) available only to root. I often use 'jail -u <me> / something 127.0.0.1 /bin/sh' and have the jailed processes bound only to localhost; otherwise they are normal (can be debugged, etc.).

It would be a pity to lose the ability. Can we have at least a knob to re-enable the current behavior?

I'd be happy to add a knob- I'd like to get some input from Jails folks on whether to default enable or disable, but my feeling here is that most jail deployments these days treat non-system users as distinct between jails and system.

Note that security.bsd.see_jail_proc can be used for similar effects,
but does not prevent a user from learning the pid of a jailed process
with matching creds and signalling it or rescheduling it (e.g., cpuset).

Well, that hasn't been true since commit 5817169bc4a0 ("Fix 'security.bsd.see_jail_proc' by using cr_bsd_visible()"; 2023/09/28), whose purpose was precisely to fix the flaw that, even if you couldn't see sub-jail processes because of security.bsd.see_jail_proc, you could attempt to blindly signal them or change their scheduling parameters (by guessing the PID or obtaining it through other means). (In passing, I'll note that security.bsd.see_jail_proc is a misnomer, it should have been something like security.bsd.see_sub_jail_proc.)

The value of the change here is that it forbids acting on processes with the same user ID in strict sub-jails, which by default are expected to be independent, when seeing them has not been disabled with security.bsd.see_jail_proc. I support enabling that new restriction by default, not so much based on the most probable scenarios but rather on security grounds, as the ability to interact with a user in a sub-jail could be quite surprising to people who are not aware of our jails' details. Then, a knob is needed to support kib@'s case, and potentially other setups people have come up with that leverage the user ID conflation.

sys/kern/kern_prot.c
2105–2110

I would consider factoring out this code into a separate function, e.g., called something like can_tamper_with_sub_jail(). prison_check() seems a priori less appealing, as it is more about "pure" jail hierarchy checks, but that might be acceptable.

(Not repeating this for other occurrences of this check.)

Note that security.bsd.see_jail_proc can be used for similar effects,
but does not prevent a user from learning the pid of a jailed process
with matching creds and signalling it or rescheduling it (e.g., cpuset).

Well, that hasn't been true since commit 5817169bc4a0 ("Fix 'security.bsd.see_jail_proc' by using cr_bsd_visible()"; 2023/09/28), whose purpose was precisely to fix the flaw that, even if you couldn't see sub-jail processes because of security.bsd.see_jail_proc, you could attempt to blindly signal them or change their scheduling parameters (by guessing the PID or obtaining it through other means). (In passing, I'll note that security.bsd.see_jail_proc is a misnomer, it should have been something like security.bsd.see_sub_jail_proc.)

I had kind of suspected this was the case, but hadn't yet found time to verify -- thanks, will revise.

sys/kern/kern_prot.c
2105–2110

Yeah, with the introduction of a knob I was planning to put it into a function (whose name I couldn't really decide on); I think, based on some discussion out-of-band with others, it probably makes sense as a per-jail allow knob rather than a whole-system policy (and based on your proposed naming + thinking about it a bit, I think one allow flag to cover all three cases is fine vs. trying to do three different knobs). I think the three different priv(9) privileges are still fine, so that one can write fine-grained MAC policies in case one's needs are more complex than the default policies allow.

jamie added a subscriber: jamie.

I like this. Given that a non-root user isn't allowed to mess with a jail, it makes sense that it wouldn't be allowed to mess with processes in that jail, even if they happen to have the same uid. That sounds like preferable default behavior, even if it's a switch from current practice. The closer we get to namespaces like uid being conceptually a (jail, id) tuple, the better off we are.

In D51645#1179733, @kib wrote:

If privileges would actually work, I have no objections. But right now this change makes a very useful feature (at least for me) available only to root. I often use 'jail -u <me> / something 127.0.0.1 /bin/sh' and have the jailed processes bound only to localhost; otherwise they are normal (can be debugged, etc.).

It would be a pity to lose the ability. Can we have at least a knob to re-enable the current behavior?

An idea we've been kicking around is non-root jails, tied to the cred of the user that created them. That's definitely for later, given that not a line of code has been written and security will have to be fully thought through. But it's for just the kind of thing you're doing here.

In the meantime (and long-term), a knob makes sense - preferably in the form of a jail allow.* parameter rather than a global sysctl.

jamie requested changes to this revision. Thu, Jul 31, 4:37 PM

In the meantime (and long-term), a knob makes sense.

So yeah, I was a little hasty to accept just yet.

This revision now requires changes to proceed. Thu, Jul 31, 4:37 PM

Add some knobs:

  • per-jail sysctl security.bsd.unprivileged_subjail_tampering
  • jail param allow.{,no}unprivileged_subjail_tampering

The new knob defaults to OFF to avoid surprises for common users, but may be
flipped back on by developers or whoever else wants it. Note that the
allow flag would not inherit from the parent's setting today when a new jail is
created, which seems like a reasonable idea.

sys/kern/kern_priv.c
271 ↗(On Diff #159516)

I just realized that this is a check on the tampering process (and thus the tampering jail). I was envisioning a tag on a prison saying "this prison's non-root processes are open to tampering by non-root unjailed processes." I understand that's reversed from the meaning of the other allow.* bits, but it still fits under the "allow" umbrella, albeit in a passive way.

I can picture, as in kib's example, creating a jail that says "I plan to tamper with this one" more readily than creating a jail that says "this one should be allowed to tamper with its subjails."

The problem I suppose is this just isn't the way we check inter-process interactions. The typical route is to make sure the actor is strong enough, while it makes more sense to me in this case to see if the victim is weak enough.

sys/kern/kern_priv.c
271 ↗(On Diff #159516)

Hmm, I don't think I have any objection. I hadn't considered that kind of model, admittedly- I saw kib's example and decided that it'd be good to have something to slap in sysctl.conf and forget about it (but I guess you could do the same with jail.conf). I'll hold a bit on implementing it in case anyone else has a stronger opinion.

Flip the sense of things and use an allow.unprivileged_parent_tampering knob

This still defaults to no longer allowing it, but it's framed from the
perspective of what the jail will allow rather than the parent. This is
probably more useful so that some jails could be configured to allow it while
others not, rather than accepting that the parent jail is configured to allow
its users to tamper with subjails. This simplifies the implementation in some
ways.

Document the new knob in jail(8), too

sys/sys/jail.h
272

With the redefinition of the flag, I would expect PR_ALLOW_PRISON0 not to have PR_ALLOW_UNPRIV_PARENT_TAMPER.

sys/sys/jail.h
272

I was somewhat torn here, because there's no parent of prison0 to tamper with it; its presence (or lack thereof) is kind of just cosmetic. Thinking about it a bit more, though, that's probably a good argument to just drop it to reflect the reality (that there exists no possibility that a parent's users can tamper with its processes).

Remove PR_ALLOW_UNPRIV_PARENT_TAMPER from prison0; there is no parent jail to
do said tampering, but this accurately reflects the situation that a parent
cannot tamper with its user processes.

True, there's no real need for the PR_ALLOW_PRISON0 change, but since you went to the effort to make that macro anyway, it's a good place to showcase it.

This revision is now accepted and ready to land. Thu, Aug 7, 5:19 AM

True, there's no real need for the PR_ALLOW_PRISON0 change, but since you went to the effort to make that macro anyway, it's a good place to showcase it.

Yeah, fair enough- though I also don't have any objection to dropping that change entirely. OTOH, I think there's still some benefit in keeping the distinction between static flags and prison0 defaults clear in case we do add some allow flag that we wouldn't want on prison0 by default.

I'm planning to commit this within the next day or so, with the following note added to the commit message:

For development setups that involve regularly debugging jailed processes from outside the jail, consider adding `allow.unprivileged_parent_tampering;` to /etc/jail.conf.

My main concern here is heading off complaints from folks that just create ad-hoc jails with jail(8) and don't really use /etc/jail.conf otherwise. They probably already realize they can configure global parameters like this, but maybe not.