Page MenuHomeFreeBSD

[RFC] kern: mac: add various jail MAC hooks
ClosedPublic

Authored by kevans on Nov 27 2025, 6:37 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Jan 20, 12:37 PM
Unknown Object (File)
Sun, Jan 18, 6:04 AM
Unknown Object (File)
Fri, Jan 16, 2:05 PM
Unknown Object (File)
Wed, Jan 14, 1:15 PM
Unknown Object (File)
Wed, Jan 14, 11:45 AM
Unknown Object (File)
Wed, Jan 14, 11:42 AM
Unknown Object (File)
Wed, Jan 14, 9:04 AM
Unknown Object (File)
Tue, Jan 6, 6:36 AM

Details

Summary

This adds the following hooks:

  • mpo_prison_check_attach: check for subject capability to attach to a given jail
  • mpo_prison_check_create: check for subject capability to create a jail with the given option set
  • mpo_prison_check_get: check for subject capability to fetch the given parameters for a jail
  • mpo_prison_check_set: check for subject capability to set the given parameters for a jail
  • mpo_prison_check_remove: check for subject capability to remove the jail

get wouldn't typically be a privileged operation, but is included
to give MAC policies a wider range of capabilities at a relatively low
cost. We also add two more for the purpose of label propagation:

  • mpo_prison_created: surface the creation of a jail so that one can do propagation to, e.g., the root vnode or any mounts
  • mpo_prison_attached: attach an existing process to the jail so that one can propagate the jail label to the process, as appropriate.

It is unclear if this is preferred vs. having separate associate entry
points for each type of object we might associate. That would split
these up like so:

  • prison_created -> prison_associate_vnode
  • prison_attached -> prison_associate_proc

Some sample policy ideas that should be feasible to implement with this
set of hooks, in case it's inspiring:

  • mac_bomb: policy that allows a poudriere user to construct jails without root privilege, given a restricted set of jail parameters. Slap a warning label on it.
  • mac_capsule: policy that realizes the capsule idea that I pitched[0] on -jail@ to create jails that are effectively immutable once sealed, using these hooks and a label.

Perhaps a silly idea, but a downstream could consider a scenario where
it can implement special jail enumeration using a MAC policy and a
cooperating application that specifies non-parameter options to filter
the results.

[0] https://lists.freebsd.org/archives/freebsd-jail/2025-September/000550.html

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 69393
Build 66276: arc lint + arc unit

Event Timeline

jamie added inline comments.
sys/kern/kern_jail.c
2615

I don't really see the point in this one. It stops efficient jail listing, but you could always just try a jail_get on all million JIDs in the normal range.

If the MAC policies must be able to hide jails, then this case should be completely indistinguishable from the "the jail doesn't exist" one, which means returning EINVAL and the exact same error message, which varies depending on the code point.

That's 4 new different calls to mac_prison_check_get(). Could you please try to move these out into a single occurrence after the found_prison label, along with the test on !(prison_isalive(pr) || (flags & JAIL_DYING)) (because that one must be after it)? That would make the latter redundant in the jailid case, but that's really nothing compared to the gain in clarity. You would also have to introduce some variable to hold the error message corresponding to the "the jail doesn't exist" case, since it's the one to be returned when mac_prison_check_get() returns some special error value indicating hiding (EINVAL?).

All hooks are using _NOSLEEP() variants, except for mac_prison_created and mac_prison_attached. Starting with something restrictive is fine (if sufficient for your use cases). In some cases, only the allprison_lock sx is held, so the corresponding hooks could sleep (although not necessarily recommended with sx). In any case, it's reasonable to keep mac_prison_check_*() checks as non-sleepable. That mac_prison_created and mac_prison_attached can sleep is fine with the current jail code (the former is called only with allprison_lock, the second without locks).

"Needs changes" set for the missing unlocks, see inline comments.

sys/kern/kern_jail.c
1868

It seems restrictive to check for attach so in advance of the actual attach. Modules might want to decide upon the value of jail parameters, which are not set at this stage. And having the check near the actual attach makes things clearer IMHO, even if that means that this function will have modified some parameters in the meantime. Also, I'm not too sure why we wouldn't do the attach check also in the jail creation case, which moving the check near the attach also "solves". What do you think?

2584

Please see herald comment about mac_prison_check_get().

2615

That's true, since any other "get" methods do not have that mac_prison_check_list() check.

2942–2943

Unlock of allprison_lock and pr missing.

2969–2973

I'd move the crfree() close to the last use of jdcred. Or equivalently the new block one line below.

3018–3032

Maybe factor out the two occurences of error handling code that unlocks the prison and the tree sx, e.g., under a new unlock-tagged block at the function's end?

olce requested changes to this revision.Dec 2 2025, 4:41 PM

For the missing unlocks.

This revision now requires changes to proceed.Dec 2 2025, 4:41 PM
sys/kern/kern_jail.c
1868

This is similar to my checks of attach permission, as early as possible in various paths. From a user perspective, the ideal is for a failed call to do nothing at all. While I can't achieve that, it makes sense to catch as many errors as possible before they have an effect.

kevans added inline comments.
sys/kern/kern_jail.c
1868

Yeah, pretty much that - I didn't want any of the set operation to succeed if MAC was ultimately going to deny the operation to avoid putting the jail into a weird unexpected state if one exercises the hook to block it. However, I think it's a valid point that the attach check is a trivial TOCTOU issue since the policy can't take any options that might be changing into consideration, so it's mostly useful for policies that want to make that decision based on labels.

Missing it for the creation case was an oversight, yeah.

2969–2973

whoops, yeah, I missed that in the shuffle- originally I had another MAC entry point to authorize the use of a jail descriptor, but csjp pointed out that that was quite silly when the operations themselves are already checked.

kevans added inline comments.
sys/kern/kern_jail.c
2584

Hmm, I think the only reason I didn't after removing the jaildesc checks is because the messages frr the dying case are sensitive to whether they were doing a lookup by jid vs. name. One could just use a "$name ($jid)" forrmat to provide both sets of information, though I'm not sure how much we value the status quo.

kevans added inline comments.
sys/kern/kern_jail.c
2584

My local WIP moves it and renders pr_name if pr_name[0] isn't a digit that isn't 0 (since that's just the jid-name).

kevans marked 3 inline comments as done.

Highlights:

  • Removed prison_check_list, because it's of limited value; this could pose a problem if check_get rejects a prison as lastjid enumeration would halt when MAC denied it, but that probably requires a different solution
  • Moved mac_prison_check_get calls to after found_prison, but on the tail of that last note, it might make sense to provide a found_prison_nomac and move one check back into the lastjid loop and just continue if we encounter one that we can't fetch. This would be cleaner than policies having to implement both a check_list and check_get to avoid breaking enumeration
  • Moved the attach check to the point that we're about to attach; there are other ways this can fail from external modules anyways (e.g., osd), so trying to keep it all atomic may not be worth as much as being sure that the policy can examine the current properties of the jail. This also ensures that the creation+attach case is covered
  • Add missing unlocking

Ping

Sorry, was ill the past days and couldn't get back to it. Will do soon.

Coming back to highlights, there's indeed still the problem that enumeration returns an error if the last jail is denied via mac_prison_check_get(), so doing what you suggest is probably mandatory: Put back the MAC check in the loop, and once a jail is returned, jump to a prison_found_nomac label, even a _nomac_noalive variant as to eliminate also the (harmless) redundant test.

Additionally, as mentioned before, there's that, if mac_prison_check_get() denies access, we return a VFS error saying "access denied by MAC" which would be an info leak for situations where the jail must remain hidden. E.g., we could have mac_prison_check_get() return EPERM for a visible deny and ENOENT for an invisible one, and then treat them differently with respect to vfs_opterror(). But it seems that goes farther than what you're envisioning at this point (I haven't checked further revisions in the series thoroughly yet, to see if they need something like this).

sys/kern/kern_jail.c
2272–2281

Redundant check of flags & JAIL_ATTACH.

Coming back to highlights, there's indeed still the problem that enumeration returns an error if the last jail is denied via mac_prison_check_get(), so doing what you suggest is probably mandatory: Put back the MAC check in the loop, and once a jail is returned, jump to a prison_found_nomac label, even a _nomac_noalive variant as to eliminate also the (harmless) redundant test.

Additionally, as mentioned before, there's that, if mac_prison_check_get() denies access, we return a VFS error saying "access denied by MAC" which would be an info leak for situations where the jail must remain hidden. E.g., we could have mac_prison_check_get() return EPERM for a visible deny and ENOENT for an invisible one, and then treat them differently with respect to vfs_opterror(). But it seems that goes farther than what you're envisioning at this point (I haven't checked further revisions in the series thoroughly yet, to see if they need something like this).

I'm not too worried about that, though it does occur to me that it probably makes sense to stop setting an error in the opts for any of those checks. The MAC policy should probably have free reign over both errno and error probably in case it's some kind of custom vendor integration.

olce requested changes to this revision.Dec 16 2025, 2:32 PM

Compile errors because of leftovers of some prison_copy_label() operation.

Attached patch fixes them.

sys/security/mac_stub/mac_stub.c
1915

prison_copy_label() op does not exist.

sys/security/mac_test/mac_test.c
1620–1630

prison_copy_label() op does not exist.

3360

prison_copy_label() op does not exist.

This revision now requires changes to proceed.Dec 16 2025, 2:32 PM

I'm not too worried about that, though it does occur to me that it probably makes sense to stop setting an error in the opts for any of those checks. The MAC policy should probably have free reign over both errno and error probably in case it's some kind of custom vendor integration.

I agree, we should probably stop setting the err opt. But that doesn't solve the discoverability of whether MAC (and not, e.g., real nonexistence of a jail) was the reason some access was denied, as in non-MAC we still set "errmsg" in the VFS options.

Giving this control to MAC modules would be nice; that requires obviously changing the interface of the hooks (if you do it, please don't pass struct vfsoptlist all around, but instead some function pointer to a printf-like function, so that we can clean up the use of struct vfsoptlist in jails at some point).

I'm not too worried about that, though it does occur to me that it probably makes sense to stop setting an error in the opts for any of those checks. The MAC policy should probably have free reign over both errno and error probably in case it's some kind of custom vendor integration.

I agree, we should probably stop setting the err opt. But that doesn't solve the discoverability of whether MAC (and not, e.g., real nonexistence of a jail) was the reason some access was denied, as in non-MAC we still set "errmsg" in the VFS options.

Giving this control to MAC modules would be nice; that requires obviously changing the interface of the hooks (if you do it, please don't pass struct vfsoptlist all around, but instead some function pointer to a printf-like function, so that we can clean up the use of struct vfsoptlist in jails at some point).

I'm not sure I understand this last bit- I'm already passing the struct vfsoptlist all around so that MAC modules can reject jail_[sg]et operations based on the parameters they want to fetch or set. Are you implicitly suggesting that I should proactively give them a callback instead to work with the option list instead? I wasn't planning on limiting what they can do with the option list at all (e.g., if they wanted to override a parameter that'is to be set), so I guess I'd pass around an opaque pointer instead and provide sme KBI to fetch/set jail opts (that just defer to the vfsoptlist equivalents for now).

I'm not sure I understand this last bit

Oh... It's just a pilot error on my part, sorry!

I'm already passing the struct vfsoptlist all around so that MAC modules can reject jail_[sg]et operations based on the parameters they want to fetch or set.

Don't know how I managed to forget that for a while.

I just wanted to limit propagation of struct vfsoptlist elsewhere, but passing it makes sense for some hooks, so nevermind.

I'm not sure I understand this last bit

Oh... It's just a pilot error on my part, sorry!

I'm already passing the struct vfsoptlist all around so that MAC modules can reject jail_[sg]et operations based on the parameters they want to fetch or set.

Don't know how I managed to forget that for a while.

I just wanted to limit propagation of struct vfsoptlist elsewhere, but passing it makes sense for some hooks, so nevermind.

Ok, thanks- to be clear, I'm still not against the idea of providing an abstraction layer to avoid some future headache for MAC modules when the core implementation changes. Did you have some sketch of what you're thinking in the area, by chance?

kevans marked 4 inline comments as done.

Highlights:

  • Remove vfs_opterror() for those entry points that take the opts already
  • Move one case of mac_prison_check_get back as a special-case to avoid breaking jail enumeration.
  • Unbreak the build of this patch: prison_copy_label comes in a later change
  • Drop redundant JAIL_ATTACH check
sys/security/mac_test/mac_test.c
1620–1630

This is actually just misplaced; I've moved it into D53956, which is the first commit that introduces and uses the operation.

Ok, thanks- to be clear, I'm still not against the idea of providing an abstraction layer to avoid some future headache for MAC modules when the core implementation changes. Did you have some sketch of what you're thinking in the area, by chance?

I don't have any code, just the idea to disentangle jail options and vfs options, but still making them share mostly the same code as is the case today. I don't remember the specifics unfortunately, but I seem to recall we are applying some no processing to jail options unduly, even after a .; that processing in the case of VFS mount options also could do with a big cleanup (we pass the option list around, but also have some redundant flags, and it's unclear when one or the other is used; also, some options without no currently don't make sense IIRC, such as atime, so you'd have to pass nonoatime to cancel some previous noatime, etc.; I forgot what we are doing exactly in this area). The disentanglement would lead to changing all references to struct vfsoptlist when dealing with jails to something like struct jailoptlist, that's why I wanted to limit propagation. That said, passing the jail options to the hooks seems a good idea for flexibility (perhaps this is too flexible, and we should prevent modification of passed options; I don't feel I have enough perspective to be definitive on that topic).

This revision is now accepted and ready to land.Dec 20 2025, 6:42 PM