
jail: add allow.routing jail permission
ClosedPublic

Authored by ivy on Apr 15 2025, 5:57 PM.

Details

Summary

if allow.routing is set, the jail can modify the system routing table even if
it's not a VNET jail.
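
As a rough illustration of the rule described in the summary, the following Python sketch models the decision for a root process that asks to change routes. It is a simplified userland model, not the kernel code from this revision: the `Jail` class, the `route_change_target` function and the returned table labels are hypothetical names chosen for the example. The behavior assumed for VNET jails (changes land in the jail's own routing table) is taken from the discussion in the event timeline.

```python
import errno
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Jail:
    """Hypothetical model of the jail settings that matter here."""
    name: str
    vnet: bool = False           # jail has its own virtual network stack
    allow_routing: bool = False  # models the allow.routing parameter


def route_change_target(jail: Optional[Jail]) -> Tuple[int, Optional[str]]:
    """Model a route change requested by root.

    jail=None stands for an unjailed process on the host.
    Returns (error, table): error is 0 or errno.EPERM, and table names
    the routing table that would be modified ("host" or "vnet").
    """
    if jail is None:
        return 0, "host"          # unjailed root: always allowed
    if jail.vnet:
        return 0, "vnet"          # VNET jail: only its own tables change
    if jail.allow_routing:
        return 0, "host"          # new: non-VNET jail with allow.routing
    return errno.EPERM, None      # default for non-VNET jails: denied


if __name__ == "__main__":
    for j in (None, Jail("plain"), Jail("bird", allow_routing=True),
              Jail("router", vnet=True)):
        print(j.name if j else "host", route_change_target(j))
```

In this model the permission defaults to off, so an existing non-VNET jail keeps being denied unless the administrator opts in.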

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

ivy requested review of this revision. Apr 15 2025, 5:57 PM

Shouldn't such a feature allow setting the read/write permission per FIB instead of a single all or nothing flag?

This looks good to me, but I'm hoping someone from networking will chime in.

In which application scenarios could allowing jailed processes to take command of system's routing tables be considered useful / desirable?

In D49843#1136953, @zec wrote:

In which application scenarios could allowing jailed processes to take command of system's routing tables be considered useful / desirable?

the intended use case (or at least, my intended use case) is running a routing daemon such as BIRD in a service jail; see the related diff D49844 which adds the svcj side of this.

Shouldn't such a feature allow setting the read/write permission per FIB instead of a single all or nothing flag?

is there a use case for allowing a daemon to modify one routing table but not another? since PRIV_NET_ROUTE is not fib-specific, supporting this would require a significant amount of new code in a sensitive codepath. i'm not opposed to the idea, but i wonder if the effort is worthwhile. (unless perhaps there's already an existing mechanism that could be used for this, and i'm overestimating the complexity?)
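
To make the complexity argument concrete, here is a hypothetical Python sketch contrasting the two designs. Neither function is code from this revision or from the kernel; the names `JailPolicy`, `check_flag`, `check_per_fib` and `allowed_fibs` are invented for the example. The only point is that a per-FIB rule needs the FIB number at the place where the privilege is checked, while a single flag does not.

```python
import errno
from dataclasses import dataclass, field
from typing import FrozenSet


@dataclass(frozen=True)
class JailPolicy:
    """Hypothetical per-jail routing policy (not a real kernel structure)."""
    allow_routing: bool = False  # single all-or-nothing flag
    allowed_fibs: FrozenSet[int] = field(default_factory=frozenset)  # per-FIB idea


def check_flag(policy: JailPolicy) -> int:
    """All-or-nothing: the check needs no information about the request."""
    return 0 if policy.allow_routing else errno.EPERM


def check_per_fib(policy: JailPolicy, fib: int) -> int:
    """Per-FIB: every caller must now pass the FIB being modified."""
    return 0 if fib in policy.allowed_fibs else errno.EPERM
```

In this model `check_flag` can be answered from the jail's settings alone, which matches the remark that PRIV_NET_ROUTE is not fib-specific, whereas `check_per_fib` would require each route-modifying path to supply `fib`.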

Will other routing daemons, which manage not only routing tables but also system interfaces and their addresses, work 100% happily in this new ALLOW_ROUTING jail, or is this hack BIRD specific? What if BIRD folks one day decide to start adding direct interface management capabilities there: should we then follow up by punching more holes and start adding ALLOW_IFADDR, ALLOW_IFUPDN, etc.? And if so, where's the boundary, where do we stop?

Let me also ask a different question? What does the jail buy here over a simple chroot (apart from the current system integration of jails)?
In a chroot all of Marko's concerns about managing interfaces etc are basically sorted as well.

i don't consider this a hack; it's simply giving administrators more control over their system, rather than dictating a specific set of permissions they have to take or leave as a whole.

i also don't think it's necessary to define a place to stop. as service jails become more widely used, i'm sure we'll run into more examples of things where it's useful to delegate permissions to jails which currently aren't supported. if you consider that an inherently negative thing, i'd be interested in hearing why; to me, this seems like simply providing more flexibility to users.

that said, i also don't think we need to add every single conceivable permission today. i don't use a routing daemon that manages interface addresses, so i haven't added support for that. if you wanted to do that, i certainly wouldn't object.

In D49843#1136964, @bz wrote:

Let me also ask a different question? What does the jail buy here over a simple chroot (apart from the current system integration of jails)?
In a chroot all of Marko's concerns about managing interfaces etc are basically sorted as well.

an svcj is not actually a chroot; they all run with path=/. adding more filesystem restrictions is something i think could be improved in future, but that's a different topic entirely. for now, chroot and svcjs are entirely orthogonal.

i am approaching this more from the position that, at least in theory, it should be possible to run all services in a service jail unless there's a clear reason why it doesn't make sense to do that. why? because this provides a unified mechanism to restrict system access and resources for any service, on the jail level. compare it to how Linux uses cgroups to run services; on Linux, it's trivial to set resource limits for any service by configuring its cgroup. with jails, we can easily do that for any jail (using rctl) and at the same time, we benefit from the additional security measures of jails (again, similar to how Linux uses namespace to restrict services, but in FreeBSD, security and resource restrictions are both provided by jails).

in future, as jails gain more capabilities, those capabilities will be automatically available to any service via service jails. isn't this, at least in part, why svcj was added in the first place?

Ideally, an application that needs to modify routing tables/addresses but wants to reduce its own privileges down to that and only that should be written with Capsicum in mind and should limit its capabilities. Unfortunately, in the real world we have third-party applications written with a focus on Linux and not supporting Capsicum. A service jail seems like a good enough solution to deal with such applications. Note that it is indeed orthogonal to chrooting.

I am strongly in favor of this change. As Lexi mentioned before, users prefer to run all services in jails. I also wanted to do the same for my BGP routers and run OpenBGPd inside a jail. However, there was no permission to edit the main system routing table. There are many programs that only need to edit the routing table and nothing else (interfaces, their IP addresses, ...), like OpenBGPd and many other routing daemons.
I would like to see this permission implemented in FreeBSD.

ideally everything would be capsicumised, but i think that's also largely orthogonal to this change. as we saw with the bhyve vulnerability, it's still possible to break out of a capsicum sandbox, in which case jails add another layer of protection. and, capsicum doesn't do resource management (since that's not its job).

as we saw with the bhyve vulnerability, it's still possible to break out of a capsicum sandbox

I support changes like this one, but do note that the kinds of kernel vulnerabilities used to escape a Capsicum sandbox (like mishandled reference counts) generally will be equally usable for escaping a jail.

I support changes like this one, but do note that the kinds of kernel vulnerabilities used to escape a Capsicum sandbox (like mishandled reference counts) generally will be equally usable for escaping a jail.

oh, i was actually thinking of https://www.freebsd.org/security/advisories/FreeBSD-SA-24:16.libnv.asc rather than the bhyve vulnerability itself (my understanding is both of those were chained to get full host access).

In D49843#1136966, @lexi_le-fay.org wrote:
In D49843#1136964, @bz wrote:

Let me also ask a different question? What does the jail buy here over a simple chroot (apart from the current system integration of jails)?
In a chroot all of Marko's concerns about managing interfaces etc are basically sorted as well.

an svcj is not actually a chroot; they all run with path=/. adding more filesystem restrictions is something i think could be improved in future, but that's a different topic entirely. for now, chroot and svcjs are entirely orthogonal.

i am approaching this more from the position that, at least in theory, it should be possible to run all services in a service jail unless there's a clear reason why it doesn't make sense to do that. why? because this provides a unified mechanism to restrict system access and resources for any service, on the jail level. compare it to how Linux uses cgroups to run services; on Linux, it's trivial to set resource limits for any service by configuring its cgroup. with jails, we can easily do that for any jail (using rctl) and at the same time, we benefit from the additional security measures of jails (again, similar to how Linux uses namespace to restrict services, but in FreeBSD, security and resource restrictions are both provided by jails).

But if we keep poking holes in the jail boundary to accommodate more and more services, it becomes very hard to make any claims about the security properties of jailing a particular service. At some point we're treating jails as resource containers just because rctl makes that convenient, but there's a marketing problem there.

I don't see any particular problems with this change and don't mean to object to it, though it does feel weird that a jailed process can modify the system routing tables. But as we add more and more escape hatches it becomes impossible to reason about the security benefits of jailing a privileged process (and to be clear, this is already a problem).

in future, as jails gain more capabilities, those capabilities will be automatically available to any service via service jails. isn't this, at least in part, why svcj was added in the first place?

The svcj documentation in rc.conf.5 doesn't say anything about why one might want to run a service in a service jail, and what benefits that confers. I think that's a bug, especially given that the feature uses the term "jail" and not "container", and the former has specific connotations relating to security, at least in FreeBSD. And frankly I'm not sure what added security is obtained from having a privileged daemon run in a jail with path=/.

The svcj documentation in rc.conf.5 doesn't say anything about why one might want to run a service in a service jail, and what benefits that confers. I think that's a bug, especially given that the feature uses the term "jail" and not "container", and the former has specific connotations relating to security, at least in FreeBSD. And frankly I'm not sure what added security is obtained from having a privileged daemon run in a jail with path=/.

https://docs.freebsd.org/en/books/handbook/jails/#service-jails

But if we keep poking holes in the jail boundary to accommodate more and more services, it becomes very hard to make any claims about the security properties of jailing a particular service. At some point we're treating jails as resource containers just because rctl makes that convenient, but there's a marketing problem there.

i am sympathetic to this concern; since jails were originally introduced there's been an assumption that if a process is in a jail, it can do X but it can't do Y, and changing that makes it cognitively more difficult to understand what "this process is in a jail" actually means. we have already violated that somewhat with permissions like adjtime/settime, which are semantically similar to this new routing permission, i.e. they allow the jail to modify something which is usually the concern of the host system.

from a purely technical point of view, i would like to see jails become flexible enough that you can configure a jail however you like, including creating the type of "null jails" that have been floated before. this opens the door to doing a lot of interesting things, using the existing jail framework rather than implementing a copy of Solaris process contracts or Linux cgroups. and again, from a technical point of view, i don't see any reasonable objection to this.

in terms of this specific change, i think running a routing daemon in a jail (using svcj or otherwise) provides a significant security benefit because they process a lot of untrusted network data, and that alone is sufficient justification for *this specific* privilege being exposed to jails. as i said earlier, i don't intend to immediately go and add a jail flag for every existing privilege.

so my position would be that i think this change is reasonable on its own merits, but we should also think about whether we want to change (or at least clarify) terminology here going forward.

The svcj documentation in rc.conf.5 doesn't say anything about why one might want to run a service in a service jail, and what benefits that confers. I think that's a bug, especially given that the feature uses the term "jail" and not "container", and the former has specific connotations relating to security, at least in FreeBSD. And frankly I'm not sure what added security is obtained from having a privileged daemon run in a jail with path=/.

i have been mulling over how we can add more restrictions to svcj. i don't think "just use nullfs" is the answer here because that makes everything more complicated, but i don't yet have another proposal. i don't think this is impossible to fix in principle though.

In D49843#1137302, @lexi_le-fay.org wrote:

i have been mulling over how we can add more restrictions to svcj. i don't think "just use nullfs" is the answer here because that makes everything more complicated, but i don't yet have another proposal. i don't think this is impossible to fix in principle though.

I assume you refer to the path=/ part here. There is no generic way of determining what files a given service needs in a lightweight way so that it is simply xxx_svcj=yes. As soon as you specify a list of files which the framework puts into its own subtree, it is not lightweight anymore and you are better off manually jailing the service. If you provide an alternate path, you did some manual work beforehand to provide a subtree, so you are no longer lightweight in terms of the svcj design of "just do xxx_svcj=yes", and you are again IMO better off using a non-svcj jail. The whole idea of service jails comes from path=/ (it arose while doing the tech review of MWL's jail book, which has a little section at the end titled "Jails as Control Groups"; SVCJs are not meant as cgroups, but can be used like that). A nice addition to service jails would be some rctl stuff (via xxx_svcj_rctl maybe).

I assume you refer to the path=/ part here. There is no generic way of determining what files a given service needs in a lightweight way so that it is simply xxx_svcj=yes.

i think we risk getting off topic here, but what i'm thinking about is services that only need to access a relatively small number of well-defined path names. for example, many services only need to write their pidfile. a database may need to write a pidfile, a data directory and a UNIX socket. this is stuff that can easily be configured in the rc(8) script itself and then used by rc.subr to configure the jail appropriately.

As soon as you specify a list of files which the framework puts into its own subtree, it is not lightweight anymore

i don't think svcj should be creating nullfs subtrees for services, that is definitely not in the spirit of the feature. i have some other (somewhat vague) ideas of how we can do this in a better way. but this diff is not the right place to mention those :-)

In D49843#1137302, @lexi_le-fay.org wrote:

from a purely technical point of view, i would like to see jails become flexible enough that you can configure a jail however you like, including creating the type of "null jails" that have been floated before. this opens the door to doing a lot of interesting things, using the existing jail framework rather than implementing a copy of Solaris process contracts or Linux cgroups. and again, from a technical point of view, i don't see any reasonable objection to this.

I don't have any objection. I just want to observe that naming is important (and hard). :)

in terms of this specific change, i think running a routing daemon in a jail (using svcj or otherwise) provides a significant security benefit because they process a lot of untrusted network data, and that alone is sufficient justification for *this specific* privilege being exposed to jails. as i said earlier, i don't intend to immediately go and add a jail flag for every existing privilege.

What exactly is the security benefit? The routing daemon is still running as root (since we have no way for a process to drop privileges in a fine-grained manner, so that one retains only PRIV_NET_ROUTE, say) and has full access to the filesystem. Is there a threat model where being jailed makes a significant difference?

I don't have any objection. I just want to observe that naming is important (and hard). :)

i think what i meant to say here is something like, this is a UX/UI problem rather than a technical problem. or in other words i was agreeing with you :-)

What exactly is the security benefit? The routing daemon is still running as root (since we have no way for a process to drop privileges in a fine-grained manner, so that one retains only PRIV_NET_ROUTE, say) and has full access to the filesystem. Is there a threat model where being jailed makes a significant difference?

well, although i haven't tested this, with this change it should also be possible to run the routing daemon in a normal (non-svcj, non-vnet) jail, in which case the path=/ issue isn't a concern. note that while it also currently works to run a routing daemon in a vnet jail (i have tested this extensively), that doesn't achieve the required result, since it modifies the vnet's routing table instead of the host's. that's fine if you actually want to modify the vnet's routing table, but the new functionality here is that you can jail the routing daemon and still modify the host routing table.

I've been running routing daemons in VNET jails with chroot=/ myself for more than two decades, so have nothing against chroot=/ jails per se. But this is pushing in the opposite direction, having routing daemons running in pseudo-jails with very weak isolation.

We still don't have a clear answer on what exactly is the security benefit of running routing daemons in this "shields-down" form of jail, which would justify making the jail contract less clear and weaker than ever before.

In D49843#1138101, @zec wrote:

We still don't have a clear answer on what exactly is the security benefit of running routing daemons in this "shields-down" form of jail

this diff is not specific to svcjs, which is why i split the change into two commits. consider someone who wants to run BIRD in a 'real' non-path=/ jail for security reasons, but also wants it to modify the host routing table to avoid having to move all their routing into the jail. they cannot use vnet jails because of the second restriction, and they cannot use a non-vnet jail because BIRD in a non-vnet jail can't modify the routing table.

you can *also* use this diff to run BIRD in a svcj, and yes, that was my original motivation for this change, but that's not the only use case.

Repeating that someone might have his mind set on running BIRD in a swiss-cheese-jail is far from providing arguments for what real security benefit this would provide compared to running it in a plain system (or in a chrooted tree).

Consider an exploit in BIRD which would allow routing tables to be manipulated so that only the attacker would retain connectivity to the compromised host, while to the others the whole system would appear to be dead. What exactly does running BIRD in a service jail bring us, compared to running it in the base system?

If this patch hits the tree, which I'm strongly opposed to, at minimum the commit message should clearly state that this is an open-ended redefinition of the jail contract, for a cause no one is able to articulate clearly.

But we seem to be running in circles here, so I give up.

In D49843#1138751, @zec wrote:

Repeating that someone might have his mind set on running BIRD in a swiss-cheese-jail is far from providing arguments for what real security benefit this would provide compared to running it in a plain system (or in a chrooted tree).

Consider an exploit in BIRD which would allow routing tables to be manipulated so that only the attacker would retain connectivity to the compromised host, while to the others the whole system would appear to be dead. What exactly does running BIRD in a service jail bring us, compared to running it in the base system?

It would prevent the manipulation of shared memory segments of other jails, assuming the jail with BIRD was not additionally configured to have full access to shared memory resources. So if someone has a jail host with bird and postgres, the bird instance in the non-jailed case (no matter if chrooted or not) would be able to modify the postgres shared memory segments, whereas an instance of bird in a jail would not be able to manipulate the shared memory of postgres. A service jail instance of bird would be able to manipulate the filesystem data, so it would be possible to compromise the system, but it would still not be able to silently manipulate the shared memory segment of postgres. Depending on what an attacker may want to do, this is - or is not - a hurdle. If bird were run chrooted in a service jail, the last phrase can be scratched.

Other things a jailed bird is not able to do are modifying file flags (chflags), mounting filesystems, reading the msgbuf and seeing processes of other jails. Yes, it would be able to manipulate the routing tables, and yes, this is not a small capability for an attacker to gain, but jailing prevents some access the attacker would gain otherwise. It will not prevent issues, but it can limit the potential issues.

As the author of service jails: the benefit of allow.routing to service jails (if not chrooted) is small (it may slow down an attacker in various ways, depending on the motives). As we have chroot support in the rc framework, providing the possibility to use it for service jails is something we should not deny (IMO). The benefits of allow.routing for a properly jailed non-vnet routing daemon are bigger: it adds a layer to the security onion that I consider worth having, by moving the routing daemon into a jail when more than just the routing feature shall reside on the system in question (SOHO, homelab, training facilities, ...).
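
The comparison in this comment can be restated as a small Python sketch. It only encodes the claims made in the comment, with default jail settings assumed; the action labels and the `jailed_root_actions` and `denied_by_jail` functions are illustrative names, not kernel identifiers, and the list of actions is not exhaustive.

```python
# Actions named in the comment, as available to unjailed root on the host.
HOST_ROOT_ACTIONS = frozenset({
    "modify host routing table",
    "modify other jails' shared memory",
    "change file flags (chflags)",
    "mount filesystems",
    "read msgbuf",
    "see other jails' processes",
    "modify host filesystem data",
})


def jailed_root_actions(allow_routing: bool, path_is_root: bool) -> frozenset:
    """Actions left to a compromised jailed root, per the comment's claims.

    path_is_root=True models a service jail (path=/) that is not chrooted.
    """
    actions = set()
    if allow_routing:
        actions.add("modify host routing table")
    if path_is_root:
        actions.add("modify host filesystem data")
    return frozenset(actions)


def denied_by_jail(allow_routing: bool, path_is_root: bool) -> frozenset:
    """What jailing takes away compared with running on the host."""
    return HOST_ROOT_ACTIONS - jailed_root_actions(allow_routing, path_is_root)
```

Under these assumptions, a service jail with allow.routing still loses five of the seven listed actions, and a jail with its own path loses six.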

In D49843#1138751, @zec wrote:

Consider an exploit in BIRD which would allow routing tables to be manipulated

consider an exploit in BIRD which would allow an attacker to run code as root.

running BIRD in a jail -> only the jail is compromised.

running BIRD on the host -> the entire host is compromised.

clearly, running BIRD in a jail has an advantage, no?

note, i am talking about 'real' jails here, not svcj.

and i certainly shall not "clearly state that this is an open-ended redefinition of the jail contract, for a cause no one is able to articulate clearly" because

a) this is not open-ended, as i previously explained
b) this is not a redefinition of the jail contract, unless you think allowing jails to run clock_settime is also a redefinition of the jail contract
c) i have clearly articulated the benefit of this change several times, you are simply ignoring me.

before: you cannot run BIRD in a jail.

after: you can run BIRD in a jail.

i don't know how much more clear i can be about this.

(yes, i am aware you can already run BIRD in a vnet jail, but that requires you move your entire routing stack into the jail.)

i am happy to have a reasonable conversation about this, as i have with other reviewers, but you seem determined to oppose this for no apparent reason.

In D49843#1138763, @ivy wrote:
In D49843#1138751, @zec wrote:

Consider an exploit in BIRD which would allow routing tables to be manipulated

consider an exploit in BIRD which would allow an attacker to run code as root.

running BIRD in a jail -> only the jail is compromised.

No, the host is gone as well, since the attacker has control over network connectivity.

In D49843#1138771, @zec wrote:
In D49843#1138763, @ivy wrote:
In D49843#1138751, @zec wrote:

Consider an exploit in BIRD which would allow routing tables to be manipulated

consider an exploit in BIRD which would allow an attacker to run code as root.

running BIRD in a jail -> only the jail is compromised.

No, the host is gone as well, since the attacker has control over network connectivity.

You go to the keyboard of the host, delete the jail, and the attacker is gone (assuming only BIRD was compromised and no transversal credentials were obtained). Everything else on this host is OK (assuming MITM protection in network connections). Yes, getting hold of a routing daemon is bad. No doubt. But the host can be protected while the routing daemon wasn't. The surrounding hosts may not be lost, if there is sufficient MITM protection (e.g. only trusted/validated TLS connections). And if the surrounding systems are jails on this particular host where the bird-jail was, the other jails may not be lost either, only the bird-jail.

In D49843#1138771, @zec wrote:

...

No, the host is gone as well, since the attacker has control over network connectivity.

You go to the keyboard of the host, delete the jail, and the attacker is gone.

Sounds pretty much like a very deep redefinition of the jail contract to me.

In D49843#1138751, @zec wrote:

If this patch hits the tree, which I'm strongly opposed to, at minimum the commit message should clearly state that this is an open-ended redefinition of the jail contract, for a cause no one is able to articulate clearly.

That caveat is present for every single allow.* parameter. Indeed, it is a redefinition of jails that began in 2008 when administrators were allowed to pick and choose which jail features they wanted for their particular uses. Even before that, it was always permissible to create a jail rooted at "/".

It's Swiss cheese all the way down.

In D49843#1138774, @zec wrote:
In D49843#1138771, @zec wrote:

...

No, the host is gone as well, since the attacker has control over network connectivity.

You go to the keyboard of the host, delete the jail, and the attacker is gone.

Sounds pretty much like a very deep redefinition of the jail contract to me.

Apart from jamie's comment, how is this different from any other daemon running as root in a jail? If you can take over any externally reachable service in a jail, it is over. You need to delete the jail and start fresh (with a fixed daemon).

In D49843#1138774, @zec wrote:
In D49843#1138771, @zec wrote:

...

No, the host is gone as well, since the attacker has control over network connectivity.

You go to the keyboard of the host, delete the jail, and the attacker is gone.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Sounds pretty much like a very deep redefinition of the jail contract to me.

Apart from jamie's comment, how is this different from any other daemon running as root in a jail? If you can take over any externally reachable service in a jail, it is over. You need to delete the jail and start fresh (with a fixed daemon).

For 25 years, whatever happened in a jail stayed in a jail, and could not cripple the host's network connectivity. Now you're throwing this contract away, because you can. Go ahead.

I have .. feelings about this, having spent my time in the routing service trenches and jail trenches too. I mean, heck, we can't even jail the wifi services just yet! :-)

I'm ok with adding the feature. I looked at what control flags/features jail has grown. I understand why. There's some intersection between ye olde full jail support and some more fine grained jail stuff for services that don't yet understand/implement capabilities.

I'm a bit sad though that bird seems to have gone back into one big daemon rather than a bunch of services that talk to the control service that has the privileges.
Heck, the website even mentions they support Linux capabilities (CAP_NET_*) but not the BSD ones! Maybe once this has landed we can figure out what the gaps are for adding FreeBSD capability support to bird.

In D49843#1139118, @zec wrote:

For 25 years, whatever happened in a jail stayed in a jail, and could not cripple the host's network connectivity.

Why is ip4=inherit ok? Jailed sockets get priority over wildcard sockets on the host, so any socket bound to INADDR_ANY on the host can be overridden by the jail. Certainly there's lots of potential to cripple the host's network connectivity there.

@zec, I'm having a hard time understanding your argument against this change. First of all, what makes the routing table so special? Second, do you not realize that this only adds a per-jail knob that defaults to off? Third, if the routing table is so precious, is that not all the more reason to want to be able to isolate the process in charge of maintaining it? The current situation is that a routing daemon must run as uid 0 in jid 0 and thereby have complete access to the entire system, including all jailed processes. @ivy's change allows us to run the routing daemon with a non-zero jid, so a compromised routing daemon (which can happen, and has in the past) can still screw up the routing table, but can no longer screw up the rest of the system. What makes this security posture acceptable (and desirable) for every other resource in the system but not for the routing table?

This revision is now accepted and ready to land. May 10 2025, 8:24 PM
kevans added inline comments.
usr.sbin/jail/jail.8
713

*channels his inner @bcr* Don't forget to bump .Dd pre-commit.

This revision was automatically updated to reflect the committed changes.