Page MenuHomeFreeBSD

vmm: Allow the use of PCI passthrough in a jail
AcceptedPublic

Authored by markj on Mon, Feb 2, 11:00 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Feb 3, 10:24 PM
Unknown Object (File)
Tue, Feb 3, 7:21 PM
Unknown Object (File)
Tue, Feb 3, 4:17 PM
Unknown Object (File)
Tue, Feb 3, 8:06 AM
Unknown Object (File)
Tue, Feb 3, 3:24 AM
Unknown Object (File)
Mon, Feb 2, 11:45 PM

Details

Reviewers
corvink
bnovkov
Group Reviewers
bhyve
Jails
Summary

After commit e11768e94787 ("vmm: Add PRIV_DRIVER checks for passthru
ioctls"), it is not possible to use PCI passthru from jails, as
PRIV_DRIVER is not granted to jails. Apparently some users expect this
to work, understanding that jailing bhyve provides little security
benefit in this configuration.

I believe we should disable ppt access in jails even when allow.vmm is
configured. To provide an escape hatch for users, add a new
allow.vmm_ppt jail configuration knob, and check it when handling ppt
ioctls in jails. Also add a new PRIV_VMM_PPTDEV to replace the use of
PRIV_DRIVER.

PR: 292750

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 70355
Build 67238: arc lint + arc unit

Event Timeline

markj requested review of this revision.Mon, Feb 2, 11:00 PM
This revision is now accepted and ready to land.Wed, Feb 4, 7:29 AM
bz added subscribers: jamie, bz.

My historic understanding was that we would have all the pr_* options/check inside kern_jail.c and have accessor functions for them passing td or cred.
I do not know how this evolved over the years (and I do see prison_add_allow()) for other flags so adding #jail to the review list in case @jamie wants to comment on that.

If the status quo changed over the years, just ignore my comment (but it highly worries me if all the loop-holes are starting to be splattered all over the code).
Jails start to feel like someone standing with a corkscrew next to a Swiss cheese.
I wonder when people will/if people should start to use chroot again for system separating when all the security checks are not needed instead of us putting more holes into jails.

In D55066#1259408, @bz wrote:

My historic understanding was that we would have all the pr_* options/check inside kern_jail.c and have accessor functions for them passing td or cred.
I do not know how this evolved over the years (and I do see prison_add_allow()) for other flags so adding #jail to the review list in case @jamie wants to comment on that.

Thanks. I meant to add Jails to the review but apparently failed.

If the status quo changed over the years, just ignore my comment (but it highly worries me if all the loop-holes are starting to be splattered all over the code).
Jails start to feel like someone standing with a corkscrew next to a Swiss cheese.
I wonder when people will/if people should start to use chroot again for system separating when all the security checks are not needed instead of us putting more holes into jails.

Well, the person who reported the problem also noted their motivation for using jails was to take advantage of a separate VNET. So having the ability to configure the VNET of a process, independent of using a jail, would also solve the problem.

I tend to agree that we have too many knobs which weaken the security properties of jails. I'll point out that pre-15.0 it was possible to invoke these ioctls from within a jail that has allow.vmm configured.

In D55066#1259408, @bz wrote:

My historic understanding was that we would have all the pr_* options/check inside kern_jail.c and have accessor functions for them passing td or cred.

Yes, this would be better, but I followed the pattern of the existing allow.vmm knob. I suspect it should be defined in kern_jail.c, not in vmm.ko, so that prison_priv_check() can access it. If there is some consensus on this I'll make that change.

In D55066#1259408, @bz wrote:

My historic understanding was that we would have all the pr_* options/check inside kern_jail.c and have accessor functions for them passing td or cred.

Yes, this would be better, but I followed the pattern of the existing allow.vmm knob. I suspect it should be defined in kern_jail.c, not in vmm.ko, so that prison_priv_check() can access it. If there is some consensus on this I'll make that change.

This is the first use of prison_add_allow outside of kern_jail's own use in adding allow.mount.* for arbitrary filesystems. It's good for any sort of dynamic module permission, so that fits here. But it's still a weird hybrid feel, with a dynamic jail flag on the one hand, and then a hard-coded priv flag on the other. I guess that's just how priv works.

It does seem better to make the prison flag check in prison_priv_check(), but that doesn't allow for a dynamic flag. To put the check there would require adding a static prison.allow parameter for a dynamic module, something I'd rather avoid. So the current method of separating the pr_allow bit and priv_check may be the better option. Or it might be good to formalize that with something like prison_priv_check_allow(cred, priv, flag).