In D49395#1126370, @kib wrote: MNT_NOEXEC was never a barrier to prevent code execution from the files on the mount. It only prevented image activators from parsing files on the volume and using them at execve(2) time.
Then some features like security theater started to come in. For instance, I added a check that mmap(PROT_EXEC, fd) does not trivially avoid MNT_NOEXEC protection, see ecc6c515aba41a24668ddd.
On the other hand, we only check MNT_NOEXEC in stock rtld when loading libraries into a setuid process. I believe that rtld can load a dso from a posix shmfd, or can even execute a posix shmfd in direct mode. There, the check for MNT_NOEXEC is simply not applicable.
So, if you consider a sealed system where all writeable mounts have the NOEXEC option set, I do not think that either a patched or a stock system would prevent arbitrary code execution by a non-privileged local user:
- a dso or binary can be copied to a posix shmfd and the code there executed using direct mode or LD_PRELOAD_FDS
- a dso can be created that does not contain executable segments, but whose DT_INIT/DT_INITARRAY/DT_PREINITARRAY (constructor) or destructor pointers are relocated in a way similar to what you described as an 'attack' on the main binary entry point
- non-plt ifunc relocations can be used in a similar way
- ... I stopped trying to enumerate more ways to break the sealing after I realized that ifuncs are there.
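The constructor mechanism named in the list can be seen in a few lines of C. This is a minimal sketch, assuming GCC or Clang (the `__attribute__((constructor))` extension); the names `ctor_ran`, `early_init`, and `constructor_ran` are hypothetical. It only illustrates that the runtime calls such a function through a stored pointer before main() runs, so a check made at execve(2) time is not involved in that call.

```c
#include <assert.h>

/* Hypothetical flag: set by the constructor before main() starts. */
static int ctor_ran = 0;

/*
 * GCC/Clang extension. The toolchain records a pointer to this function
 * (on ELF systems typically in the init array) and the runtime calls it
 * while the object is being initialized, before main().
 */
__attribute__((constructor))
static void
early_init(void)
{
	ctor_ran = 1;
}

/* Returns 1 if the constructor has already run, 0 otherwise. */
int
constructor_ran(void)
{
	return (ctor_ran);
}
```

Calling constructor_ran() as the first statement of main() is enough to observe the ordering.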
Mar 18 2025
Nov 7 2023
Would it be wise or a good idea to integrate Capsicum?
Jun 1 2022
In D35349#801990, @kib wrote: In D35349#801982, @mindal_semihalf.com wrote: Sorry for the radio silence. I discovered that this patch, in its current form, breaks Linuxulator VDSO clock routines.
Basically the problem is that the Linux VDSO glue code needs to read vdso_timekeep, that is stored in the shared page.
I have to figure out a fix for this first, before proceeding with this.
Once I have something I'll either open up a new phabricator revision, or update this one.
It is probably between too hard and impossible, due to the linux vdso being written in C. It requires PIC code, and it almost always requires a GOT and performing relocations before the code can work.
IMO it is enough to exclude linux ABI from shared page randomization.
Nov 19 2021
In D33044#746877, @glebius wrote: Also, libwrap is quite seasoned open source software that doesn't see any updates. I would estimate the chances of a new vulnerability as very low.
Oct 18 2021
In D27666#734443, @mw wrote: Hi,
I'm refreshing the discussion. The current status is as follows:
- Fixes for the outstanding bugs ntpd (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=253208) and Firefox/Thunderbird (https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=239873) landed on Friday. Hopefully this will cover all cases that might have remained unknown until now.
Jan 21 2019
From glancing at this patch, there are a few security-related code issues that need to be worked out.
Jan 7 2019
Jan 4 2019
In D18423#392118, @kib wrote: In D18423#392116, @emaste wrote: Presumably we should build INTERNALLIBs and PRIVATELIBs as both libfoo.a and libfoo_pic.a (or some similar scheme) and choose the appropriate one when linking the binary. I don't know how far down that path we want to go though.
Yes, I argued that this is the only correct way. We must not build normal libXXX.a with PIC, but we do need libXXX_pic.a. Also, formally architectures can have different -fPIC and -fPIE ABIs, so perhaps we really need libXXX_pie.a, and e.g. libc_pic.a and libc_pie.a simultaneously.
Dec 15 2018
In D18574#395972, @rgrimes wrote: In D18574#395968, @cem wrote: In D18574#395967, @rgrimes wrote: I do not like changing defaults without very good reasons, what reasons are there for making this change?
If you have not yet done so, please read the summary included in the review. If you have read it and still have this question, it could probably stand to be spelled out more clearly.
I have read it, I also run many vm's in bhyve, I use tap devices, I reboot my vm's now and then and have no issues. So AGAIN what reasons are there for making these changes and spell it out.
Nov 27 2018
LGTM
In D18319#389678, @jamie wrote: Sorry for a post-acceptance note, but on trying it out I noticed that jails are created by default with allow.nounprivileged_proc_debug. That's an easy fix - the bit needs to be added to PW_DEFAULT_ALLOW in kern_jail.h. I'm apparently unable to change the diff in this revision, so instead of creating a new revision I'll just mention that's what I'll be committing.
With that change, things work as expected: a new jail has user-level debugging if its parent does, and doesn't if its parent doesn't. Once the jail is created, the sysctl will change the behavior only inside that jail.
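The inheritance behaviour described here can be modelled in user space. This is a toy sketch, not the kern_jail.c code: `struct toy_prison`, `TOY_ALLOW_UNPRIV_DEBUG`, and the `toy_*` functions are hypothetical names, and it deliberately does not model how later changes in a parent jail affect its children.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit value standing in for the real allow flag. */
#define TOY_ALLOW_UNPRIV_DEBUG	0x1u

struct toy_prison {
	struct toy_prison *parent;	/* NULL for the host */
	unsigned allow;			/* allow.* bits of this jail */
};

/* A new jail starts with its parent's setting for the debug bit. */
void
toy_prison_create(struct toy_prison *pr, struct toy_prison *parent)
{
	pr->parent = parent;
	pr->allow = parent->allow & TOY_ALLOW_UNPRIV_DEBUG;
}

/* Models the per-jail sysctl: only this jail's own bit changes. */
void
toy_set_unpriv_debug(struct toy_prison *pr, int on)
{
	if (on)
		pr->allow |= TOY_ALLOW_UNPRIV_DEBUG;
	else
		pr->allow &= ~TOY_ALLOW_UNPRIV_DEBUG;
}

/* Returns 1 if unprivileged debugging is allowed in this jail. */
int
toy_unpriv_debug_allowed(const struct toy_prison *pr)
{
	return ((pr->allow & TOY_ALLOW_UNPRIV_DEBUG) != 0);
}
```

In this model, turning the bit off inside a jail after creation leaves the parent's setting untouched, which matches the "only inside that jail" wording above.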
Nov 25 2018
Note that I'm not a committer, so I can't commit the patch. ;)
Nov 24 2018
In D18319#389013, @jamie wrote: In D18319#388858, @lattera-gmail.com wrote: Because of this, sys/kern/kern_priv.c is the right place for the check. There's no real need to duplicate the logic in prison_priv_check. I can still add it, if you want, but I believe it would be a waste of cycles.
Good point. No need to do it twice, since prison_priv_check() only exists to be called by priv_check_cred().
Update jail.8 manpage to document allow.unprivileged_proc_debug
In D18319#388857, @lattera-gmail.com wrote: In D18319#388815, @jamie wrote: priv_check_cred() in kern_priv.c isn't the right place to make the check; prison_priv_check() in kern_jail.c is. PRIV_DEBUG_UNPRIV is already in that function's list, in the part that lets jails do things, and it needs to be moved to the bottom part of the function where you'll see a number of other cases where a certain privilege checks a certain pr_allow bit.
It appears it needs to be in both places, since the very top of prison_priv_check has this:
if (!jailed(ucred)) return (0);
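The effect of that early return can be shown with a small user-space model. This is a hedged sketch and not the kernel code: `struct toy_cred` and the `toy_*` functions are hypothetical, and the single `allow_unpriv_debug` field stands in for the allow bit of whichever prison the credential belongs to (the host included).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a credential and its prison's allow bit. */
struct toy_cred {
	bool jailed;			/* false for a host process */
	bool allow_unpriv_debug;	/* allow bit of the cred's prison */
};

/*
 * Models a check placed only in the prison-level function: unjailed
 * credentials return 0 before the allow bit is ever looked at.
 */
int
toy_prison_priv_check(const struct toy_cred *cr)
{
	if (!cr->jailed)
		return (0);
	return (cr->allow_unpriv_debug ? 0 : EPERM);
}

/*
 * Models a check placed in the top-level function: it applies to every
 * credential, jailed or not, so repeating it below is unnecessary.
 */
int
toy_priv_check_cred(const struct toy_cred *cr)
{
	if (!cr->allow_unpriv_debug)
		return (EPERM);
	return (0);
}
```

With the bit cleared on the host, the prison-level model still returns 0 for a host process, while the top-level model returns EPERM.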
Implement Jamie Gritton's suggestions. Use the priv_check API for checking the underlying debug privilege. Make brief that which is verbose.
In D18319#388774, @jamie wrote: OK, if the jail needs to have that bit set before anything is run, then yes it needs to be a parameter.
As far as its semantic fit with allow.*, I wonder if there's even a case for the jailed root changing this permission. What are the ramifications of leaving CTLFLAG_PRISON off of security.bsd.unprivileged_proc_debug?
Nov 23 2018
In D18319#388761, @jamie wrote: Since this bit is under the full control of the prison itself, does it belong in pr_allow? On the plus side, that lets the system create a jail with this turned on, but that can be just as easily done in the jail's sysctl.conf.
Nov 17 2018
Sep 7 2018
Sep 4 2018
In D16822#361197, @jhb wrote: This looks right to me now. I'll try to test it locally in the next day or so.
Aug 28 2018
Minimize diff with suggestions by jhb.
In D16822#361129, @kib wrote: The only useful feature of the phabricator is to easily see context around the patch, which you successfully botched.
Reflect changes requested by both kib and jhb.
Missed a spot. Cover another mapping with MAP_GUARD.
Update the patch to use John Baldwin's suggestion on mapping the entire range first with MAP_GUARD.
In D16822#361106, @jhb wrote: Adding @kib since he added MAP_GUARD. I think you should instead MAP_GUARD the entire range first and then remap the middle. The middle is already remapped on line 518, so instead of adding new mmap()'s just for the guards, you should replace the mmap() on line 496 with this:
base = mmap(NULL, len2, PROT_NONE, MAP_GUARD | MAP_ALIGNED_SUPER, -1, 0);
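The "guard the whole range, then remap the middle" pattern can be sketched as a standalone function. This is an illustration under assumptions, not the code under review: `reserve_with_guards` and its parameters are hypothetical, MAP_ALIGNED_SUPER is left out, one guard page is placed on each side, and on systems without MAP_GUARD a PROT_NONE anonymous mapping is used as a stand-in.

```c
#define _DEFAULT_SOURCE		/* MAP_ANON on glibc under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Reserve len usable bytes (a non-zero multiple of the page size) with
 * one inaccessible page before and after. Returns the usable middle, or
 * NULL on failure. *base_out and *total_out describe the whole range so
 * the caller can munmap() it later.
 */
void *
reserve_with_guards(size_t len, void **base_out, size_t *total_out)
{
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t total;
	void *base, *mid;

	if (len == 0 || len % page != 0)
		return (NULL);
	total = len + 2 * page;

	/* Step 1: cover the entire range with a guard. */
#ifdef MAP_GUARD
	base = mmap(NULL, total, PROT_NONE, MAP_GUARD, -1, 0);
#else
	/* Assumption: no MAP_GUARD here, so PROT_NONE stands in for it. */
	base = mmap(NULL, total, PROT_NONE, MAP_PRIVATE | MAP_ANON, -1, 0);
#endif
	if (base == MAP_FAILED)
		return (NULL);

	/* Step 2: remap only the middle as usable memory. */
	mid = mmap((char *)base + page, len, PROT_READ | PROT_WRITE,
	    MAP_FIXED | MAP_PRIVATE | MAP_ANON, -1, 0);
	if (mid == MAP_FAILED) {
		munmap(base, total);
		return (NULL);
	}
	*base_out = base;
	*total_out = total;
	return (mid);
}
```

Because the guard is created first, the leading and trailing pages never need their own mmap() calls; whatever is not remapped in step 2 simply stays guarded.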
Aug 21 2018
Jul 30 2018
Put the allow.vmm documentation in the right place in the jail(8) manpage.
Address the superfluous conditional and add an entry into the jail(8) manpage.
In D16057#350287, @jamie wrote: One more thing to do: jail(8) should mention the flag. There's a section about module-specific flags where I think it would fit better than the main allow.* section.
Update the patch to take into account the new dynamic allow.* API.
Jul 19 2018
Jul 6 2018
In D16057#342445, @jamie wrote: I've added D16146, which makes a new allow.* bit easy:
flag = prison_add_allow(NULL, "foo", NULL, "Jailed user may do the foo thing");
Jul 5 2018
In D16057#342285, @jamie wrote: In addition to the question of where to check the permissions, there's also the issue that the allow.vmm parameter shouldn't exist in a non-VMM system. This means the SYSCTL_JAIL_PARAM should be defined in vmm_dev.c or some other vmm-related file; that way, if VMM is loaded as a module, the parameter would be attached to that module.
For an example of module-specific parameters, see compat/linux/linux_mib.c (and where the setup is called in linux_common.c). This is for the parameters linux.*, so it's not quite the same. But there's no actual requirement that your parameter be top-level, and could still be allow.vmm.
The problem is there's more code required for this than should be necessary for a simple allow bit. You don't need everything that linux_mib has, but you would need something to set up the parameter on module load. While I don't have support for dynamic allow.* parameters, I do have prison_add_vfs() in kern_jail.c, which is mostly for adding allow.mount.fsname parameters. I should change that a bit to allow for a more generic method of adding an allow.* flag. That would mean you have a dependency on me.
Jul 4 2018
Jun 29 2018
In D16057#340214, @araujo wrote: @lattera-gmail.com first of all thanks for the patch. I'm curious to know which guest OSes you have tested inside a jail; could you please share that with me? Also let me know what devices you used, for example virtio-blk, virtio-net, etc.
If you can share your tests, it would be perfect.
Thanks.
Rebase on FreeBSD's source code.
Whoops. I just realized this version of the patch is based off of HardenedBSD's src tree. I'll update it soon based on FreeBSD's.
May 7 2018
Closing this review since the issue has been addressed with a different commit.
In D14553#322971, @imp wrote: So there's already this at the end of the file:
MODULE_VERSION(geom_eli, 0);
from r332387.
May 6 2018
It has been quite a few years since I originally wrote this code. I might be able to carve out some time these next two weeks to refactor it. There are parts of this patch that feel a bit awkward and could likely be improved upon.
Friendly ping. :)
Mar 12 2018
Mar 1 2018
Use G_ELI_VERSION from g_eli.h as the module version.
Jan 31 2018
I'm curious: why disable IBPB for userland?
Jan 13 2018
Got a panic, potentially related to bhyve. I've posted the core txt here: https://gist.github.com/f9933cd2397217d6acb83fb1ec1f41e7
This comment brought to you by my HardenedBSD laptop running with your PTI patch and the retpoline patch from llvm. I'm happy to report that everything is working fine, albeit with a noticeable lag in some cases. I have multiple bhyve VMs running in parallel, including Win10. Thank you for fixing that! I did have to rebuild the nvidia-driver port, but that's to be expected. It's probably safe to assume that a good portion of third-party kernel modules will need to be rebuilt.
Jan 11 2018
The OverDrive 1000 booted fine. I don't have any arm64 PoC for Spectre or Meltdown that works on FreeBSD right now.
In D13812#290701, @emaste wrote: Do you have any results yet? This change seems pretty straightforward so I'd suggest @andrew just commits it.
Jan 10 2018
In D13812#289964, @andrew wrote: Should we create a tunable to disable this?
It would also be useful if someone with an A57, e.g. a SoftIron, could test & benchmark this. I have one, but haven't had time to update it to a recent enough kernel.
My FreeBSD bhyve VM with the patch applied can boot with vm.pmap.pti set to 0. However, even with it set to 0, I get weird runtime errors. Like this when running make -sj6 buildkernel:
Jan 9 2018
Attempting to run with vm.pmap.pti=0 set results in this kernel panic in early boot: https://photos.app.goo.gl/bm29U8ChZDAmnF0B3
I generated an installer image (memstick.img) with the latest PTI patch to test a fresh installation of vanilla FreeBSD with the PTI patch applied. Extracting the distsets failed.
I've now reproduced the issues on vanilla FreeBSD:
Jan 8 2018
With the patch applied and actively running on my system, I get "interesting" virtual memory behaviors. Like when running make buildkernel:
Applications randomly won't start, as well. Only one boot out of five did sshd start successfully.
Note that I don't currently have a FreeBSD system as all my systems run HardenedBSD, so the line numbers in the kernel panic backtrace below are with HardenedBSD 12-CURRENT/amd64 with this patch applied to commit 38fc2d5ddfadacba64a8d55932596a3008c8403f in hardened/current/master.
Oct 28 2017
Due to the follow-up conversation with badfilemagic, my last comment should be retracted as well.
The proposed patch would effectively disable all entropy gathering sources by default. Thus, systems would boot up without any entropy, save the cached entropy from last reboot. On freshly installed systems, there is no cached entropy. The state of the entropy subsystem would be subpar.
Aug 28 2017
In D12132#252178, @shurd wrote: In D12132#252164, @markm wrote: Do you mean not harvesting the packet contents?
In the updated patch, it continues to call random_harvest_queue(), but random_harvest_queue() has been modified to fall back to random_harvest_fast() on lock contention instead of spinning. If entropy is being collected at a high rate, random_harvest_queue() will be kept running most of the time, but instead of spinning to wait for the mutex, other threads will now fall back to random_harvest_fast() if the mutex is held when random_harvest_queue() is called.
With this change, the argument to even have a separate random_harvest_fast() is weaker since the spin on the lock is avoided for all harvesting.
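The try-once-then-fall-back behaviour described in this comment can be illustrated outside the kernel. This is a toy sketch with hypothetical names (`toy_harvest_queue`, `toy_harvest_fast`, `queue_lock`, and the counters), using a C11 atomic_flag as a stand-in for the harvest mutex; it is not the random(4) implementation.

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-in for the harvest mutex: set means "held". */
atomic_flag queue_lock = ATOMIC_FLAG_INIT;

unsigned queued_events;		/* only modified while queue_lock is held */
atomic_uint fast_events;	/* updated without taking the lock */

/* Lock-free path: here it only counts the event. */
static void
toy_harvest_fast(unsigned sample)
{
	(void)sample;
	atomic_fetch_add(&fast_events, 1);
}

/*
 * Try the lock exactly once. If another thread holds it, do not spin;
 * hand the sample to the fast path and return immediately.
 */
void
toy_harvest_queue(unsigned sample)
{
	/* test_and_set returns the previous state: true means held. */
	if (atomic_flag_test_and_set_explicit(&queue_lock,
	    memory_order_acquire)) {
		toy_harvest_fast(sample);
		return;
	}
	queued_events++;
	atomic_flag_clear_explicit(&queue_lock, memory_order_release);
}
```

Under contention every caller still makes progress: one thread does the queue work while the others take the fast path, which is the trade-off the comment describes.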
Aug 1 2017
Jul 21 2017
In D3848#242229, @pfg wrote: In D3848#242083, @jlh wrote: I think stack protection has already been disabled in the very low level stuff. This change is fairly non-intrusive and I think it is ready for further testing. Go ahead and commit please.
There are too many assumptions in the above statements. Oliver, have you tested this? I have only done very light testing so I don't want to assume responsibility.
Also, let me cc kib@ since this may end up affecting rtld.
Jun 8 2017
D10447, the patch this depends upon, needs to be updated to latest HEAD. I'd love to help test this patch out.
It looks like this patch doesn't apply cleanly to latest HEAD. Can this patch be updated to reflect the latest HEAD?
Apr 13 2017
Update to address Allan's comments.
Apr 12 2017
Mar 23 2017
In D10048#207924, @markm wrote: In D10048#207892, @lattera-gmail.com wrote: I can do a pkg exp-run with this patch on HardenedBSD's infrastructure tomorrow if desired.
Yes please!
Mar 19 2017
I can do a pkg exp-run with this patch on HardenedBSD's infrastructure tomorrow if desired.
Feb 8 2017
Patch tested successfully. Other than the typo @lifanov noted, the patch looks good to me.
I'll test this patch sometime within the next 24-72 hours. As it stands, the logic looks completely fine to me. Thanks for working on this!
Dec 10 2016
In D8290#181060, @lattera-gmail.com wrote: I've added the current revision of this patch to a feature branch in HardenedBSD. I'll test it out over the weekend.
Dec 9 2016
I've added the current revision of this patch to a feature branch in HardenedBSD. I'll test it out over the weekend.
In D8290#181049, @robak wrote: What about this, guys? How many more bhyve vulns are we going to wait for until this can be committed? ;)
Nov 28 2016
Patch tested successfully on my end.
Oct 20 2016
My comments are from simply glancing at the review, not a full review nor tested.
Oct 4 2016
In D5603#168860, @emaste wrote: How expensive is pulling from the entropy pool at least once, at most five times, per call to mmap?
You said it has a measurable performance hit, so I'm curious what that is.
In D5603#168856, @emaste wrote: ASR has a measurable performance hit; ASLR does not.
I'd be quite interested in the detail here if you can provide a link or reference to this.
In D5603#168630, @emaste wrote: In D5603#168627, @lattera-gmail.com wrote: Should the various macros, variables, and enums reflect ASR instead of ASLR?
I'm indifferent on this point. Terms have a tendency to become generic, like thermos or dumpster, and ASLR is arguably in the same state. I don't see complaints over references to ASLR with respect to other operating systems that don't follow the PaX implementation, and I think that for the vast majority of users it is not a meaningful distinction.
Oct 3 2016
Should the various macros, variables, and enums reflect ASR instead of ASLR?