Page MenuHomeFreeBSD

kern: fix setgroups(2) and getgroups(2) to match other platforms
ClosedPublic

Authored by kevans on Jul 31 2025, 5:29 AM.
Tags
None
Referenced Files
F133221425: D51648.diff
Fri, Oct 24, 2:29 AM
Unknown Object (File)
Tue, Oct 14, 3:26 AM
Unknown Object (File)
Sun, Oct 12, 5:37 PM
Unknown Object (File)
Fri, Oct 10, 5:53 AM
Unknown Object (File)
Fri, Oct 10, 5:53 AM
Unknown Object (File)
Fri, Oct 10, 5:53 AM
Unknown Object (File)
Fri, Oct 10, 5:53 AM
Unknown Object (File)
Fri, Oct 10, 5:53 AM
Subscribers

Details

Summary

On every other platform observed, including OpenBSD and NetBSD, these
system calls have long since been converted to only touching the
supplementary groups of the process. This poses both portability and
security concerns in porting software to and from FreeBSD, as this
subtle difference is a landmine waiting to happen. Bugs have been
discovered even in FreeBSD-local sources, since this behavior is
somewhat unintuitive (see, e.g., fix 48fd05999b0f for chroot(8)).

Now that the egid is tracked outside of cr_groups in our ucred, convert
the syscalls to deal with only supplementary groups. Some remaining
stragglers in base that had baked in assumptions about these syscalls
are fixed in the process to avoid heartburn in conversion.

For relnotes: application developers should audit their use of both
setgroups(2) and getgroups(2) for signs that they had assumed the
previous FreeBSD behavior of using the first element for the egid. Any
calls to setgroups() to clear groups that used a single array of the
now or soon-to-be egid can be converted to setgroups(0, NULL) calls to
clear the supplementary groups entirely on all FreeBSD versions.

Relnotes: yes (see last paragraph)

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 66029
Build 62912: arc lint + arc unit

Event Timeline

Some preliminary comments on literal stuff. I've also started reviewing all callers of setgroups(). It looks like a number of them (mostly under contrib and tools/test/stress2) should also be changed in order to preserve current behaviors as much as possible. Of those changed above, I've only reviewed more thoroughly the newgrp.c one so far, which I think should be amended to preserve its current behavior (although the latter is probably nonsensical). I'm uploading my small current patch at: https://people.freebsd.org/~olce/Phabricator/kevans-D51648_diff_159484-1.patch (only compilation has been tested).

I'll be off the next two days, so unfortunately will come back to it on Sunday at the earliest.

lib/libsys/getgroups.2
106

Not sure about the English nuances, but is "Until" really clear about not including "15.0"? If not, would using "Before" be more precise?

110

Consistency.

111

I would add this to be extra precise about the fact that the effective group ID was first, but could also appear later if it was also part of the supplementary groups.

lib/libsys/setgroups.2
89

Same question as above.

93
95–100

Maybe discard that part, as it could give the impression that the element in the passed array is modified (which of course cannot happen as the passed array is necessarily copied, but still)? Or rework it?

sbin/hastd/subr.c
211

Apparently, "supplemental" is much more common than "supplementary", but the latter is POSIX's official terminology, so we should stick to it.

lib/libsys/getgroups.2
106

I thought so (and is in my experience), but I would accept that it could be a regional thing. Your questioning it would be sufficient proof that it should change in this case, IMO.

Some preliminary comments on literal stuff. I've also started reviewing all callers of setgroups(). It looks like a number of them (mostly under contrib and tools/test/stress2) should also be changed in order to preserve current behaviors as much as possible. Of those changed above, I've only reviewed more thoroughly the newgrp.c one so far, which I think should be amended to preserve its current behavior (although the latter is probably nonsensical). I'm uploading my small current patch at: https://people.freebsd.org/~olce/Phabricator/kevans-D51648_diff_159484-1.patch (only compilation has been tested).

Can you expand a bit on your reasoning for the newgrp change? I didn't see a reason to keep it, and looking around a bit more, POSIX says this on the matter:

On systems where the effective group ID is normally in the supplementary group list (or whenever the old effective group ID actually is in the supplementary group list):
 - If the new effective group ID is also in the supplementary group list, newgrp shall change the effective group ID.
 - If the new effective group ID is not in the supplementary group list, newgrp shall add the new effective group ID to the list, if there is room to add it.

On systems where the effective group ID is not normally in the supplementary group list (or whenever the old effective group ID is not in the supplementary group list):
 - If the new effective group ID is in the supplementary group list, newgrp shall delete it.
 - If the old effective group ID is not in the supplementary list, newgrp shall add it if there is room.

It doesn't really carve out an exception for retaining the group if it was the previous egid, and I can't really picture why we would want to either.

sys/kern/kern_prot.c
360

While I'm thinking about it: sys_getgroups and sys_setgroups probably need to grow a p_osrel check < 15 and do the effective GID things as appropriate to avoid creating issues on old-base jails, to be safe.

Can you expand a bit on your reasoning for the newgrp change? I didn't see a reason to keep it, and looking around a bit more, POSIX says this on the matter:

My only reason to keep it was to avoid changing newgrp's behavior in this commit, potentially leaving that for subsequent ones, given that I don't think that newgrp is POSIX-compliant (with or without this change). But that's not a strong objection.

On systems where the effective group ID is normally in the supplementary group list (or whenever the old effective group ID actually is in the supplementary group list):
 - If the new effective group ID is also in the supplementary group list, newgrp shall change the effective group ID.
 - If the new effective group ID is not in the supplementary group list, newgrp shall add the new effective group ID to the list, if there is room to add it.

On systems where the effective group ID is not normally in the supplementary group list (or whenever the old effective group ID is not in the supplementary group list):
 - If the new effective group ID is in the supplementary group list, newgrp shall delete it.
 - If the old effective group ID is not in the supplementary list, newgrp shall add it if there is room.

Actually, I find POSIX terribly unclear here, and probably "wrong".

The reading of other passages of the standard (in particular, getgroups() IIRC) made me think that, originally, "supplementary groups" could include the effective GID. However, the current definition in XBD seems to rule out this possibility. In particular, I highly suspect that the quoted text for newgrp above was written assuming the earlier definition. So, before your changes, FreeBSD was in the first case ("the effective group ID is normally in the supplementary group list"), and after is in the second case ("the effective group ID is not normally in the supplementary group list"). The parentheses ("(or whenever the old effective group ID...)") seem to add to the confusion. I cannot make any sense of them except as an alternative exclusive to the other conditions.

Removing the new egid from the supplementary group list (with the current definition) is actually harmful, as one of the use cases is to enforce that some user is still part of that group even if its egid changes by duplicating that original egid in the supplementary group list. E.g., executing a setgid executable afterwards could actually allow the user to completely get rid of that group, even when the administrator did not intend that to happen. In other words, this business of removing the egid from the supplementary group list seems to make sense only when the egid is actually the first group in the list (or more generally, is added to the list, without removing a possible duplicate), but not when the egid is separate. Adding the old egid to the supplementary group list seems less harmful, but I still do not see the need for doing that at all (and is in fact not necessary if the original egid has been duplicated in the supplementary group list).

So, to me, newgrp should stop fiddling with the supplementary group list at all, even if this is not standards compliant. That said, I do not expect newgrp to be very much used. I personally learned about its existence by accident only some months ago.

sys/kern/kern_prot.c
360

Indeed. Good catch.

Can you expand a bit on your reasoning for the newgrp change? I didn't see a reason to keep it, and looking around a bit more, POSIX says this on the matter:

My only reason to keep it was to avoid changing newgrp's behavior in this commit, potentially leaving that for subsequent ones, given that I don't think that newgrp is POSIX-compliant (with or without this change). But that's not a strong objection.

On systems where the effective group ID is normally in the supplementary group list (or whenever the old effective group ID actually is in the supplementary group list):
 - If the new effective group ID is also in the supplementary group list, newgrp shall change the effective group ID.
 - If the new effective group ID is not in the supplementary group list, newgrp shall add the new effective group ID to the list, if there is room to add it.

On systems where the effective group ID is not normally in the supplementary group list (or whenever the old effective group ID is not in the supplementary group list):
 - If the new effective group ID is in the supplementary group list, newgrp shall delete it.
 - If the old effective group ID is not in the supplementary list, newgrp shall add it if there is room.

Actually, I find POSIX terribly unclear here, and probably "wrong".

The reading of other passages of the standard (in particular, getgroups() IIRC) made me think that, originally, "supplementary groups" could include the effective GID. However, the current definition in XBD seems to rule out this possibility. In particular, I highly suspect that the quoted text for newgrp above was written assuming the earlier definition. So, before your changes, FreeBSD was in the first case ("the effective group ID is normally in the supplementary group list"), and after is in the second case ("the effective group ID is not normally in the supplementary group list"). The parentheses ("(or whenever the old effective group ID...)") seem to add to the confusion. I cannot make any sense of them except as an alternative exclusive to the other conditions.

I think the first clause could use some re-wording, since it doesn't seem to take into account the possibility that the egid was duplicated into the supplementary group list intentionally. The second one is more clear if you view the parenthesized bit as a heuristic to determining if you're on an affected system, since the old egid not appearing is sufficient.

Removing the new egid from the supplementary group list (with the current definition) is actually harmful, as one of the use cases is to enforce that some user is still part of that group even if its egid changes by duplicating that original egid in the supplementary group list. E.g., executing a setgid executable afterwards could actually allow the user to completely get rid of that group, even when the administrator did not intend that to happen. In other words, this business of removing the egid from the supplementary group list seems to make sense only when the egid is actually the first group in the list (or more generally, is added to the list, without removing a possible duplicate), but not when the egid is separate. Adding the old egid to the supplementary group list seems less harmful, but I still do not see the need for doing that at all (and is in fact not necessary if the original egid has been duplicated in the supplementary group list).

Right, that makes sense.

So, to me, newgrp should stop fiddling with the supplementary group list at all, even if this is not standards compliant. That said, I do not expect newgrp to be very much used. I personally learned about its existence by accident only some months ago.

Now that you mention it, I'm struggling to find a use-case in which I actually care about newgrp. It would be much more useful for me if it were setuid (just to update groups of an existing session to match changes to /etc/group), but it is not by default (and I wouldn't really argue for changing that).

kevans marked 7 inline comments as done.

Adopt olce's suggestions

Preemptively add a 'co-authored-by' credit, because I have a gut feeling we're
going to be going back and forth on this one a bit more yet to perfect it

Add a p_osrel check to setgroups/getgroups to do the egid stuff, just to keep
compatibility. We don't seem to typically gate these behind COMPAT_FREEBSD*,
but given the authorization/security nature I thought it might be wise to allow
it to be disabled with the natural suspect. I'm not 100% fond of how I
implemented these, open to suggestions.

sys/kern/kern_prot.c
360

First, p_osrel check only checks the binary. If the call comes from dso, we loose.
Second, it is huge ABI break, with security implications.

I suspect that you need a new syscall. But even then, this does not solve the issue of silently changing the behavior.

sbin/hastd/subr.c
211

Yes, I keep fighting it because 'supplementary' doesn't feel natural to me in this context, but clearly I slip. =-)

sys/kern/kern_prot.c
360

Ah, I forgot about this. That can be arranged.

re: silently changing the behavior, the point of this series was that this would be largely regarded as a good thing. We're the only ones that do this in 2025, so fixing these to avoid using groups[0] as the egid would likely fix more than it would break in the long run.

Version bump for setgroups/getgroups:

  • compat syscalls; freebsd14 variants maintain the previous logic
  • libc version bumps, old ones use the freebsd14 syscalls as done for, e.g., freebsd13_swapoff
sys/kern/kern_prot.c
360

Amending this: I've since discovered that macOS actually does include an egid, but that's omitted from their documentation(!). I have amended the commit message accordingly. The others are observed to not include/touch the egid.

This revision is now accepted and ready to land.Aug 7 2025, 12:42 AM