Page MenuHomeFreeBSD

kern: fix a panic in crcopysafe() found by syzkaller
ClosedPublic

Authored by kevans on Thu, Jul 31, 7:45 PM.
Tags
None
Referenced Files
Unknown Object (File)
Sat, Aug 9, 5:22 AM
Unknown Object (File)
Fri, Aug 8, 11:56 PM
Unknown Object (File)
Fri, Aug 8, 10:51 PM
Unknown Object (File)
Fri, Aug 8, 11:19 AM
Unknown Object (File)
Mon, Aug 4, 11:10 PM
Unknown Object (File)
Mon, Aug 4, 6:08 PM
Unknown Object (File)
Mon, Aug 4, 5:54 PM
Unknown Object (File)
Mon, Aug 4, 11:54 AM
Subscribers

Details

Summary

crcopysafe() attempts to crextend() the new ucred's group allocation to
fit all of the groups from the old ucred, but for certain values of
ngroups_max we can end up with a larger-than-max cr_asize as crextend()
rounds up the allocation to the next power-of-2.

This was not a problem before be1f7435ef218b1 because the effective max
storage was NGROUPS_MAX + 1 (1024) to account for the egid being
included in cr_groups. Now that we're back down to NGROUPS_MAX, the max
allocation will tend to be 1024 and exceed our max groups.

Switch crcopysafe() to extend until we have enough allocated to fit
the previous group set, and call crextend() with the number of groups
that the old ucred had. This avoids relying on implementation details
of crextend() up-sizing our requests and ensures we only have as large
of an allocation as we need to fulfill the request.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 65870
Build 62753: arc lint + arc unit

Event Timeline

For what it's worth, this seems to fix a C reproducer that my local syzkaller instance generated, uploaded here: https://people.freebsd.org/~markj/setgroups-syz-repro.c

I've restarted the fuzzer with this patch.

For what it's worth, this seems to fix a C reproducer that my local syzkaller instance generated, uploaded here: https://people.freebsd.org/~markj/setgroups-syz-repro.c

I've restarted the fuzzer with this patch.

Excellent, thanks. We should probably bump NGROUPS_MAX back up to 1024 since we're not accounting for the egid anymore, but OTOH I'm not upset that it found a tiny 'whoops' like this.

Instead of clamping cr_agroups, I would slightly prefer changing crcopysafe() to:

  1. Stop trying to extend the new credentials as soon as cr->agroups >= oldcred->cr_ngroups.
  2. Just call crextend() with oldcred->cr_ngroups.
sys/kern/kern_prot.c
2831–2836

I'm not sure if it's worth speculating on how we would do it, since IIRC the fact that ngroups_max is a constant is ingrained in all credentials code related to groups.

But for the exercise, if we did change that, then basically we wouldn't be able to check that the requested length is under ngroups_max without proper synchronization. Additionally, we might not be able to clamp all already existing creds to get rid of surnumerary groups, in which case we would have to support the ability to copy such credentials (even if there are too many groups with respect to the current ngroups_max), or at the very least to read them without asserting.

I think this slightly hints at the alternate change proposed in the general comment. (And also, for debugging, I think I prefer that cr_agroups is the actual maximum number of groups that could be stored.)

Also, I think that the first paragraph of the commit message is slightly unclear when reading quickly. In "crcopysafe() attempts to crextend() the new ucred's groups up to the size of the old ucred's", "size" at the end seems to refer to the number of groups of the old creds, not the actual allocated size. (I agree we usually use "size" instead of "length" for precision, in which light this is the correct terminology, but I'm not sure there's enough context here to infer that quickly).

Instead of clamping cr_agroups, I would slightly prefer changing crcopysafe() to:

  1. Stop trying to extend the new credentials as soon as cr->agroups >= oldcred->cr_ngroups.
  2. Just call crextend() with oldcred->cr_ngroups.

Hmm, fair, I like that.

Fix crcopysafe() directly, don't extend all the way up to agroups (which could
be larger than necessary for multiple reasons) call crextend() with the number
of groups we expect to hold rather than the number we expect to allocate.

I have some tests that I've written separately, but they're not quite ready to
for review -- currently I test both the setgroups(0, NULL) behavior and also
allocating at {NGROUPS_MAX} (with some wiggle room to account for historical
behavior and allowing + 1 for the egid), I'm trying to decide if I want to add
any more.

I'm still finding the commit message's first paragraph a bit confusing. How about something along the lines of:

crcopysafe() attempts to crextend() the new ucred's group allocation with the number of allocated group slots (`cr_asize`) from the ucred to copy rather than the latter's actual number of supplementary groups.  However, the number of allocated group slots can exceed `ngroups_max` for certain values of it (because of rounding to the next power-of-2 or page on allocation), making `crextend()` trip on a check that the passed value should be lower than `ngroups_max`.
This revision is now accepted and ready to land.Mon, Aug 4, 8:47 PM

I'm still finding the commit message's first paragraph a bit confusing. How about something along the lines of:

crcopysafe() attempts to crextend() the new ucred's group allocation with the number of allocated group slots (`cr_asize`) from the ucred to copy rather than the latter's actual number of supplementary groups.  However, the number of allocated group slots can exceed `ngroups_max` for certain values of it (because of rounding to the next power-of-2 or page on allocation), making `crextend()` trip on a check that the passed value should be lower than `ngroups_max`.

I have no objection, accepted as-is and replaced