Page MenuHomeFreeBSD

cred: convert the refcount from int to long
ClosedPublic

Authored by mjg on Mar 22 2023, 9:09 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mar 6 2024, 3:29 AM
Unknown Object (File)
Jan 29 2024, 11:56 PM
Unknown Object (File)
Jan 29 2024, 8:57 PM
Unknown Object (File)
Jan 3 2024, 2:27 AM
Unknown Object (File)
Dec 20 2023, 7:57 AM
Unknown Object (File)
Dec 18 2023, 9:50 PM
Unknown Object (File)
Dec 12 2023, 7:08 AM
Unknown Object (File)
Dec 7 2023, 3:47 PM
Subscribers

Details

Summary
commit 998a1c313a986dc7c7457d0b5369381f65fa1330
Author: Mateusz Guzik <mjg@FreeBSD.org>
Date:   Wed Mar 22 20:42:04 2023 +0000

    cred: convert the refcount from int to long
    
    On 64-bit platforms this sorts out worries about mitigating bugs which
    overflow the counter without pessimizng anything, most notably it avoids
    whacking per-thread operation in favor of refcount(9) API.
    
    The struct already had two instances of 4 byte padding with 256 bytes in
    size, cr_flags gets moved around to avoid growing it.
    
    32-bit platforms could also get the extended counter, but I did not do
    it as one day(tm) the mutex protecting centralized operation should be
    replaced with atomics and 64-bit ops on 32-bit platforms remain quite
    penalizing.
    
    While worries of counter overflow are addressed, the following is not:
    - counter *underflows*
    - buffer overruns from adjacent allocations
    - UAF due to stale cred pointer
    
    As such, while lipstick was placed, the pig should not be participating
    in any beauty pageants.
    
    Reviewed by:
    Differential Revision:

commit ceb8f401fcc5d956b9b92cff6aa6946a932f48bf
Author: Mateusz Guzik <mjg@FreeBSD.org>
Date:   Wed Mar 22 21:44:55 2023 +0000

    cred: make ref signed
    
    There are asserts on the count being > 0, but they are less useful than
    they can be because the type itself is unsigned. The kernel is compiled
    with -frapv, making wraparound perfectly defined.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

mjg requested review of this revision.Mar 22 2023, 9:09 PM
jrtc27 added inline comments.
sys/kern/kern_prot.c
1854 ↗(On Diff #119286)

Please use __SIZEOF_LONG__ or some other similar variable for these; CHERI architectures are neither ILP32 nor LP64 but the native integer size is still 32/64-bit depending on the base architecture (only ever 64-bit in CheriBSD). Though really for the formats I'd be inclined to say just use intmax_t, it's not performance-critical and involves (undocumented) tight coupling between here and ucred.h. Though why not just use long itself?

sys/sys/ucred.h
64

This was previously unsigned, is signed intended? The summary doesn't make any reference to this change.

I'm going to have to sleep on whether to do 64 bit everywhere.

Will post an updated review one way or the other tomorrow.

mjg retitled this revision from cred: make the refcount 64-bit on 64-bit platforms to cred: convert the refcount from int to long.
mjg edited the summary of this revision. (Show Details)
mjg marked 2 inline comments as done.

I have no objection, though I suspect that you cannot change the layout of ucred on stable branches.

Not handling underflow seems like a pretty basic weakness. I don't understand why we can't simply check for over/underflow when manipulating the counter. Yes, it has non-zero cost, but we do similar checking for all refcount adjustments already.

zlei added inline comments.
sys/kern/kern_prot.c
1854 ↗(On Diff #119287)

I see crcowget() is called during thread creating. Can we possibly reach 2^32 threads? That sounds horrible.

Maybe user want be warned when tuning and kern.maxproc * kern.threads.max_threads_per_proc exceed 2^32 .

1861 ↗(On Diff #119287)

I don't understand why we can't simply check for over/underflow when manipulating the counter

+1 for that.

So we might assert a overflow on increase.

KASSERT(cr->cr_ref > 0, ("overflow"));

And bench on a machine with huge amount of resources.

2049 ↗(On Diff #119287)

And assert an underflow on decrease.

KASSERT(cr->cr_ref > 0, ("will underflow"));

And bench on a machine with huge amount of resources.

So we have such a REAL machine with 4096 cpu cores and 8 hardware threads per core, and run average 8192 soft threads per hardware thread, then this machine will have total ~ 2^28 threads. Then a 32bit u_int is still sufficient.

I doubt we will have such machine and real usage in years.

I have no objection, though I suspect that you cannot change the layout of ucred on stable branches.

No, but if someone wants this in stable, they can use the spare fields for the new ref.

Not handling underflow seems like a pretty basic weakness. I don't understand why we can't simply check for over/underflow when manipulating the counter. Yes, it has non-zero cost, but we do similar checking for all refcount adjustments already.

How the refcount API could possibly address it? You drop what it sees like the last reference and past that the object is getting freed. Subsequent attempts at manipulating the counter are pretty UB. There is no leeway here.

sys/kern/kern_prot.c
1861 ↗(On Diff #119287)

That's impossible to do here because the value is completely arbitrary as long as cr_users > 0 -- then the real can only be found by summing up all the counts from all threads.

2049 ↗(On Diff #119287)

see the above comment why this can't work

At best some of the asserts could be converted into runtime checks, but then again the refcount API has no advantage over any of it and has the disadvantage of not scaling.

sys/kern/kern_prot.c
1861 ↗(On Diff #119287)

That's impossible to do here because the value is completely arbitrary as long as cr_users > 0 -- then the real can only be found by summing up all the counts from all threads.

I don't quite understand the statement by summing up all the counts from all threads .

As both cr->cr_users and cr->cr_ref is mutex protected, it is impossible that cr->cr_ref got touched by other threads (if they follows the order lock and write), so I think the overflow (of cr_ref) can be safely and stably caught.

Am I misreading?

That is indeed not enough as I see crunuse() and crunusebatch() also increase cr_ref.
See comments below.

1874 ↗(On Diff #119287)
KASSERT(cr->cr_ref >= td->td_ucredref, ("overflow"));
1902 ↗(On Diff #119287)
KASSERT(cr->cr_ref >= ref, ("overflow"));
sys/kern/kern_prot.c
1861 ↗(On Diff #119287)

i refer you to the comment starting with 'Credential management' in the file

This revision was not accepted when it landed; it landed in state Needs Review.Mar 29 2023, 5:04 AM
This revision was automatically updated to reflect the committed changes.
In D39220#894483, @mjg wrote:

Not handling underflow seems like a pretty basic weakness. I don't understand why we can't simply check for over/underflow when manipulating the counter. Yes, it has non-zero cost, but we do similar checking for all refcount adjustments already.

How the refcount API could possibly address it? You drop what it sees like the last reference and past that the object is getting freed. Subsequent attempts at manipulating the counter are pretty UB. There is no leeway here.

Certainly it cannot be addressed in general, but there are specific bug patterns which can be mitigated this way. For instance:

  1. A thread decrements the reference count to zero but erroneously fails to invoke the object's destructor and free it. If something else later tries to manipulate the counter, it'll stay frozen.
  2. A thread decrements an object refcount to zero and invokes a destructor, which releases other references and invokes other destructors, one of which incorrectly tries to release a reference on the original object (e.g., because some code path incorrectly treats a weak (i.e., not-counted) reference as a strong reference. In this scenario, we'd avoid a double free of the original object.

Such a mitigation might not help in the face of some particular bug, but it's all UB once buggy refcount handling triggers a double free or a UAF anyway.