HomeFreeBSD

Handle refcount(9) wraparound.

Description

Handle refcount(9) wraparound.

Attempt to mitigate the security risks around refcount overflows by
introducing a "saturated" state for the counter. Once a counter reaches
INT_MAX+1, subsequent acquire and release operations will blindly set
the counter value to INT_MAX + INT_MAX/2, ensuring that the protected
resource will not be freed; instead, it will merely be leaked.

The approach introduces a small race: if a refcount value reaches
INT_MAX+1, a subsequent release will cause the releasing thread to set
the counter to the saturation value after performing the decrement. If
in the intervening window INT_MAX refcount releases are performed by a
different thread, a use-after-free is possible. This is very difficult
to trigger in practice, and any situation where it could be triggered
would likely be vulnerable to reference count wraparound problems
to begin with. An alternative would be to use atomic_cmpset to acquire
and release references, but this would introduce a larger performance
penalty, particularly when the counter is contended.

Note that refcount_acquire_checked(9) maintains its previous behaviour;
code which must accurately track references should use it instead of
refcount_acquire(9).

Reviewed by: kib, mjg
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D21089

Event Timeline

bdrewery added inline comments.
/head/sys/sys/refcount.h
57

Why bother doing anything but panic here? It's not like the system will continue on in a stable state after this point.

/head/sys/sys/refcount.h
57

Why not? The problem will likely manifest as a kernel memory leak, but unless we leak enough memory to bring down the system I don't see what stability problems would ensue.

I do not feel strongly about this though. We could indeed panic the system in !INVARIANTS kernels. The refcount_acquire_checked() KPI exists for situations where a user program may overflow a reference counter via system interfaces, so the panic would be limited to the case where a kernel bug permits reference count overflows.

/head/sys/sys/refcount.h
57

Hm true it would likely just be a leak. I guess I just don't like the fix at this level or that it hides the error. A memory leak or a panic are both a DoS tradeoff to the overflow. I'm unclear on what the overflow case does compared to the new behavior. I would rather see fixes at the caller level to deal with a failure to acquire a new reference and to block or return error from there. The INVARIANTS panic may yield that in over time though if calling code is fixed but it's all a rare case to hit anyway.

/head/sys/sys/refcount.h
57

I'm unclear on what the overflow case does compared to the new behavior.

You mean, what it used to do? Previously the counter would silently wrap around, permitting a referenced object to be freed.

I would rather see fixes at the caller level to deal with a failure to acquire a new reference and to block or return error from there.

That is what kib did with fhold() in r350199. IMO this is not suitable for all uses of refcount(9): in many cases we have no straightforward way to signal an error or block. Moreover, it requires the use of atomic_cmpset instead of atomic_fetchadd, which is more expensive under contention and creates a larger i-cache footprint.

/head/sys/sys/refcount.h
57

Thanks for explaining more.