Page MenuHomeFreeBSD

Improve support for refcount waiting
AbandonedPublic

Authored by hselasky on Sep 12 2019, 2:26 PM.
Tags
None
Referenced Files
Unknown Object (File)
Mar 6 2024, 9:24 AM
Unknown Object (File)
Feb 10 2024, 5:44 PM
Unknown Object (File)
Jan 1 2024, 12:47 AM
Unknown Object (File)
Nov 27 2023, 8:20 PM
Unknown Object (File)
Nov 25 2023, 2:22 PM
Unknown Object (File)
Nov 24 2023, 3:50 PM
Unknown Object (File)
Nov 22 2023, 2:29 PM
Unknown Object (File)
Nov 22 2023, 2:28 PM
Subscribers

Details

Reviewers
kib
jeff
markj
mjg
Summary

Implement refcount_releasen_if_not_last().

Make the refcount_wait() function Giant safe and add the appropriate WITNESS asserts.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 26457

Event Timeline

Use "const u_int n" for input "n".

Add missed _refcount_update_saturated() call.

kern/kern_synch.c
348 ↗(On Diff #61976)

Can't this loop be replaced with atomic_readandclear()? Or atomic_swap(), more generally.

I can use atomic_swap() . Is that available for all platforms?

hselasky edited the summary of this revision. (Show Details)

Rebase patch and use atomic_swap_int().

Minor code refactor - lock sleep queue before dropping Giant. Makes code look better.

mjg requested changes to this revision.Sep 12 2019, 6:42 PM
mjg added a subscriber: mjg.
mjg added inline comments.
kern/kern_synch.c
346 ↗(On Diff #61991)

This cannot be legit. If you know for a fact this is the last reference and there can be nobody new adding here, there is no use for the atomic (and probably for the replacement). Given that this is used in generic code I don't think said guarantee is expected.

The previous version of the function correctly just cmpsets - if that fails the counter got increased and it is time to bail. The code as proposed would use these updates.

This revision now requires changes to proceed.Sep 12 2019, 6:42 PM
kern/kern_synch.c
346 ↗(On Diff #61991)

s/use/lose/

kern/kern_synch.c
346 ↗(On Diff #61991)

This is not the last reference. This value race with release_wait() which adds a bit to the counter. I believe atomic_swap_int() is needed in order to avoid race with code in release_wait().

kern/kern_synch.c
346 ↗(On Diff #61991)

s/release_wait/refcount_wait

Catch case where refcount is re-incremented from zero. Noted by mjg@.

Can you describe the consumer visible effect you are trying to achieve? Is it just releasn?

sys/sys/refcount.h
132

Part of the reason for the existing split was to minimize the text impact of the inline. This likely produces significantly more text in the callers.

hselasky added inline comments.
sys/sys/refcount.h
132

The refcount_release_last() function is now in a .c file and is no longer inlined. Or do you mean the rest of the function?

I can refactor the function again, only that I want to be able to use refcount_release_last() like shown here:
https://reviews.freebsd.org/D21621

And not only from refcount_release() (!)

I think it is a good idea to allow the waiter to wake up _after_ the destructor is done, based on the refcount and not before. Current usage of this functionality has no destructor associated with the refcount.

sys/sys/refcount.h
132

In that case this is equivalent to simply calling refcount_release(). If you know that you have the final reference and the structure can not become visible anywhere else then you can simply release your ref at the end of your destructor.

hselasky added inline comments.
sys/sys/refcount.h
132

To clarify a bit,

Do you mean:

refcount_init(&ref, 1);

if (refcount_release_if_not_last(&ref) == false) {

destructor
assert(refcount_release(&ref) == true);

}

???

sys/sys/refcount.h
132

Yes, is this not equivalent.

sys/sys/refcount.h
132

Yes, I can do it that way too. I'll update my patch. Still my other changes like Giant awareness are needed. And I think also the change to add atomic fences to if_not_last will be needed too?

hselasky edited the summary of this revision. (Show Details)

Update patch as per @jeff 's suggestion.

sys/kern/kern_synch.c
381

In the VM I had a case where I wanted to wait but implement the equivalent of SLEEPFAIL to check conditions before waiting again. This is why _sleep() and _wait() were separate.

It may not be necessary to do so but it is worth considering.

414

Eliding REFCOUNT_COUNT() means that you will now sleep simply because there are other waiters who have not been woken even if the count is already zero.

420–426

I find this construction somewhat awkward.

It could simply be "if ((old & waiter) == 0) && !atomic_cmpset()) sleepq_release; continue;

This is more idiomatic and shorter.

sys/sys/refcount.h
155

https://github.com/freebsd/freebsd/compare/master...jwroberson:object_concurrency#diff-23cdae8f0d24538fced60aac2bec3f16R181

I have another generalization of this concept here and a follow-up patch not yet committed that does handle last references. I will post a review later today showing its use. This does really turn more into a semaphore than a refcount.

It may be worth making a general function that inlines into specific functions. Where you can release n refs in the condition that the ref is within some bounds to be specified by caller. For the simple callers the compiler will distill it down.

hselasky marked 2 inline comments as done.

Update code as per Jeff's suggestions.

hselasky added inline comments.
sys/kern/kern_synch.c
381

Likely you'll need to make more refcount_wait functions. One reason is that we might not want to pollute the refcount.h with WITNESS header file dependencies?

414

Fixed in next revision.

420–426

See next version of patch.

sys/kern/kern_synch.c
390

I believe that the flow control of your implementation can be improved if you remove both labels and gotos. Wrap everything in the for(;;) loop, and add one more load of *count in place of goto read_and_retry.

sys/kern/kern_synch.c
392–412

The comment is rather useless.

394

The blank line is not needed.

395

This one is also useless.

399

'first' before what ? I would explicitly said 'before setting REFCOUNT_WAITER'

404

s/:/./

So I had a closer look and think that the introduction of support for waiting was an abuse of this API and should either be moved to another one or get dedicated routines in this one. Most consumers (including very frequently used ones like struct file) don't care for it and avoidably pay the price for its existence.

  1. there is a refcount_wake unnecessary scattered, but hopefully most of the time it is only at the end of the func and has to be jumped to
  2. the range is halved because of the bit needed to know if there are waiters. this means various consumers will have to switch 32 -> 64 bit sooner or abandon the API in the first place

That said, I suggest a different namespace (refcountw?) for this and reverting refcount.h to pre-r351188.

I offer to do the dirty work if necessary.

sys/sys/refcount.h
165

It's a programming error to pass n > old -- someone is trying to drop refs they don't have. This should be an assert.

191

Can you verify if generated assembly for consumers of this routine is unaffected? (after fixing the n > old check)

In D21617#471527, @mjg wrote:

So I had a closer look and think that the introduction of support for waiting was an abuse of this API and should either be moved to another one or get dedicated routines in this one. Most consumers (including very frequently used ones like struct file) don't care for it and avoidably pay the price for its existence.

  1. there is a refcount_wake unnecessary scattered, but hopefully most of the time it is only at the end of the func and has to be jumped to
  2. the range is halved because of the bit needed to know if there are waiters. this means various consumers will have to switch 32 -> 64 bit sooner or abandon the API in the first place

That said, I suggest a different namespace (refcountw?) for this and reverting refcount.h to pre-r351188.

+1 for this. We are now using a single KPI for multiple distinct purposes, and the code and KPI are confusing to me.

I don't know about the name. I think this pattern is similar to a classical barrier, so I would be inclined to use that terminology instead.

I offer to do the dirty work if necessary.

I don't care for the name for the most part, but perhaps something not sounding so refcounty will make it easier to not accidentally use the wrong API. Alternatively, perhaps it's time we de-u_int all consumers and have something of this sort:

struct refcount {
    u_int count;
};

with open-coded access prohibited. The only problem here is that 'struct refcount' already exists in opensolaris code (along with refcount_t) so we will have to use something else.

In D21617#471544, @mjg wrote:

I don't care for the name for the most part, but perhaps something not sounding so refcounty will make it easier to not accidentally use the wrong API. Alternatively, perhaps it's time we de-u_int all consumers and have something of this sort:

struct refcount {
    u_int count;
};

with open-coded access prohibited.

This seems reasonable to me.

The only problem here is that 'struct refcount' already exists in opensolaris code (along with refcount_t) so we will have to use something else.

ZoL renamed it to zfs_refcount_t, we could do the same. Linux has refcount_t AFAIR.

@mjg: what about the addition of refcount_releasen_if_not_last()? Is that part OK?

sys/sys/refcount.h
191

Will do.

sys/sys/refcount.h
165

_refcount_update_saturated() will do the assert() do you want an own case for this perhaps?

I will use the completion() API in the LinuxKPI for my purpose instead. Discussed with @mjg .