Page MenuHomeFreeBSD

Add a blocking counter KPI, blockcount(9).
ClosedPublic

Authored by markj on Feb 17 2020, 1:19 AM.

Details

Summary

This adds a new KPI for use by object PIP and busy. In particular, it
replaces the ability to sleep on refcounts. My rationale is that
blocking sleeps of the form

while (*count != 0)
refcount_sleep(count)

are at odds with refcount semantics: once a refcount value reaches 0 the
decrementing thread is the sole owner of the object, so dereferencing
the pointer represents a use-after-free. Moreover, the extra checking
needed for sleepers imposes overhead on all refcount(9) consumers, and
refcount(9) imposes overhead not needed for existing users of the
blocking counter interface. I think it is acceptable to have a small
amount of code duplication in order to separate the two types of users.

I made some opinionated changes to the interface, trying to reduce the
number of interfaces and making them more like our traditional
sleep/wakeup interfaces, for instance by handling PDROP.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

markj created this revision.Feb 17 2020, 1:19 AM
markj updated this revision to Diff 68433.Feb 17 2020, 1:20 AM

Remove unintended diff.

jeff added a subscriber: jeff.Feb 17 2020, 1:41 AM

I don't mind this in principal but I would rather it was a new api entirely. We need relatively few of the refcount features. Just acquiren/releasen/wait. So I'd rather have it named as some kind of barrier or semaphore.

markj added a comment.Feb 17 2020, 2:26 AM
In D23723#521117, @jeff wrote:

I don't mind this in principal but I would rather it was a new api entirely. We need relatively few of the refcount features. Just acquiren/releasen/wait. So I'd rather have it named as some kind of barrier or semaphore.

It is a new API, I just put in refcount.h for convenience while I think through a few details. I wasn't sure about whether it really deserves a separate header but I don't have a strong opinion on it.

I spent a fair bit of time thinking about the name. This primitive doesn't match the way classic barriers or semaphores work. I considered calling it rsem ("reverse semaphore"), but I don't see why that's any better than blockcount. C++ has std::latch which is a more restricted version of this, but that name doesn't seem very descriptive to me. I found a couple of synchronization libraries which implement this as "blocking counter" so I just went with that.

mjg added a subscriber: mjg.EditedFeb 17 2020, 10:05 AM

Thank you for working on this.

I think this is a good opportunity to introduce struct blockcount to prevent accidental misuse. Then BLOCKCOUNT_COUNT can be blockcount_read or so. Some casts will probably also drop.

sys/kern/kern_synch.c
431 ↗(On Diff #68433)

It probably does not matter for runtime but you should add DROP/PICKUP GIANT around sleepq_add/wait. Every other place is supposed to do it.

sys/sys/refcount.h
224 ↗(On Diff #68433)

blockcount overflow?

markj updated this revision to Diff 68528.Feb 19 2020, 3:35 PM
markj marked 2 inline comments as done.
  • Move blockcount code into distinct headers. I did not add a copyright statement yet.
  • Apply mjg's notes: handle Giant, fix assertion message.
  • Add a blockcount_t to make the counter opaque. Add blockcount_read() to allow the counter value to be read (masking off the waiter flag).
  • Add some barriers: blockcount_release() issues a release fence and _blockcount_wait() issues an acquire fence before returning.
  • Avoid re-reading the counter value more than necessary when checking whether to go to sleep.
markj updated this revision to Diff 68529.Feb 19 2020, 3:35 PM

Fix the !INVARIANTS build.

markj added a comment.Feb 19 2020, 3:35 PM
In D23723#521189, @mjg wrote:

Thank you for working on this.

I think this is a good opportunity to introduce struct blockcount to prevent accidental misuse. Then BLOCKCOUNT_COUNT can be blockcount_read or so. Some casts will probably also drop.

I added a blockcount_t, it found a couple of naked references.

I am still open to suggestions regarding the name.

mjg added a comment.Feb 19 2020, 3:51 PM

I only have nitpicks I'm not going to insist on.

sys/kern/kern_synch.c
395 ↗(On Diff #68529)

other blockcount routines use

__func__

instead of hardcoding the name

side note is that probably we shlould just get a KASSERT variant which prefixes everything with func:line or similar

sys/sys/blockcount.h
11 ↗(On Diff #68529)

I think for the kernel it is more idiomatic to shorten 'count' to 'bc' for the variable name, then this also happens to convert the awkward

atomic_add_int(&count->__count, n);

into less awkward (for me)

atomic_add_int(&bc->__count, n);
markj updated this revision to Diff 68541.Feb 19 2020, 4:08 PM
markj marked 2 inline comments as done.
  • Use func consistently in KASSERT()s.
  • Rename "count" to "bc".
mjg accepted this revision.Feb 19 2020, 4:18 PM
This revision is now accepted and ready to land.Feb 19 2020, 4:18 PM
jeff accepted this revision.Feb 19 2020, 6:03 PM
kib added inline comments.Feb 20 2020, 1:11 PM
sys/sys/refcount.h
158 ↗(On Diff #68541)

I failed to see something equivalent for blockcounts, Is semantic comparable ?

markj added inline comments.Feb 20 2020, 1:36 PM
sys/kern/kern_synch.c
395 ↗(On Diff #68529)

Yes, I would like KASSERT() to do this automatically. Perhaps a coccinelle script can be used to fix up the ~9000 existing users to avoid printing the function name twice.

sys/sys/refcount.h
158 ↗(On Diff #68541)

It is similar. blockcount_release() issues a release fence to ensure that all stores are visible before the counter reaches 0. blockcount_sleep() loads the count with an acquire fence before returning.

kib accepted this revision.Feb 20 2020, 1:43 PM
kib added inline comments.
sys/sys/refcount.h
158 ↗(On Diff #68541)

Actually no, it is not. blockcount_release does not return a val, and blockcount_sleep/wait are mandatory.

mjg added inline comments.Feb 20 2020, 2:55 PM
sys/kern/kern_synch.c
395 ↗(On Diff #68529)

coccinelle is a little dodgy for sweeping like this and I don't think it's a good idea. Instead I created D23774.

mjg added inline comments.Feb 20 2020, 3:16 PM
sys/kern/kern_synch.c
398 ↗(On Diff #68541)

With the struct in place this no longer needs to __DEVOLATILE. In fact the macro should probably get augmented to fail compilation if the arg is not volatile.

markj updated this revision to Diff 68637.Feb 21 2020, 4:01 PM

Rebase, add copyrights.

This revision now requires review to proceed.Feb 21 2020, 4:01 PM
mjg accepted this revision.Feb 23 2020, 2:11 PM
This revision is now accepted and ready to land.Feb 23 2020, 2:11 PM
markj updated this revision to Diff 68714.Feb 23 2020, 5:38 PM
  • No need for DEVOLATILE.
  • pip_add may be called with a value of 0, handle this.
This revision now requires review to proceed.Feb 23 2020, 5:38 PM
mjg accepted this revision.Feb 25 2020, 8:23 PM
This revision is now accepted and ready to land.Feb 25 2020, 8:23 PM
This revision was automatically updated to reflect the committed changes.