
Experimental ticks based SMR.
Abandoned · Public

Authored by jeff on Feb 17 2020, 9:58 PM.

Details

Summary

Since I had already worked out the interesting races, this was actually surprisingly easy. The idea is that instead of a synthetic write sequence we just use ticks. You will always accumulate two ticks' worth of memory; I limit the poll damage by not even checking before then.

The synchronization is a little more subtle because interrupts and context switches are key to understanding why it is safe. If either occurs, any speculative loads are lost and the store buffer is flushed, so the switch itself acts as the barrier when one is necessary.
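
To make that read-side idea concrete, here is a minimal userspace sketch using C11 atomics. The names (struct td_smr, smr_ticks_enter/exit) are stand-ins and this is not the subr_smr.c code; it only illustrates publishing the current tick and leaning on the interrupt/context-switch serialization described above instead of an explicit fence.

    #include <stdatomic.h>

    /* Stand-in for the kernel's tick counter. */
    extern volatile int ticks;

    /* Stand-in for the per-thread SMR state discussed in this review. */
    struct td_smr {
        _Atomic int td_wr_seq;  /* Tick observed at section entry, 0 when idle. */
    };

    static inline void
    smr_ticks_enter(struct td_smr *td)
    {
        /*
         * Publish the tick we entered at.  No explicit fence: an interrupt
         * or context switch discards speculative loads and drains the store
         * buffer, which is what lets a later poll trust this value.
         */
        atomic_store_explicit(&td->td_wr_seq, ticks, memory_order_relaxed);
    }

    static inline void
    smr_ticks_exit(struct td_smr *td)
    {
        /* Close the section; the release orders the section's loads first. */
        atomic_store_explicit(&td->td_wr_seq, 0, memory_order_release);
    }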

It is about 30% slower than SMR in my write-heavy benchmark and consumes 10x as much memory, but it is better than both RCU and epoch. I have a timed section loop; some costs:

RCU/NULL: 2ns (loop + global variable increment)
SMR_LAZY: 3ns
SMR: 6ns
EPOCH: 8ns
EPOCH_PREEMPT: 11ns

So the read side here is pretty close to free, although it does still trade a lot of memory and large bucket caches for that. Because we store the ticks, it is amenable to conversion to a preemptible technique.
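
The numbers above come from a timed section loop. As a rough illustration of the shape of such a measurement (entirely hypothetical, not the author's harness; smr_ticks_enter/exit stand for whatever read-side pair is being measured):

    #include <stdint.h>
    #include <stdio.h>
    #include <time.h>

    #define ITERATIONS 100000000UL

    /* Whatever read-side enter/exit pair is under test. */
    extern void smr_ticks_enter(void);
    extern void smr_ticks_exit(void);

    volatile uint64_t global_counter;

    int
    main(void)
    {
        struct timespec start, end;
        uint64_t i;
        double ns;

        clock_gettime(CLOCK_MONOTONIC, &start);
        for (i = 0; i < ITERATIONS; i++) {
            smr_ticks_enter();
            global_counter++;       /* The "global variable increment". */
            smr_ticks_exit();
        }
        clock_gettime(CLOCK_MONOTONIC, &end);

        ns = (end.tv_sec - start.tv_sec) * 1e9 +
            (end.tv_nsec - start.tv_nsec);
        printf("%.1f ns per section\n", ns / ITERATIONS);
        return (0);
    }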

Test Plan

Passes smrstress, surprisingly.

Diff Detail

Lint: Passed
Unit: No Test Coverage
Build Status: Buildable 29527
Build 27394: arc lint + arc unit

Event Timeline

jeff edited the test plan for this revision.
jeff added reviewers: markj, mmacy, rlibby, sbahra_repnop.org.
jeff set the repository for this revision to rS FreeBSD src repository - subversion.

Simplify by storing 0 corrected ticks in td_wr_seq. This gives no branches
in the enter section and simpler code elsewhere.

I rewrote lazy smr to lazily update s_wr_seq but otherwise use the rest
of the machinery for normal smr. The read side lost a branch and the
code is better contained. ticks now gates s_wr_seq updates, similarly
to deferred mode. There are some annoying conditions around 0 and
very stale state. In the process I refactored some code, because poll()
is getting very long due to comments.

This passes smrstress, and umaperf works well. I do intend to commit this.

sys/kern/subr_smr.c
204

WR? Did you mean RW, so they could be reset?

234–243

Isn't there an smr_lazy_advance() race here, where

  • A reads ticks=X
  • B reads ticks=X+1, proceeds, advances s->s_wr_seq to X+1
  • A reads s->s_wr_seq=X+1, and then "advances" (reverts) it back to X

and is it a problem?

If so, I think we need to read ticks "after" s->s_wr_seq. I don't think we could just use SMR_SEQ_LEQ(t, s_wr_seq) as I think that would lead to the idle wrap problem.

sys/sys/smr.h
53–54

Ought to add more parens around a and b.
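
For reference, the fully parenthesized form this comment is asking for would look something like the sketch below (it assumes the header's unsigned sequence type and a signed delta type for wrap-safe comparison; the exact bodies in smr.h may differ):

    #include <stdint.h>

    typedef uint32_t smr_seq_t;
    typedef int32_t  smr_delta_t;

    /* Wrap-safe comparisons with (a) and (b) parenthesized at every use. */
    #define SMR_SEQ_LT(a, b)   ((smr_delta_t)((a) - (b)) < 0)
    #define SMR_SEQ_LEQ(a, b)  ((smr_delta_t)((a) - (b)) <= 0)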

sys/kern/subr_smr.c
234–243

I think you are right. Originally I didn't consider this, but it would happen if ticks was ahead of wr_seq on an otherwise idle smr. An acq barrier paired with a rel on the cmpset should resolve it.

Sending wr_seq backwards and immediately polling would leave us treating a goal calculated from the larger ticks value as already expired.
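
To illustrate the ordering being agreed on here, a simplified userspace sketch using C11 atomics (hypothetical; the real smr_lazy_advance() also has to handle s_rd_seq and very stale state): s_wr_seq is loaded with acquire semantics before ticks is sampled, and a newer value is only published with a release cmpset, so the sequence is never dragged backwards.

    #include <stdatomic.h>
    #include <stdint.h>

    typedef uint32_t smr_seq_t;

    extern volatile int ticks;              /* Kernel tick counter stand-in. */

    struct smr_shared {
        _Atomic smr_seq_t s_wr_seq;         /* Last published write sequence. */
    };

    static smr_seq_t
    smr_lazy_advance(struct smr_shared *s)
    {
        smr_seq_t s_wr_seq, t;

        /*
         * Acquire: read the stored sequence before sampling ticks, so the
         * tick we see can only be the same as or newer than s_wr_seq.
         */
        s_wr_seq = atomic_load_explicit(&s->s_wr_seq, memory_order_acquire);
        t = (smr_seq_t)ticks;
        if (t == s_wr_seq)
            return (t);                     /* Already advanced for this tick. */

        /*
         * Release cmpset paired with the acquire above; if it fails, someone
         * else advanced concurrently and we simply use the newer tick.
         */
        atomic_compare_exchange_strong_explicit(&s->s_wr_seq, &s_wr_seq, t,
            memory_order_release, memory_order_relaxed);
        return (t);
    }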

sys/kern/subr_smr.c
234–243

I also wanted to add, I tried to write this as:

    if s->s_ticks == ticks: return s_wr_seq
    monotonically increment s_wr_seq by SMR_SEQ_INCR
    s->s_ticks = ticks

I could not work out a way to do this safely with only barriers. I could use a 64-bit cmpxchg. I could also use a 64-bit cmpxchg to fix s_rd_seq in a single atomic. I would prefer to keep the flexibility to make the counters 64-bit instead, though. I think the only s_rd_seq race below would harmlessly rewind it a little bit. If this were userspace and I could make fewer guarantees, I would use the 64-bit cmpxchg trick.
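
Purely for illustration, the 64-bit cmpxchg trick described above could look like this in userspace, packing s_ticks and s_wr_seq into a single word so the once-per-tick advance is one compare-and-swap (the names and the SMR_SEQ_INCR value are assumptions, and this is explicitly not the approach being committed):

    #include <stdatomic.h>
    #include <stdint.h>

    #define SMR_SEQ_INCR 2                  /* Assumed increment per advance. */

    /* High 32 bits: s_ticks; low 32 bits: s_wr_seq.  Hypothetical packing. */
    static _Atomic uint64_t smr_state;

    static uint32_t
    smr_pack_advance(uint32_t t)
    {
        uint64_t old, new;
        uint32_t s_ticks, s_wr_seq;

        old = atomic_load_explicit(&smr_state, memory_order_acquire);
        for (;;) {
            s_ticks = (uint32_t)(old >> 32);
            s_wr_seq = (uint32_t)old;
            if (s_ticks == t)
                return (s_wr_seq);          /* Already advanced this tick. */
            new = ((uint64_t)t << 32) | (uint32_t)(s_wr_seq + SMR_SEQ_INCR);
            if (atomic_compare_exchange_weak_explicit(&smr_state, &old, new,
                memory_order_acq_rel, memory_order_acquire))
                return (s_wr_seq + SMR_SEQ_INCR);
            /* Lost a race; 'old' now holds the current state, retry. */
        }
    }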

Name my SMR variant GUS for Global Unbounded Sequences.

Address some feedback from rlibby. Improve some comments.

sys/sys/smr.h
53–54

still need to do this.

If there are no objections, I will commit this version.

sys/kern/subr_smr.c
200

Unused...

320

We grew a critical section around smr_default_advance (I assume you're okay with that...), so we can apply Mark's comments from the earlier review and s/counter_u64_add/counter_u64_add_protected/ (all three instances under advance).

480

GT

530–535

I don't understand this. Shouldn't the writer have called smr_advance()? Where did the goal come from if not?

569

s/SMR_SEQ_MAX_ADVANCE/SMR_SEQ_ADVANCE/ ?

575–576

Hmm. I think perhaps this comment should be expanded. I think what you're saying is, for SMR_DEFERRED, we can only be behind a valid goal by SMR_SEQ_INCR. For SMR_LAZY, we can be further behind, but we never have to catch up because we're always !wait. Is that right? Because it does seem possible that we drop out of this with goal > s_wr_seq for the SMR_LAZY case.

I forgot to submit these comments.

sys/kern/subr_smr.c
320

That is a good point. I am a little uncertain as to whether advance will continue to use a critical section. I will evaluate this again once I decide.

530–535

Lazy only increments wr_seq lazily, after ticks does. The goal is always given as later than the current ticks, so it is always in the future. If no other writers follow, wr_seq won't be advanced and the goal would never be satisfied. This check makes sure the clock keeps moving as long as someone cares.
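
A simplified sketch of that behavior (hypothetical shapes, not the actual poll code): when polling a lazy smr, nudge the write sequence forward whenever the caller's goal is still ahead of it, so the goal can be satisfied even if no further writers arrive.

    #include <stdbool.h>
    #include <stdint.h>

    typedef uint32_t smr_seq_t;
    typedef int32_t  smr_delta_t;

    #define SMR_SEQ_GT(a, b)   ((smr_delta_t)((a) - (b)) > 0)
    #define SMR_SEQ_GEQ(a, b)  ((smr_delta_t)((a) - (b)) >= 0)

    struct smr_shared {
        smr_seq_t s_wr_seq;     /* Published write sequence. */
        smr_seq_t s_rd_seq;     /* Minimum observed read sequence,
                                 * assumed maintained elsewhere by
                                 * scanning the readers. */
    };

    /* Bump s_wr_seq toward the current tick (see the earlier advance sketch). */
    smr_seq_t smr_lazy_advance(struct smr_shared *s);

    static bool
    smr_poll_lazy(struct smr_shared *s, smr_seq_t goal)
    {
        /*
         * Keep the clock moving as long as someone still cares: if the
         * goal is ahead of the published sequence, advance it ourselves
         * rather than waiting for another writer.
         */
        if (SMR_SEQ_GT(goal, s->s_wr_seq))
            (void)smr_lazy_advance(s);

        /* The goal is satisfied once every reader has passed it. */
        return (SMR_SEQ_GEQ(s->s_rd_seq, goal));
    }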

569

Yes, thank you.

575–576

    KASSERT(!wait || (zpcpu_get(smr)->c_flags & SMR_LAZY) == 0,
        ("smr_poll: Blocking not allowed on lazy smrs."));

There is an assert at the top of the function. I'm considering a way to decouple wr_seq from directly using ticks, which would enable me to use an IPI to advance if there is a blocking wait for LAZY. This may impact the critical sections in advance.