
Experimental ticks based SMR.
Abandoned · Public

Authored by jeff on Feb 17 2020, 9:58 PM.

Details

Summary

Since I had already worked out the interesting races, this was actually surprisingly easy. The idea is that instead of a synthetic write sequence we just use ticks. You will always accumulate two ticks' worth of memory; I limit the poll damage by not even checking before then.

The synchronization is a little more subtle because interrupts and context switches are key to understanding why it is safe. If either occurs, any speculative loads are lost and the store buffer is flushed, so the switch itself acts as the barrier when one is necessary.
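
To make that read-side idea concrete, here is a minimal userspace sketch using C11 atomics. The names (struct td_smr, smr_ticks_enter/exit) are stand-ins and this is not the subr_smr.c code; it only illustrates publishing the current tick and leaning on the interrupt/context-switch serialization described above instead of an explicit fence.

    #include <stdatomic.h>

    /* Stand-in for the kernel's tick counter. */
    extern volatile int ticks;

    /* Stand-in for the per-thread SMR state discussed in this review. */
    struct td_smr {
        _Atomic int td_wr_seq;  /* Tick observed at section entry, 0 when idle. */
    };

    static inline void
    smr_ticks_enter(struct td_smr *td)
    {
        /*
         * Publish the tick we entered at.  No explicit fence: an interrupt
         * or context switch discards speculative loads and drains the store
         * buffer, which is what lets a later poll trust this value.
         */
        atomic_store_explicit(&td->td_wr_seq, ticks, memory_order_relaxed);
    }

    static inline void
    smr_ticks_exit(struct td_smr *td)
    {
        /* Close the section; the release orders the section's loads first. */
        atomic_store_explicit(&td->td_wr_seq, 0, memory_order_release);
    }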

It is about 30% slower than SMR in my write-heavy benchmark and consumes 10x as much memory, but it is better than both RCU and epoch. I have a timed section loop; some costs:

RCU/NULL: 2ns (loop + global variable increment)
SMR_LAZY: 3ns
SMR: 6ns
EPOCH: 8ns
EPOCH_PREEMPT: 11ns

So the read side here is pretty close to free, although it does still trade a lot of memory and large bucket caches for that. Because we store the ticks, it is amenable to conversion to a preemptible technique.
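
The numbers above come from a timed section loop. As a rough illustration of the shape of such a measurement (entirely hypothetical, not the author's harness; smr_ticks_enter/exit stand for whatever read-side pair is being measured):

    #include <stdint.h>
    #include <stdio.h>
    #include <time.h>

    #define ITERATIONS 100000000UL

    /* Whatever read-side enter/exit pair is under test. */
    extern void smr_ticks_enter(void);
    extern void smr_ticks_exit(void);

    volatile uint64_t global_counter;

    int
    main(void)
    {
        struct timespec start, end;
        uint64_t i;
        double ns;

        clock_gettime(CLOCK_MONOTONIC, &start);
        for (i = 0; i < ITERATIONS; i++) {
            smr_ticks_enter();
            global_counter++;       /* The "global variable increment". */
            smr_ticks_exit();
        }
        clock_gettime(CLOCK_MONOTONIC, &end);

        ns = (end.tv_sec - start.tv_sec) * 1e9 +
            (end.tv_nsec - start.tv_nsec);
        printf("%.1f ns per section\n", ns / ITERATIONS);
        return (0);
    }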

Test Plan

Passes smrstress, surprisingly.

Diff Detail

Lint: Passed
Unit: No Test Coverage
Build Status: Buildable 29527
Build 27394: arc lint + arc unit

Event Timeline

jeff edited the test plan for this revision.
jeff added reviewers: markj, mmacy, rlibby, sbahra_repnop.org.
jeff set the repository for this revision to rS FreeBSD src repository - subversion.

Simplify by storing 0 corrected ticks in td_wr_seq. This gives no branches
in the enter section and simpler code elsewhere.

I rewrote lazy smr to lazily update s_wr_seq but otherwise use the rest
of the machinery for normal smr. The read side lost a branch and the
code is better contained. ticks now gates s_wr_seq updates, similarly
to deferred mode. There are some annoying conditions around 0 and
very stale state. In the process I refactored some code, because poll()
is getting very long due to comments.

This passes smrstress, and umaperf works well. I do intend to commit this.

sys/kern/subr_smr.c
204

WR? Did you mean RW, so they could be reset?

234–243

Isn't there an smr_lazy_advance() race here, where

  • A reads ticks=X
  • B reads ticks=X+1, proceeds, advances s->s_wr_seq to X+1
  • A reads s->s_wr_seq=X+1, and then "advances" (reverts) it back to X

and is it a problem?

If so, I think we need to read ticks "after" s->s_wr_seq. I don't think we could just use SMR_SEQ_LEQ(t, s_wr_seq) as I think that would lead to the idle wrap problem.

sys/sys/smr.h
53–54

Ought to add more parens around a and b.
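
For reference, the fully parenthesized form this comment is asking for would look something like the sketch below (it assumes the header's unsigned sequence type and a signed delta type for wrap-safe comparison; the exact bodies in smr.h may differ):

    #include <stdint.h>

    typedef uint32_t smr_seq_t;
    typedef int32_t  smr_delta_t;

    /* Wrap-safe comparisons with (a) and (b) parenthesized at every use. */
    #define SMR_SEQ_LT(a, b)   ((smr_delta_t)((a) - (b)) < 0)
    #define SMR_SEQ_LEQ(a, b)  ((smr_delta_t)((a) - (b)) <= 0)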

sys/kern/subr_smr.c
234–243

I think you are right. Originally I didn't consider this, but it would happen if ticks was ahead of wr_seq on an otherwise idle smr. An acq barrier paired with a rel on the cmpset should resolve it.

Sending wr_seq backwards and immediately polling would leave us treating a goal calculated from the larger ticks value as already expired.
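
To illustrate the ordering being agreed on here, a simplified userspace sketch using C11 atomics (hypothetical; the real smr_lazy_advance() also has to handle s_rd_seq and very stale state): s_wr_seq is loaded with acquire semantics before ticks is sampled, and a newer value is only published with a release cmpset, so the sequence is never dragged backwards.

    #include <stdatomic.h>
    #include <stdint.h>

    typedef uint32_t smr_seq_t;

    extern volatile int ticks;              /* Kernel tick counter stand-in. */

    struct smr_shared {
        _Atomic smr_seq_t s_wr_seq;         /* Last published write sequence. */
    };

    static smr_seq_t
    smr_lazy_advance(struct smr_shared *s)
    {
        smr_seq_t s_wr_seq, t;

        /*
         * Acquire: read the stored sequence before sampling ticks, so the
         * tick we see can only be the same as or newer than s_wr_seq.
         */
        s_wr_seq = atomic_load_explicit(&s->s_wr_seq, memory_order_acquire);
        t = (smr_seq_t)ticks;
        if (t == s_wr_seq)
            return (t);                     /* Already advanced for this tick. */

        /*
         * Release cmpset paired with the acquire above; if it fails, someone
         * else advanced concurrently and we simply use the newer tick.
         */
        atomic_compare_exchange_strong_explicit(&s->s_wr_seq, &s_wr_seq, t,
            memory_order_release, memory_order_relaxed);
        return (t);
    }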

sys/kern/subr_smr.c
234–243

I also wanted to add, I tried to write this as:

    if s->s_ticks == ticks: return s_wr_seq
    monotonically increment s_wr_seq by SMR_SEQ_INCR
    s->s_ticks = ticks

I could not work out a way to do this safely with only barriers. I could use a 64-bit cmpxchg. I could also use a 64-bit cmpxchg to fix s_rd_seq in a single atomic. I would prefer to keep the flexibility to make the counters 64-bit instead, though. I think the only s_rd_seq race below would harmlessly rewind it a little bit. If this were userspace and I could make fewer guarantees, I would use the 64-bit cmpxchg trick.
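
Purely for illustration, the 64-bit cmpxchg trick described above could look like this in userspace, packing s_ticks and s_wr_seq into a single word so the once-per-tick advance is one compare-and-swap (the names and the SMR_SEQ_INCR value are assumptions, and this is explicitly not the approach being committed):

    #include <stdatomic.h>
    #include <stdint.h>

    #define SMR_SEQ_INCR 2                  /* Assumed increment per advance. */

    /* High 32 bits: s_ticks; low 32 bits: s_wr_seq.  Hypothetical packing. */
    static _Atomic uint64_t smr_state;

    static uint32_t
    smr_pack_advance(uint32_t t)
    {
        uint64_t old, new;
        uint32_t s_ticks, s_wr_seq;

        old = atomic_load_explicit(&smr_state, memory_order_acquire);
        for (;;) {
            s_ticks = (uint32_t)(old >> 32);
            s_wr_seq = (uint32_t)old;
            if (s_ticks == t)
                return (s_wr_seq);          /* Already advanced this tick. */
            new = ((uint64_t)t << 32) | (uint32_t)(s_wr_seq + SMR_SEQ_INCR);
            if (atomic_compare_exchange_weak_explicit(&smr_state, &old, new,
                memory_order_acq_rel, memory_order_acquire))
                return (s_wr_seq + SMR_SEQ_INCR);
            /* Lost a race; 'old' now holds the current state, retry. */
        }
    }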

Name my SMR variant GUS for Global Unbounded Sequences.

Address some feedback from rlibby. Improve some comments.

sys/sys/smr.h
53–54

still need to do this.

If there are no objections, I will commit this version.

sys/kern/subr_smr.c
200

Unused...

320

We grew a critical section around smr_default_advance (I assume you're okay with that...), so we can apply Mark's comments from the earlier review and s/counter_u64_add/counter_u64_add_protected/ (all three instances under advance).

480

GT

530–535

I don't understand this. Shouldn't the writer have called smr_advance()? Where did the goal come from if not?

569

s/SMR_SEQ_MAX_ADVANCE/SMR_SEQ_ADVANCE/ ?

575–576

Hmm. I think perhaps this comment should be expanded. I think what you're saying is, for SMR_DEFERRED, we can only be behind a valid goal by SMR_SEQ_INCR. For SMR_LAZY, we can be further behind, but we never have to catch up because we're always !wait. Is that right? Because it does seem possible that we drop out of this with goal > s_wr_seq for the SMR_LAZY case.

I forgot to submit these comments.

sys/kern/subr_smr.c
320

That is a good point. I am a little uncertain as to whether advance will continue to use a critical section. I will evaluate this again once I decide.

530–535

Lazy only increments wr_seq lazily, after ticks does. The goal is always given as later than the current ticks, so it is always in the future. If no other writers follow, wr_seq won't be advanced and the goal would never be satisfied. This check makes sure the clock keeps moving as long as someone cares.
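
A simplified sketch of that behavior (hypothetical shapes, not the actual poll code): when polling a lazy smr, nudge the write sequence forward whenever the caller's goal is still ahead of it, so the goal can be satisfied even if no further writers arrive.

    #include <stdbool.h>
    #include <stdint.h>

    typedef uint32_t smr_seq_t;
    typedef int32_t  smr_delta_t;

    #define SMR_SEQ_GT(a, b)   ((smr_delta_t)((a) - (b)) > 0)
    #define SMR_SEQ_GEQ(a, b)  ((smr_delta_t)((a) - (b)) >= 0)

    struct smr_shared {
        smr_seq_t s_wr_seq;     /* Published write sequence. */
        smr_seq_t s_rd_seq;     /* Minimum observed read sequence,
                                 * assumed maintained elsewhere by
                                 * scanning the readers. */
    };

    /* Bump s_wr_seq toward the current tick (see the earlier advance sketch). */
    smr_seq_t smr_lazy_advance(struct smr_shared *s);

    static bool
    smr_poll_lazy(struct smr_shared *s, smr_seq_t goal)
    {
        /*
         * Keep the clock moving as long as someone still cares: if the
         * goal is ahead of the published sequence, advance it ourselves
         * rather than waiting for another writer.
         */
        if (SMR_SEQ_GT(goal, s->s_wr_seq))
            (void)smr_lazy_advance(s);

        /* The goal is satisfied once every reader has passed it. */
        return (SMR_SEQ_GEQ(s->s_rd_seq, goal));
    }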

569

Yes, thank you.

575–576

    KASSERT(!wait || (zpcpu_get(smr)->c_flags & SMR_LAZY) == 0,
        ("smr_poll: Blocking not allowed on lazy smrs."));

There is an assert at the top of the function. I'm considering a way to decouple wr_seq from directly using ticks, which would enable me to use an IPI to advance if there is a blocking wait for LAZY. This may impact the critical sections in advance.