Page MenuHomeFreeBSD

Provide generic sub-word atomic *cmpset
ClosedPublic

Authored by kevans on Sep 27 2019, 4:14 PM.

Details

Summary

Again, naive attempt... this would be broken into three commits, and it seems to work in userland, at least.

Provide *cmpset_{8,16} that proxy through to atomic_fcmpset_32. Initial users will be mips and sparc64, with hopes to wean mips off of it later.

Initially tried to provide these in atomic_common with an #ifdef PLATFORM_NEEDS_SUBWORD_OPS, but this ends up being cleaner with only two archs using it and allows the possibility of KASSERT if we want to do so as systm.h needs bits from machine/atomic.h.

Most of the diff in machine/atomic.h are to refactor definitions of _acq/_rel versions.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

kevans created this revision.Sep 27 2019, 4:14 PM
kib added inline comments.Sep 29 2019, 10:16 AM
sys/sys/_atomic_subword.h
52 ↗(On Diff #62647)

You need () around p.

56 ↗(On Diff #62647)

Perhaps 8 should be NBBY if the symbol is availiable (ignore my note if not).

77 ↗(On Diff #62647)

This should be static inline. Or static (non-inline) in C file.

92 ↗(On Diff #62647)

I think what you describe there is a good reason to not implement cmpset(), only adding fcmpset().

First, look at the atomic(9) note for fcmpset(), where it is mentioned that the function can fail spuriously. Next, lets calculate loops: on ll/sc arch like mips, cmpset already loops in the asm, then you are adding a loop around the call, and then the caller would loop around that.

Hm, I am wrong there, it would be only two nested loops since you use fcmpset() and fcmpset() does not retry, But my point should be clear: do not loop in your implementation, leave it to the caller and allow spurious failures.

kevans added inline comments.Sep 29 2019, 1:51 PM
sys/sys/_atomic_subword.h
56 ↗(On Diff #62647)

Hmm... The odds are good, considering mandatory include order. I'd be tempted to #ifndef ... #define it to 8 for those cases that may have only included sys/types.h to make this use clear.

92 ↗(On Diff #62647)

I have to wonder why the note isn't written to include cmpset. The same caveat about ll/sc applies and the caller has to re-read to detect that. It doesn't note anything to the contrary, such as cmpset being guaranteed to have successfully written. I think it should be amended to clarify intent on cmpset, perhaps.

The mips fcmpset actually will retry on write failure though, from my reading. I'll eliminate the loop either way.

kib added inline comments.Sep 29 2019, 2:11 PM
sys/sys/_atomic_subword.h
92 ↗(On Diff #62647)

I think ARM/ARM64 are much better examples for ll/sc arches, in the sense that more efforts go into them. There, fcmpset family does not retry on sc transient failure, but cmpset retries. (fcmpset on arm64 has seemingly unused '1' label which it got by copy/paste, perhaps).

kevans added inline comments.Sep 29 2019, 2:28 PM
sys/sys/_atomic_subword.h
92 ↗(On Diff #62647)

Yeah, ok- RISCV/ARM/ARM64 all seem to agree on cmpset/fcmpset spinning internally (the former spins, the latter not)

From my reading, cmpset *could* still be implemented with this function and the fcmpset version spawned off and not looping.

Is it better not to provide cmpset in that case? mjg just needs fcmpset, right?

kib added inline comments.Sep 29 2019, 2:35 PM
sys/sys/_atomic_subword.h
92 ↗(On Diff #62647)

Or fix fcmpset on mips to not loop.

kevans updated this revision to Diff 62725.Sep 29 2019, 9:36 PM

An earlier version of the mips fcmpset fix snuck in here... Will fix in another revision, just putting this one out.

Key points:

  • addressed various review comments
  • fixed include guard
  • shuffled systm.h around to allow use of KASSERT in machine/atomic.h ... can scratch this idea if it looks like a terrible one
  • Translated KASSERT to userland assert
kevans added inline comments.Sep 30 2019, 3:33 AM
sys/sys/_atomic_subword.h
168 ↗(On Diff #62725)

This seems to have snuck in during local merge conflicts, detected in testing... Will fix after it's decided whether or not systm.h shuffling is OK.

kib added inline comments.Sep 30 2019, 2:29 PM
sys/sys/systm.h
44 ↗(On Diff #62725)

I would leave systm.h alone and not used KASSERT() in atomic.h. We do not do that on other arches. Also, I suspect that unaligned access would be reported by hw regardless.

kevans updated this revision to Diff 62758.Sep 30 2019, 3:57 PM

Stop mucking with sys/systm.h, drop all assertions and fix the s/HWORD/BYTE/ typo that snuck in.

kib accepted this revision.Oct 1 2019, 2:50 PM
This revision is now accepted and ready to land.Oct 1 2019, 2:50 PM
jhibbits accepted this revision.Oct 1 2019, 3:45 PM
jhibbits added a subscriber: jhibbits.

This will simplify D21682 a bit, as well. PowerPC has sub-word atomics since PowerISA 2.06, so we'll need dual personality there.

mjg added a comment.EditedOct 1 2019, 4:02 PM

I think this looks fine, but would preferably take care of all ports which are currently missing these atomics. There are reviews pending for powerpc and arm but they seem a little stalled. Thus the proposed solution is to just add this for everyone and let maintainers remove it as they provide their own version.

With this in mind, I think a passing a passing make universe with the above addition + D21626 mixed with at least sanity testing by booting would make the patch more qualified for commit.