Page MenuHomeFreeBSD

Provide protection against starvation of the ll/sc loops when accessing userpace.
ClosedPublic

Authored by kib on Jun 26 2019, 5:41 PM.

Details

Summary

Casueword(9) on ll/sc architectures must be prepared for userspace constantly modifying the same cache line as containing the CAS word, and not loop infinitely. Otherwise, rogue userspace livelock kernel.

To fix the issue, change casueword(9) interface to return new value 1 indicating that either comparision or store failed, instead of relying on the oldval == *oldvalp comparison. The primitive no longer retries the operation if it failed spuriously. Modify callers of casueword(9) (all in kern_umtx.c) to handle retries, and react to stops and requests to terminate between retries.

On x86, despite cmpxchg should not return spurious failures, we can take advantage of new interface and just return PSL.ZF.

Sponsored by: The FreeBSD Foundation
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt

Test Plan

I only booted on x86 and did some sanity checks like running libthr kyua tests.

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
kib marked 2 inline comments as done.Jun 26 2019, 6:20 PM
kib updated this revision to Diff 59072.

Update ppc inline asm comment.

andrew accepted this revision.Jun 26 2019, 6:28 PM

I'm happy with the arm and arm64 changes.

sys/arm64/arm64/support.S
71 ↗(On Diff #59071)

Extra newline

This revision is now accepted and ready to land.Jun 26 2019, 6:28 PM
mhorne added a subscriber: mhorne.Jun 26 2019, 7:02 PM
kib marked an inline comment as done.Jun 26 2019, 7:03 PM
kib updated this revision to Diff 59074.

Remove extra empty line.

This revision now requires review to proceed.Jun 26 2019, 7:03 PM
pho added a comment.Jun 27 2019, 10:59 AM

I ran all of the threaded tests I have, twice. I also did a buildworld / installworld. This on amd64.
I'll run some i386 tests once my test box is available.

markj added inline comments.Jun 27 2019, 3:02 PM
share/man/man9/casuword.9
110 ↗(On Diff #59071)

Typo in "comparison."

sys/kern/kern_umtx.c
1057 ↗(On Diff #59074)

How can we have rv != 1 at this point?

1089 ↗(On Diff #59074)

I think this block is dead code now. Previously, casueword32 would return 0 even if the comparison failed. Now it returns 1 in that case, but here we know that rv == 0, so it must be that owner == UMUTEX_UNOWNED. Am I missing something?

kib marked 3 inline comments as done.Jun 27 2019, 6:38 PM
kib added inline comments.
sys/kern/kern_umtx.c
1089 ↗(On Diff #59074)

Sure, this is the bug. The block must not be dead, I moved the checks for rv == 1 further. We must handle UMUTEX_CONTESTED there.

kib marked an inline comment as done.Jun 27 2019, 6:40 PM
kib updated this revision to Diff 59112.

Hopefully fix two bugs.

One pointed out by Mark and related to missed handling of contended mutexes. Another reported by Peter where I restarted inner loop of sem2_wait with unbusied chain, while loop expects it busy.

FWIW I have sent links to this change to OpenBSD, NetBSD and DragonFly BSD developers.

mips64 (with CHERI) boots with this patch applied.

sys/kern/kern_umtx.c
2707 ↗(On Diff #59112)

unrelated style change should probably be a separate commit (also the only conflict against CheriBSD)

markj added inline comments.Jun 28 2019, 6:18 PM
sys/i386/i386/copyout.c
440 ↗(On Diff #59112)

Isn't this inverted? atomic_(f)cmpset returns 1 on success, 0 on failure.

kib added inline comments.Jun 28 2019, 6:25 PM
sys/i386/i386/copyout.c
440 ↗(On Diff #59112)

Yes it is, it was reported by Peter earlier. I fixed it in my branch, will upload new patch after pho finishes.

markj added inline comments.Jun 28 2019, 6:52 PM
sys/kern/kern_umtx.c
1159 ↗(On Diff #59112)

Why not just continue?

1169 ↗(On Diff #59112)

This condition is always true.

1218 ↗(On Diff #59112)

It is guaranteed that old == owner here.

1254 ↗(On Diff #59112)

It is guaranteed that old == owner here.

1313 ↗(On Diff #59112)

If error == 0 then we must have owner == UMUTEX_CONTESTED, so (owner & ~UMUTEX_CONTESTED) == 0 is always true.

1378 ↗(On Diff #59112)

If error == 0 we know that old == owner.

1895 ↗(On Diff #59112)

What if rv == 1?

kib marked 9 inline comments as done.Jun 28 2019, 8:45 PM
kib added inline comments.
sys/kern/kern_umtx.c
1313 ↗(On Diff #59112)

error == 0 case can happen when count > 1, in which case we did not executed casueword32() at all. And then either owner == UMUTEX_CONTESTED or this is a dying robust mutex.

I think it would be cleaner just to move the part after second '&&' into MPASS().

kib marked an inline comment as done.Jun 28 2019, 8:46 PM
kib updated this revision to Diff 59172.

Fix i386 casueword.
Handle Mark' notes.
Committed style change.

pho added a comment.Jun 30 2019, 5:56 AM

On i386 I ran a buildworld / installworld + random stress2 tests for a total of 10H30.
On amd64 I ran random stress2 test for 10H30.
No problems seen.

danfe added a subscriber: danfe.Jun 30 2019, 9:17 AM

Observations like "this condition is always true" and "it is guaranteed that old == owner here" should ideally be verified by static code analysis. I'd try to apply the patch and run PVS-Studio over it.

kib added a comment.Jun 30 2019, 11:45 AM

Observations like "this condition is always true" and "it is guaranteed that old == owner here" should ideally be verified by static code analysis. I'd try to apply the patch and run PVS-Studio over it.

In theory it can be, but in reality it would not work by itself. You must provide casueword(9) model to the tool, explaining how oldval/*oldvalp/newval and the return values are related (see patched man page). I know that Coverity has a way to augment semantic of the analyzed program this way, but I do not know how. No idea about PVS.

markj added inline comments.Jul 2 2019, 5:01 PM
sys/kern/kern_umtx.c
1948 ↗(On Diff #59172)

This code is unreachable.

2007 ↗(On Diff #59172)

This condition must be true.

2923 ↗(On Diff #59172)

The line wraps could be committed separately.

2936 ↗(On Diff #59172)

It seems inconsistent that we sleep here.

3118 ↗(On Diff #59172)

If oldstate != state and rv != -1, then rv == 1.

3138 ↗(On Diff #59172)

Same here.

3346 ↗(On Diff #59172)

This test is redundant.

3411 ↗(On Diff #59172)

This should be an else if.

kib marked 7 inline comments as done.Jul 2 2019, 9:37 PM
kib updated this revision to Diff 59330.

Handle reported issues, except the sleep in umtxq_check_susp.
Rebase after style changes were committed.

kib added a comment.Jul 2 2019, 9:40 PM

Original idea of the sleepable vs. non-sleepable umtxq_check_susp() was that existing (pre-patch) calls should be left non-sleepable, while new cases where rv == 1 are sleepable. Apparently this is too strict (wrong) and cases where we loop or own a busy ref on a key should be non-sleepable as well.

I will do the pass over the patch to adjust it according to the stated rules later.

jilles added inline comments.Jul 2 2019, 10:09 PM
sys/arm64/arm64/support.S
69 ↗(On Diff #59172)

Hmm, is the value in w5 overwritten by SET_FAULT_HANDLER?

sys/riscv/riscv/support.S
71 ↗(On Diff #59172)

Hmm, a5 seems to have been overwritten twice with temporary addresses (in EXIT_USER_ACCESS and SET_FAULT_HANDLER).

kib marked 2 inline comments as done.Jul 3 2019, 9:36 PM
kib updated this revision to Diff 59373.

Fix arm64 and riscv asms.
Add a long comment explaining the sleep argument to umtxq_check_susp().
Do pass over the sleep arguments used aligning them with the explanation.

markj added inline comments.Jul 4 2019, 7:25 PM
sys/riscv/riscv/support.S
64 ↗(On Diff #59373)

Suppose we follow the branch. Aren't we going to return the uninitialized contents of a5?

66 ↗(On Diff #59373)

Why not use beqz?

kib marked 2 inline comments as done.Jul 4 2019, 9:28 PM
kib updated this revision to Diff 59427.

Fix issues with risc-v casueword.

markj added inline comments.Jul 5 2019, 9:58 PM
sys/arm64/arm64/support.S
69 ↗(On Diff #59427)

I think w5's contents are uninitialized if we follow the branch to label 2.

90 ↗(On Diff #59427)

Same here.

sys/kern/kern_umtx.c
695 ↗(On Diff #59427)

s/operations/operation/

2162 ↗(On Diff #59427)

The key is busied here, so we cannot sleep.

2238 ↗(On Diff #59427)

Whitespace looks wrong here.

2239 ↗(On Diff #59427)

If rv == 0 we must have owner == UMUTEX_CONTESTED.

kib marked 5 inline comments as done.Jul 6 2019, 4:48 PM
kib added inline comments.
sys/kern/kern_umtx.c
2239 ↗(On Diff #59427)

I re-wrote the logic for rv == 1 completely.

kib marked an inline comment as done.Jul 6 2019, 4:49 PM
kib updated this revision to Diff 59471.

do_lock_pp and arm64 casueword fixes.

markj accepted this revision.Jul 8 2019, 11:24 PM
This revision is now accepted and ready to land.Jul 8 2019, 11:24 PM
kib added a comment.Jul 9 2019, 7:06 AM

Thank you, Mark.

Peter, could you, please, run the tests one more time, hopefully this is the last.

pho added a comment.Jul 9 2019, 7:28 AM
In D20772#452692, @kib wrote:

Thank you, Mark.
Peter, could you, please, run the tests one more time, hopefully this is the last.

Sure, happy to.
A full test on both amd64 and i386, right?

kib updated this revision to Diff 59685.Jul 12 2019, 12:11 PM

Bugfixes after Peter' testing.

This revision now requires review to proceed.Jul 12 2019, 12:11 PM
markj accepted this revision.Jul 12 2019, 2:01 PM
markj added inline comments.
sys/kern/kern_umtx.c
1907 ↗(On Diff #59685)

Typo: signal

This revision is now accepted and ready to land.Jul 12 2019, 2:01 PM