Page MenuHomeFreeBSD

Provide protection against starvation of the ll/sc loops when accessing userpace.
ClosedPublic

Authored by kib on Jun 26 2019, 5:41 PM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Dec 25, 8:38 PM
Unknown Object (File)
Thu, Dec 19, 11:40 AM
Unknown Object (File)
Fri, Dec 13, 7:41 PM
Unknown Object (File)
Thu, Dec 12, 5:51 PM
Unknown Object (File)
Thu, Dec 12, 1:02 PM
Unknown Object (File)
Mon, Dec 9, 9:39 PM
Unknown Object (File)
Mon, Dec 9, 7:55 PM
Unknown Object (File)
Dec 2 2024, 10:17 PM

Details

Summary

Casueword(9) on ll/sc architectures must be prepared for userspace constantly modifying the same cache line as containing the CAS word, and not loop infinitely. Otherwise, rogue userspace livelock kernel.

To fix the issue, change casueword(9) interface to return new value 1 indicating that either comparision or store failed, instead of relying on the oldval == *oldvalp comparison. The primitive no longer retries the operation if it failed spuriously. Modify callers of casueword(9) (all in kern_umtx.c) to handle retries, and react to stops and requests to terminate between retries.

On x86, despite cmpxchg should not return spurious failures, we can take advantage of new interface and just return PSL.ZF.

Sponsored by: The FreeBSD Foundation
Reported by: https://xenbits.xen.org/xsa/advisory-295.txt

Test Plan

I only booted on x86 and did some sanity checks like running libthr kyua tests.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 25091

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
kib marked 2 inline comments as done.Jun 26 2019, 6:20 PM

Update ppc inline asm comment.

I'm happy with the arm and arm64 changes.

sys/arm64/arm64/support.S
71

Extra newline

This revision is now accepted and ready to land.Jun 26 2019, 6:28 PM
kib marked an inline comment as done.

Remove extra empty line.

This revision now requires review to proceed.Jun 26 2019, 7:03 PM

I ran all of the threaded tests I have, twice. I also did a buildworld / installworld. This on amd64.
I'll run some i386 tests once my test box is available.

share/man/man9/casuword.9
110

Typo in "comparison."

sys/kern/kern_umtx.c
1057

How can we have rv != 1 at this point?

1087

I think this block is dead code now. Previously, casueword32 would return 0 even if the comparison failed. Now it returns 1 in that case, but here we know that rv == 0, so it must be that owner == UMUTEX_UNOWNED. Am I missing something?

kib marked 3 inline comments as done.Jun 27 2019, 6:38 PM
kib added inline comments.
sys/kern/kern_umtx.c
1087

Sure, this is the bug. The block must not be dead, I moved the checks for rv == 1 further. We must handle UMUTEX_CONTESTED there.

kib marked an inline comment as done.

Hopefully fix two bugs.

One pointed out by Mark and related to missed handling of contended mutexes. Another reported by Peter where I restarted inner loop of sem2_wait with unbusied chain, while loop expects it busy.

FWIW I have sent links to this change to OpenBSD, NetBSD and DragonFly BSD developers.

mips64 (with CHERI) boots with this patch applied.

sys/kern/kern_umtx.c
2707

unrelated style change should probably be a separate commit (also the only conflict against CheriBSD)

sys/i386/i386/copyout.c
440

Isn't this inverted? atomic_(f)cmpset returns 1 on success, 0 on failure.

sys/i386/i386/copyout.c
440

Yes it is, it was reported by Peter earlier. I fixed it in my branch, will upload new patch after pho finishes.

sys/kern/kern_umtx.c
1159

Why not just continue?

1169

This condition is always true.

1218

It is guaranteed that old == owner here.

1254

It is guaranteed that old == owner here.

1313

If error == 0 then we must have owner == UMUTEX_CONTESTED, so (owner & ~UMUTEX_CONTESTED) == 0 is always true.

1378

If error == 0 we know that old == owner.

1895

What if rv == 1?

kib marked 9 inline comments as done.Jun 28 2019, 8:45 PM
kib added inline comments.
sys/kern/kern_umtx.c
1313

error == 0 case can happen when count > 1, in which case we did not executed casueword32() at all. And then either owner == UMUTEX_CONTESTED or this is a dying robust mutex.

I think it would be cleaner just to move the part after second '&&' into MPASS().

kib marked an inline comment as done.

Fix i386 casueword.
Handle Mark' notes.
Committed style change.

On i386 I ran a buildworld / installworld + random stress2 tests for a total of 10H30.
On amd64 I ran random stress2 test for 10H30.
No problems seen.

Observations like "this condition is always true" and "it is guaranteed that old == owner here" should ideally be verified by static code analysis. I'd try to apply the patch and run PVS-Studio over it.

Observations like "this condition is always true" and "it is guaranteed that old == owner here" should ideally be verified by static code analysis. I'd try to apply the patch and run PVS-Studio over it.

In theory it can be, but in reality it would not work by itself. You must provide casueword(9) model to the tool, explaining how oldval/*oldvalp/newval and the return values are related (see patched man page). I know that Coverity has a way to augment semantic of the analyzed program this way, but I do not know how. No idea about PVS.

sys/kern/kern_umtx.c
1946

This code is unreachable.

2005

This condition must be true.

2921

The line wraps could be committed separately.

2934

It seems inconsistent that we sleep here.

3116

If oldstate != state and rv != -1, then rv == 1.

3136

Same here.

3344

This test is redundant.

3409

This should be an else if.

kib marked 7 inline comments as done.

Handle reported issues, except the sleep in umtxq_check_susp.
Rebase after style changes were committed.

Original idea of the sleepable vs. non-sleepable umtxq_check_susp() was that existing (pre-patch) calls should be left non-sleepable, while new cases where rv == 1 are sleepable. Apparently this is too strict (wrong) and cases where we loop or own a busy ref on a key should be non-sleepable as well.

I will do the pass over the patch to adjust it according to the stated rules later.

sys/arm64/arm64/support.S
69

Hmm, is the value in w5 overwritten by SET_FAULT_HANDLER?

sys/riscv/riscv/support.S
71

Hmm, a5 seems to have been overwritten twice with temporary addresses (in EXIT_USER_ACCESS and SET_FAULT_HANDLER).

kib marked 2 inline comments as done.

Fix arm64 and riscv asms.
Add a long comment explaining the sleep argument to umtxq_check_susp().
Do pass over the sleep arguments used aligning them with the explanation.

sys/riscv/riscv/support.S
64

Suppose we follow the branch. Aren't we going to return the uninitialized contents of a5?

69

Why not use beqz?

kib marked 2 inline comments as done.

Fix issues with risc-v casueword.

sys/arm64/arm64/support.S
69

I think w5's contents are uninitialized if we follow the branch to label 2.

90

Same here.

sys/kern/kern_umtx.c
695

s/operations/operation/

2150

The key is busied here, so we cannot sleep.

2226–2228

Whitespace looks wrong here.

2227

If rv == 0 we must have owner == UMUTEX_CONTESTED.

kib marked 5 inline comments as done.Jul 6 2019, 4:48 PM
kib added inline comments.
sys/kern/kern_umtx.c
2227

I re-wrote the logic for rv == 1 completely.

kib marked an inline comment as done.

do_lock_pp and arm64 casueword fixes.

This revision is now accepted and ready to land.Jul 8 2019, 11:24 PM

Thank you, Mark.

Peter, could you, please, run the tests one more time, hopefully this is the last.

In D20772#452692, @kib wrote:

Thank you, Mark.

Peter, could you, please, run the tests one more time, hopefully this is the last.

Sure, happy to.
A full test on both amd64 and i386, right?

Bugfixes after Peter' testing.

This revision now requires review to proceed.Jul 12 2019, 12:11 PM
markj added inline comments.
sys/kern/kern_umtx.c
1907

Typo: signal

This revision is now accepted and ready to land.Jul 12 2019, 2:01 PM