Page MenuHomeFreeBSD

amd64: Only update fsbase/gsbase in pcb for curthread.
ClosedPublic

Authored by jhb on Mar 3 2021, 12:12 AM.

Details

Summary

Before the pcb is copied to the new thread during cpu_fork() and
cpu_copy_thread(), the kernel re-reads the current register values in
case they are stale. This is done by setting PCB_FULL_IRET in
pcb_flags.

This works fine for user threads, but the creation of kernel processes
and kernel threads do not follow the normal synchronization rules for
pcb_flags. Specifically, new kernel processes are always forked from
thread0, not from curthread, so adjusting pcb_flags via a simple
instruction without the LOCK prefix can race with thread0 running on
another CPU. Similarly, kthread_add() clones from the first thread in
the relevant kernel process, not from curthread. In practice, Netflix
encountered a panic where the pcb_flags in the first kthread of the
KTLS process were trashed due to update_pcb_bases() in
cpu_copy_thread() running from thread0 to create one of the other KTLS
threads racing with the first KTLS kthread calling fpu_kern_thread()
on another CPU. In the panicking case, the write to update pcb_flags
in fpu_kern_thread() was lost triggering an "Unregistered use of FPU
in kernel" panic when the first KTLS kthread later tried to use the
FPU.

Diff Detail

Repository
R10 FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

jhb requested review of this revision.Mar 3 2021, 12:12 AM

I haven't yet tested this series of patches beyond verifying that it compiles.

sys/amd64/amd64/vm_machdep.c
169–170

i386 has a similar check here before calling rgs().

Should we explicitly set fsbase/gsbase for remote threads to some sane value, or assert that td1 != curthread always implies that td1 is kernel thread?

sys/amd64/amd64/vm_machdep.c
168

This call to fpuexit() also does nothing useful is td1 is a remote thread.

If we added the assert we should by rights do it in all the cpu_fork/cpu_copy_thread impls.

The other more extreme option we could take is to not hardcode the source thread for these things but instead always use curthread, but assert that curthread is a kernel thread (TDP_KTHREAD) in kproc_create() and kthread_add(). This would remove the need for the various curthread checks.

sys/amd64/amd64/vm_machdep.c
168

This call to fpuexit() also does nothing useful is td1 is a remote thread.

Yes, it does the curthread check internally. We could move the check out into this caller which might be cleaner if it's the only place that matters.

In D29023#650449, @jhb wrote:

If we added the assert we should by rights do it in all the cpu_fork/cpu_copy_thread impls.

Perhaps ok.

The other more extreme option we could take is to not hardcode the source thread for these things but instead always use curthread, but assert that curthread is a kernel thread (TDP_KTHREAD) in kproc_create() and kthread_add(). This would remove the need for the various curthread checks.

Would this work? For instance, when we kldload a driver which wants to spawn a kernel process or thread.

sys/amd64/amd64/vm_machdep.c
168

I mean that if td1 != curthread then both fpuexit() does nothing and potentially we do not get the up to date FPU state.

In D29023#650477, @kib wrote:
In D29023#650449, @jhb wrote:

The other more extreme option we could take is to not hardcode the source thread for these things but instead always use curthread, but assert that curthread is a kernel thread (TDP_KTHREAD) in kproc_create() and kthread_add(). This would remove the need for the various curthread checks.

Would this work? For instance, when we kldload a driver which wants to spawn a kernel process or thread.

Yeah, it wouldn't. I had the same thought yesterday that some drivers create task queue threads from module handlers and those run in the context of a user thread.

sys/amd64/amd64/vm_machdep.c
168

Yes. Arguably I think kthreads/kprocs should really start with a clean slate regardless. Perhaps what we should be doing (assuming TDP_KTHREAD is set before cpu_fork() or cpu_thread_copy() is called) is something like:

if (td->td_pflags & TDP_KTHREAD) {
    /* set clean state for FPU and seg bases. */
} else {
    MPASS(td1 == curthread);
    /* inherit from parent */
}

I think I would probably do that as a followup to this specific fix though since it would have to be across all the various architectures.

sys/amd64/amd64/vm_machdep.c
168

I believe that the snipped you pasted above is the right approach.

sys/amd64/amd64/vm_machdep.c
168

I believe that the snipped you pasted above is the right approach.

Can I merge this commit first at least and do the larger fixup as a followup as Netflix is seeing a panic on a non-trivial number of their machines due to this?

sys/amd64/amd64/vm_machdep.c
168

Of course.

This revision was not accepted when it landed; it landed in state Needs Review.Mar 12 2021, 5:49 PM
This revision was automatically updated to reflect the committed changes.