Page MenuHomeFreeBSD

amd64: Fix propagation of LDT updates
AcceptedPublic

Authored by markj on Mon, Jun 7, 8:32 PM.

Details

Reviewers
kib
Summary

A non-default LDT is inherited by child processes when RFMEM is
specified, on both i386 and amd64. For i386, commit 05dfa22fe94 fixed a
problem where LDT updates were not propagated to other sharers. This
appears to have been copied in the amd64 implementation.

On amd64 this logic has some problems:


- A process may set the LDT after a RFMEM child was already created.
In this case we only set the mdproc fields for the parent, but the
child may be forced to reload the LDT as well.
- A RFMEM child may set the LDT. In this case I believe the parent
should be left unmodified.
- On amd64 the LDT is only set once. Therefore there is no need to
propagate changes beyond the current process: a child has either
already inherited the LDT, or was forked before the LDT was set.

Fix the problem by only updating running threads in the calling process.
I believe the logic on i386 is also not quite right for similar reasons,
but there the third point does not apply.

Reported by: syzkaller

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint OK
Unit
No Unit Test Coverage
Build Status
Buildable 39804
Build 36693: arc lint + arc unit

Event Timeline

markj requested review of this revision.Mon, Jun 7, 8:32 PM

So what specifically the problem is? That the process might be forced to execute set_user_ldt() while md_ldt is actually still being set up?

I believe that the initial model _was supposed_ to be that LDT pairs with the vmspace and not process. This (was) esp. important for Linuxthreads, where several processes shared address space but were represented as threads from the userspace library. And this is confirmed by the still present code to reference parent LDT into child on cpu_fork.

So might be, the better solution is to check for p_md->md_ldt != NULL before calling set_user_ldt? The pointer cannot change back for the running process.

In D30685#689430, @kib wrote:

So what specifically the problem is? That the process might be forced to execute set_user_ldt() while md_ldt is actually still being set up?

Sorry for not explaining further. The problem is that set_user_ldt() may populate the LDT entry of the GDT with zeroes, and the subsequent lldt instruction raises #GP because the segment descriptor is not valid (I think because the descriptor type is wrong).

I believe that the initial model _was supposed_ to be that LDT pairs with the vmspace and not process. This (was) esp. important for Linuxthreads, where several processes shared address space but were represented as threads from the userspace library. And this is confirmed by the still present code to reference parent LDT into child on cpu_fork.

So does this mean that any change to the LDT must be shared among all sharing processes, regardless of their parent/child relationship?

So might be, the better solution is to check for p_md->md_ldt != NULL before calling set_user_ldt? The pointer cannot change back for the running process.

This is not really equivalent in the (very contrived) case where process A forks process B, both sharing an address space, and both processes set an LDT independently. I'm sure this is not important, but it seems strange to permit one process to force an LDT reload in another when the LDT is not in fact shared.

We could perhaps require that orig->p_md_md_ldt == target->p_md->md_ldt in the rendezvous operation. I suspect that this is the solution for i386.

In D30685#689430, @kib wrote:

So what specifically the problem is? That the process might be forced to execute set_user_ldt() while md_ldt is actually still being set up?

Sorry for not explaining further. The problem is that set_user_ldt() may populate the LDT entry of the GDT with zeroes, and the subsequent lldt instruction raises #GP because the segment descriptor is not valid (I think because the descriptor type is wrong).

Ah, md_ldt_sd is zero, right. So yes I think the solution is to read md_ldt and only do something if md_ldt != NULL. But the read of md_ldt should have acq semantic to pair with fence_rel() in user_ldt_alloc().

I believe that the initial model _was supposed_ to be that LDT pairs with the vmspace and not process. This (was) esp. important for Linuxthreads, where several processes shared address space but were represented as threads from the userspace library. And this is confirmed by the still present code to reference parent LDT into child on cpu_fork.

So does this mean that any change to the LDT must be shared among all sharing processes, regardless of their parent/child relationship?

In principle yes, but practically there was a leader that sets up LDT descriptor for TLS (or whatever, I never looked further) and which then rforks to create threads. So it is enough that the initial set up of LDT is visible, for practical case of rfork(RFMEM)+LDT. Having it working up as opposed to the down case I described, is probably not required and would cause too much complications.

So might be, the better solution is to check for p_md->md_ldt != NULL before calling set_user_ldt? The pointer cannot change back for the running process.

This is not really equivalent in the (very contrived) case where process A forks process B, both sharing an address space, and both processes set an LDT independently. I'm sure this is not important, but it seems strange to permit one process to force an LDT reload in another when the LDT is not in fact shared.

We could perhaps require that orig->p_md_md_ldt == target->p_md->md_ldt in the rendezvous operation. I suspect that this is the solution for i386.

What would be orig, the remote process that initiated the rendezvous for LDT reload? I think this is fine, if redundant. The excess reload should be innocent, after the check for md_ldt is added.

In D30685#689474, @kib wrote:
In D30685#689430, @kib wrote:

So what specifically the problem is? That the process might be forced to execute set_user_ldt() while md_ldt is actually still being set up?

Sorry for not explaining further. The problem is that set_user_ldt() may populate the LDT entry of the GDT with zeroes, and the subsequent lldt instruction raises #GP because the segment descriptor is not valid (I think because the descriptor type is wrong).

Ah, md_ldt_sd is zero, right. So yes I think the solution is to read md_ldt and only do something if md_ldt != NULL. But the read of md_ldt should have acq semantic to pair with fence_rel() in user_ldt_alloc().

Why does it use fence_rel() instead of a release store?

I believe that the initial model _was supposed_ to be that LDT pairs with the vmspace and not process. This (was) esp. important for Linuxthreads, where several processes shared address space but were represented as threads from the userspace library. And this is confirmed by the still present code to reference parent LDT into child on cpu_fork.

So does this mean that any change to the LDT must be shared among all sharing processes, regardless of their parent/child relationship?

In principle yes, but practically there was a leader that sets up LDT descriptor for TLS (or whatever, I never looked further) and which then rforks to create threads. So it is enough that the initial set up of LDT is visible, for practical case of rfork(RFMEM)+LDT. Having it working up as opposed to the down case I described, is probably not required and would cause too much complications.

So might be, the better solution is to check for p_md->md_ldt != NULL before calling set_user_ldt? The pointer cannot change back for the running process.

This is not really equivalent in the (very contrived) case where process A forks process B, both sharing an address space, and both processes set an LDT independently. I'm sure this is not important, but it seems strange to permit one process to force an LDT reload in another when the LDT is not in fact shared.

We could perhaps require that orig->p_md_md_ldt == target->p_md->md_ldt in the rendezvous operation. I suspect that this is the solution for i386.

What would be orig, the remote process that initiated the rendezvous for LDT reload? I think this is fine, if redundant. The excess reload should be innocent, after the check for md_ldt is added.

Right. I think it is slightly preferable to be precise there even if it's harmless in practice.

  • Compare initiator and target LDTs, and only reload if they match.
  • Use an acquire load in the comparison.

Why does it use fence_rel() instead of a release store?

I do not see a strong reason. Might be it allows for less casts. Or because it was added right after thread fences were added, so they were hot stuff, or even before _rel on amd64 become LOCK-less. Anyway I do not think it making any difference on amd64.

sys/amd64/amd64/sys_machdep.c
503

I would put a note that this acq sync/w rel in user_ldt_alloc().

This revision is now accepted and ready to land.Tue, Jun 8, 3:39 PM