Suppose that userspace is executing with non-standard segment descriptors. Then, until the exception or interrupt handler has executed SET_KERNEL_SEGS, the kernel is still running with the user %ds, %es and %fs. If an interrupt occurs in this window, the interrupt handler executes unsafely. If the interrupt results in a context switch, the contamination of the kernel state spreads to the newly switched-to thread. As a result, kernel data accesses might fault or, worse, if only the segment base was changed, silently go to the wrong memory.
Moreover, if the user segment was allocated in the LDT, another thread might mark the descriptor as invalid before the doreti code tries to reload the segment registers. In this case the kernel panics.
The issue exists for all exception entry points that use a trap gate, and thus do not automatically disable interrupts on entry, and for lcall_handler.
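To illustrate why the gate type matters, here is a minimal C sketch of how a 32-bit IDT gate descriptor encodes its type. The type values 0xE (interrupt gate, the CPU clears EFLAGS.IF on entry) and 0xF (trap gate, EFLAGS.IF is left unchanged) are architectural. The names make_gate, gate_type and gate_masks_interrupts are hypothetical helpers for this example only and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural 4-bit type codes for 32-bit IDT gates. */
#define GATE_TYPE_INTR32 0xE	/* interrupt gate: EFLAGS.IF cleared on entry */
#define GATE_TYPE_TRAP32 0xF	/* trap gate: EFLAGS.IF left unchanged */

/*
 * Hypothetical helper: pack a present 32-bit gate descriptor.
 * Layout: offset 15:0, selector, reserved byte, type, S=0, DPL, P,
 * offset 31:16.
 */
static uint64_t
make_gate(uint32_t offset, uint16_t selector, unsigned type, unsigned dpl)
{
	uint64_t d = 0;

	assert(type <= 0xf && dpl <= 3);
	d |= (uint64_t)(offset & 0xffff);	/* bits 0-15: offset low */
	d |= (uint64_t)selector << 16;		/* bits 16-31: code selector */
	d |= (uint64_t)type << 40;		/* bits 40-43: gate type */
	d |= (uint64_t)dpl << 45;		/* bits 45-46: DPL */
	d |= (uint64_t)1 << 47;			/* bit 47: present */
	d |= (uint64_t)(offset >> 16) << 48;	/* bits 48-63: offset high */
	return (d);
}

/* Extract the gate type field from a packed descriptor. */
static unsigned
gate_type(uint64_t d)
{
	return ((unsigned)((d >> 40) & 0xf));
}

/* Nonzero if entry through this gate starts with interrupts disabled. */
static int
gate_masks_interrupts(uint64_t d)
{
	return (gate_type(d) == GATE_TYPE_INTR32);
}
```

In this sketch the two gate kinds differ by a single bit in the type field, yet only the interrupt gate guarantees that the handler starts with interrupts disabled.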
The fix is twofold. First, we need to disable interrupts for all kernel entries, by changing the IDT descriptor types from trap gate to interrupt gate. Interrupts are re-enabled no earlier than the point where the kernel segments are loaded into the segment registers. Second, we only load the segment registers from the trap frame when returning to usermode. For the latter, all interrupt return paths must go through the common doreti code.
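The second part of the fix can be modelled in C roughly as follows. This is only a sketch under stated assumptions: the real return path is assembly, the struct and function names here are hypothetical, and vm86 mode is deliberately ignored. The one architectural fact used is that the low two bits of the saved %cs selector (the RPL) are nonzero when the interrupted code ran in usermode.

```c
#include <assert.h>
#include <stdint.h>

#define SEL_RPL_MASK 0x3	/* low two bits of a selector: privilege level */

/* Hypothetical, reduced model of the segment state. */
struct segs_sketch {
	uint16_t ds, es, fs;
};

/* Hypothetical, reduced model of a trap frame. */
struct frame_sketch {
	struct segs_sketch saved;	/* segment registers saved on entry */
	uint16_t cs;			/* code selector of interrupted code */
};

/* Nonzero if the frame describes a return to usermode (vm86 ignored). */
static int
returns_to_usermode(const struct frame_sketch *f)
{
	return ((f->cs & SEL_RPL_MASK) != 0);
}

/*
 * Model of the return path: the saved segment registers are reloaded
 * only when going back to usermode; for a return to kernel mode the
 * live (kernel) segment registers are left untouched.
 */
static struct segs_sketch
segs_after_return(const struct frame_sketch *f, struct segs_sketch live)
{
	assert(f != 0);
	if (returns_to_usermode(f))
		return (f->saved);
	return (live);
}
```

With this rule, a stale or invalidated user descriptor saved in a frame for an interrupted kernel context is never loaded, because that frame's %cs has RPL 0.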
There is no way to disable interrupts on a call gate, which is the intended mode of operation for lcall $7,$0 syscalls. Change LDT entry 0 into a code segment descriptor and point it at a userspace trampoline which redirects the syscall to int $0x80.
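For readers wondering why selector 7 refers to LDT entry 0, the following C sketch decodes a segment selector according to the architectural layout (bits 0-1 RPL, bit 2 table indicator, bits 3-15 index). The function names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a selector from its three fields. */
static uint16_t
make_selector(unsigned index, int in_ldt, unsigned rpl)
{
	assert(index < 8192 && rpl <= 3);
	return ((uint16_t)((index << 3) | (in_ldt ? 4u : 0u) | rpl));
}

/* Descriptor table index encoded in the selector. */
static unsigned
sel_index(uint16_t sel)
{
	return (sel >> 3);
}

/* Nonzero if the selector refers to the LDT rather than the GDT. */
static int
sel_is_ldt(uint16_t sel)
{
	return ((sel >> 2) & 1);
}

/* Requested privilege level encoded in the selector. */
static unsigned
sel_rpl(uint16_t sel)
{
	return (sel & 3);
}
```

Selector 7 is binary 111, so it decodes to index 0, table indicator LDT, RPL 3, which is the descriptor slot that the change above repurposes.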
All these measures make the segment register handling similar to that of amd64. We do not apply the amd64 optimization of not reloading segment registers on return from a syscall.
Reported by: Maxime Villard <max@m00nbsd.net>
Tested by: pho