Page MenuHomeFreeBSD

linux(4): Preserve floating-point on signal delivery on amd64
AbandonedPublic

Authored by dchagin on Dec 21 2021, 11:54 AM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Apr 12, 9:00 AM
Unknown Object (File)
Dec 25 2023, 7:02 AM
Unknown Object (File)
Dec 23 2023, 12:43 AM
Unknown Object (File)
Dec 22 2023, 10:54 PM
Unknown Object (File)
Dec 17 2023, 6:17 PM
Unknown Object (File)
Dec 13 2023, 4:15 PM
Unknown Object (File)
Nov 20 2023, 8:31 PM
Unknown Object (File)
Nov 6 2023, 4:39 AM
Subscribers

Details

Reviewers
kib
trasz
Group Reviewers
Linux Emulation
Summary

Previously, the FP CPU context was not preserved when delivering
a signal to Linux processes on amd64. This could cause random
segfaults, especially so since GCC started stashing integer
registers into FPU instead of spilling them to stack.

This fixes random crashes when running Minecraft server 1.15.1
under Focal (openjdk-14-jdk:amd64 14.0.2+12-1~20.04).

Sponsored By: EPSRC

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 44435
Build 41323: arc lint + arc unit

Event Timeline

sys/amd64/include/md_var.h
37 ↗(On Diff #100412)

Don't do this.

sys/amd64/linux/linux.h
236

https://github.com/torvalds/linux/blob/6e0567b7305209c2d689ce57180a63d8dc657ad8/arch/x86/kernel/signal.c#L707

This is how signal frame for amd64 is described in the Linux sources comment, and it seems, after, some reading, indeed matching the code. Instead of corrupting l_ucontext definition, please implement it properly. In essence, the key part is the alignment/padding between ucontext and start of the fpu save area (our and Linux' should be compatible).

sys/amd64/linux/linux.h
236

The reason I wanted to avoid that is that this essentially requires Linux-specific variant of {get,set}_fpcontext(). Or are you thinking of some other way?

sys/amd64/linux/linux.h
236

It is not, I believe (I did not do experiments to say it definitely). Their save area is compatible with ours, they just align it.

sys/amd64/linux/linux.h
236

So, I've just looked again into this, and I don't think this is practical. We can't copy FPU context from native mcontext_t into l_sigcontext, because there's no room: l_sigcontext only contains a pointer to FPU data (pointer to fpstate), not fpstate itself; looks like Linux simply puts the FPU data on the stack after l_ucontext: https://github.com/torvalds/linux/blob/6e0567b7305209c2d689ce57180a63d8dc657ad8/arch/x86/kernel/signal.c#L192.

From what I understand, the whole idea of xsave is that its length is not fixed, so what's the problem with stashing our native mcontext_t there?

Also, for a more general question: why even bother? Does userspace actually access that data, or otherwise make assumptions about it? For arm64 we don't even bother with using linux_specific stack layout, instead we use native FreeBSD mcontext_t and it works fine.

Also, for a more general question: why even bother? Does userspace actually access that data, or otherwise make assumptions about it? For arm64 we don't even bother with using linux_specific stack layout, instead we use native FreeBSD mcontext_t and it works fine.

Because userspace does look at that save areas. I think you would get very hard to diagnose issues with runtimes of JVM/Go/any other advanced language.

sys/amd64/linux/linux.h
236

There is so called fixed area in XSAVE area that contains always present x87 FPU registers, and %XMM and related SSE2 state. Linux ucontext provides fixed locations for these registers:

typedef struct ucontext_t
  {
    unsigned long int __ctx(uc_flags);
    struct ucontext_t *uc_link;
    stack_t uc_stack;
    mcontext_t uc_mcontext;
    sigset_t uc_sigmask;
    struct _libc_fpstate __fpregs_mem;
    __extension__ unsigned long long int __ssp[4];
  } ucontext_t;

And

typedef struct
  {
    gregset_t __ctx(gregs);
    /* Note that fpregs is a pointer.  */
    fpregset_t __ctx(fpregs);
    __extension__ unsigned long long __reserved1 [8];
} mcontext_t;
struct _libc_fpstate
{
  /* 64-bit FXSAVE format.  */
  __uint16_t            __ctx(cwd);
  __uint16_t            __ctx(swd);
  __uint16_t            __ctx(ftw);
  __uint16_t            __ctx(fop);
  __uint64_t            __ctx(rip);
  __uint64_t            __ctx(rdp);
  __uint32_t            __ctx(mxcsr);
  __uint32_t            __ctx(mxcr_mask);
  struct _libc_fpxreg   _st[8];
  struct _libc_xmmreg   _xmm[16];
  __uint32_t            __glibc_reserved1[24];
};

I just looked into the filled context: uc->uc_mcontext.fpregs points to uc->__fpregs_mem.mxcsr.

_libc_fpstate.st[] are x87 %st registers. _libc_fpstate._xmm are %xmm registers. Basically you need to copy the same names from our fpusave area to linux' one.

This is ignoring XSAVE, i.e. AVX and larger extensiones. Lets do the FXSAVE right first.

Get rid of embedded mcontext_t; instead put FP state into Linux sigcontext_t.
Work in progress, as this still uses old method of handling XSAVE.

trasz edited the summary of this revision. (Show Details)

I suggest you to remove XSAVE handling. Doing it as a followup would be simpler and cleaner.

sys/amd64/linux/linux_sysvec.c
713

Move all these CTASSERTs() to the top level. And perhaps use _Static_assert().

Axe xsave for now, only call set_fpcontext if we have fpcontext.

sys/amd64/linux/linux_sysvec.c
713

How strongly you want it? Personally I'd prefer those like they are now; when you look at the memcpy the question is "how do we know those are of equal size, and CTASSERsT just above answers that. It also makes it possible to use variable names instead of types. Also, _Static_assert requires a text message, and that doesn't really add anything here except clutter.

Globalization of get/set_mcontext() should be committed separately.

Overall it seems to be good.

sys/amd64/linux/linux_sysvec.c
713

I prefer such asserts around definitions, and not scattered randomly in a code that uses the type.

728

This is unrelated, please commit separately.

trasz marked an inline comment as done.Jan 4 2022, 2:46 PM
trasz added inline comments.
sys/amd64/linux/linux_sysvec.c
713

Normally I prefer that too, but I feel this case is different: it's not that any of those two structures _must_ be of that size; it's just that this particular piece of code is making that assumption.

Regen after committing unrelated parts.

sys/amd64/linux/linux_sysvec.c
713

I do not agree, these are arrays of the register' content.

717

Space before '='.
The comment is right, the code is wrong.

sys/amd64/linux/linux_sysvec.c
713

Okay, let's say those assertions are not here and you see this code - in particular this memcpy() - for the first time. Assertion at the top, assuming you find it, won't detect that I got the types mixed up, and that's probably the number one thing I want this assertion for: there's no point in asserting that a 32-element array indeed is 32-elements long; I want to assert that _this_ particular array - identified by its name, not type - is what I expect it to be.

sys/amd64/linux/linux_sysvec.c
713

Static asserts only state something about data layout, they are not about code being executed.

sys/amd64/linux/linux_sysvec.c
713

In this case they are about the code being read by human. They say "this memcpy is ok, because both sizes match". Without them, that's not clear.

dchagin abandoned this revision.
dchagin added a reviewer: trasz.

done