Details

Reviewers

kib
trasz

Group Reviewers

Linux Emulation

Summary

Previously, the FP CPU context was not preserved when delivering
a signal to Linux processes on amd64. This could cause random
segfaults, especially so since GCC started stashing integer
registers into FPU instead of spilling them to stack.

This fixes random crashes when running Minecraft server 1.15.1
under Focal (openjdk-14-jdk:amd64 14.0.2+12-1~20.04).

Diff Detail

Repository

rS FreeBSD src repository - subversion

Lint

Lint Passed

Unit

No Test Coverage

Build Status

Buildable 44435
Build 41323: arc lint + arc unit

Event Timeline

trasz created this revision.Dec 21 2021, 11:54 AM

Herald added a subscriber: imp. · View Herald TranscriptDec 21 2021, 11:54 AM

trasz requested review of this revision.Dec 21 2021, 11:54 AM

Harbormaster completed remote builds in B43524: Diff 100412.Dec 21 2021, 11:54 AM

trasz added reviewers: kib, Linux Emulation.Dec 21 2021, 11:55 AM

(Also https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=240768)

kib added inline comments.Dec 21 2021, 12:48 PM

sys/amd64/include/md_var.h
37 ↗	(On Diff #100412)	Don't do this.
sys/amd64/linux/linux.h
236	https://github.com/torvalds/linux/blob/6e0567b7305209c2d689ce57180a63d8dc657ad8/arch/x86/kernel/signal.c#L707 This is how signal frame for amd64 is described in the Linux sources comment, and it seems, after, some reading, indeed matching the code. Instead of corrupting l_ucontext definition, please implement it properly. In essence, the key part is the alignment/padding between ucontext and start of the fpu save area (our and Linux' should be compatible).

trasz added inline comments.Dec 21 2021, 2:26 PM

sys/amd64/linux/linux.h
236	The reason I wanted to avoid that is that this essentially requires Linux-specific variant of {get,set}_fpcontext(). Or are you thinking of some other way?

kib added inline comments.Dec 21 2021, 2:58 PM

sys/amd64/linux/linux.h
236	It is not, I believe (I did not do experiments to say it definitely). Their save area is compatible with ours, they just align it.

trasz added inline comments.Dec 27 2021, 6:24 PM

sys/amd64/linux/linux.h
236	So, I've just looked again into this, and I don't think this is practical. We can't copy FPU context from native mcontext_t into l_sigcontext, because there's no room: l_sigcontext only contains a pointer to FPU data (pointer to fpstate), not fpstate itself; looks like Linux simply puts the FPU data on the stack after l_ucontext: https://github.com/torvalds/linux/blob/6e0567b7305209c2d689ce57180a63d8dc657ad8/arch/x86/kernel/signal.c#L192. From what I understand, the whole idea of xsave is that its length is not fixed, so what's the problem with stashing our native mcontext_t there?

Also, for a more general question: why even bother? Does userspace actually access that data, or otherwise make assumptions about it? For arm64 we don't even bother with using linux_specific stack layout, instead we use native FreeBSD mcontext_t and it works fine.

In D33599#760765, @trasz wrote:

Also, for a more general question: why even bother? Does userspace actually access that data, or otherwise make assumptions about it? For arm64 we don't even bother with using linux_specific stack layout, instead we use native FreeBSD mcontext_t and it works fine.

Because userspace does look at that save areas. I think you would get very hard to diagnose issues with runtimes of JVM/Go/any other advanced language.

sys/amd64/linux/linux.h

236

There is so called fixed area in XSAVE area that contains always present x87 FPU registers, and %XMM and related SSE2 state. Linux ucontext provides fixed locations for these registers:

typedef struct ucontext_t
  {
    unsigned long int __ctx(uc_flags);
    struct ucontext_t *uc_link;
    stack_t uc_stack;
    mcontext_t uc_mcontext;
    sigset_t uc_sigmask;
    struct _libc_fpstate __fpregs_mem;
    __extension__ unsigned long long int __ssp[4];
  } ucontext_t;

And

typedef struct
  {
    gregset_t __ctx(gregs);
    /* Note that fpregs is a pointer.  */
    fpregset_t __ctx(fpregs);
    __extension__ unsigned long long __reserved1 [8];
} mcontext_t;

struct _libc_fpstate
{
  /* 64-bit FXSAVE format.  */
  __uint16_t            __ctx(cwd);
  __uint16_t            __ctx(swd);
  __uint16_t            __ctx(ftw);
  __uint16_t            __ctx(fop);
  __uint64_t            __ctx(rip);
  __uint64_t            __ctx(rdp);
  __uint32_t            __ctx(mxcsr);
  __uint32_t            __ctx(mxcr_mask);
  struct _libc_fpxreg   _st[8];
  struct _libc_xmmreg   _xmm[16];
  __uint32_t            __glibc_reserved1[24];
};

I just looked into the filled context: uc->uc_mcontext.fpregs points to uc->__fpregs_mem.mxcsr.

_libc_fpstate.st[] are x87 %st registers. _libc_fpstate._xmm are %xmm registers. Basically you need to copy the same names from our fpusave area to linux' one.

This is ignoring XSAVE, i.e. AVX and larger extensiones. Lets do the FXSAVE right first.

Get rid of embedded mcontext_t; instead put FP state into Linux sigcontext_t.
Work in progress, as this still uses old method of handling XSAVE.

Harbormaster completed remote builds in B43633: Diff 100680.Dec 28 2021, 11:39 PM

trasz marked an inline comment as done.Dec 28 2021, 11:39 PM

trasz edited the summary of this revision. (Show Details)

I suggest you to remove XSAVE handling. Doing it as a followup would be simpler and cleaner.

sys/amd64/linux/linux_sysvec.c
713	Move all these CTASSERTs() to the top level. And perhaps use _Static_assert().

Axe xsave for now, only call set_fpcontext if we have fpcontext.

Harbormaster completed remote builds in B43636: Diff 100705.Dec 29 2021, 3:24 PM

trasz added inline comments.Dec 29 2021, 3:26 PM

sys/amd64/linux/linux_sysvec.c
713	How strongly you want it? Personally I'd prefer those like they are now; when you look at the memcpy the question is "how do we know those are of equal size, and CTASSERsT just above answers that. It also makes it possible to use variable names instead of types. Also, _Static_assert requires a text message, and that doesn't really add anything here except clutter.

Globalization of get/set_mcontext() should be committed separately.

Overall it seems to be good.

sys/amd64/linux/linux_sysvec.c
713	I prefer such asserts around definitions, and not scattered randomly in a code that uses the type.
728	This is unrelated, please commit separately.

trasz marked an inline comment as done.Jan 4 2022, 2:46 PM

trasz added inline comments.

sys/amd64/linux/linux_sysvec.c
713	Normally I prefer that too, but I feel this case is different: it's not that any of those two structures _must_ be of that size; it's just that this particular piece of code is making that assumption.

Regen after committing unrelated parts.

Harbormaster completed remote builds in B43695: Diff 100924.Jan 4 2022, 2:47 PM

kib added inline comments.Jan 4 2022, 11:51 PM

sys/amd64/linux/linux_sysvec.c
713	I do not agree, these are arrays of the register' content.
717	Space before '='. The comment is right, the code is wrong.

trasz added inline comments.Feb 14 2022, 1:32 PM

sys/amd64/linux/linux_sysvec.c
713	Okay, let's say those assertions are not here and you see this code - in particular this memcpy() - for the first time. Assertion at the top, assuming you find it, won't detect that I got the types mixed up, and that's probably the number one thing I want this assertion for: there's no point in asserting that a 32-element array indeed is 32-elements long; I want to assert that _this_ particular array - identified by its name, not type - is what I expect it to be.

Fix sc_fpstate.

Harbormaster completed remote builds in B44435: Diff 102743.Feb 14 2022, 1:33 PM

kib added inline comments.Feb 14 2022, 8:39 PM

sys/amd64/linux/linux_sysvec.c
713	Static asserts only state something about data layout, they are not about code being executed.

trasz added inline comments.Feb 15 2022, 12:41 PM

sys/amd64/linux/linux_sysvec.c
713	In this case they are about the code being read by human. They say "this memcpy is ok, because both sizes match". Without them, that's not clear.

done

linux(4): Preserve floating-point on signal delivery on amd64
AbandonedPublic
Actions

Details

Diff Detail

Event Timeline

Revision Contents
Changeset List

Diff 102743

sys/amd64/linux/linux.h

sys/amd64/linux/linux_sysvec.c

linux(4): Preserve floating-point on signal delivery on amd64AbandonedPublicActions

Details

Diff Detail

Event Timeline

Revision ContentsChangeset List

Diff 102743

sys/amd64/linux/linux.h

sys/amd64/linux/linux_sysvec.c

linux(4): Preserve floating-point on signal delivery on amd64
AbandonedPublic
Actions

Revision Contents
Changeset List