Page MenuHomeFreeBSD

Fix for mis-interpretation of PCB_KERNFPU.
ClosedPublic

Authored by kib on Jun 29 2020, 10:18 PM.
Tags
None
Referenced Files
F108543570: D25511.id73888.diff
Sun, Jan 26, 3:38 AM
Unknown Object (File)
Fri, Jan 17, 1:33 PM
Unknown Object (File)
Dec 24 2024, 7:36 PM
Unknown Object (File)
Dec 11 2024, 5:03 PM
Unknown Object (File)
Nov 3 2024, 12:33 PM
Unknown Object (File)
Sep 28 2024, 5:12 AM
Unknown Object (File)
Sep 7 2024, 8:22 PM
Unknown Object (File)
Sep 5 2024, 7:04 PM
Subscribers

Details

Summary

RIght now PCB_KERNFPU is used both as indication that kernel prepared hardware FPU context to use and that the thread is fpu-kern thread. This also breaks fpu_kern_enter(FPU_KERN_NOCTX), since fpu_kern_leave() then clears PCB_KERNFPU.

Introduce new flag PCB_KERNFPU_THR which indicates that the thread is fpu-kern. Do not clear PCB_KERNFPU if fpu-kern thread leaves noctx fpu region.

Reported by: jhb

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

kib requested review of this revision.Jun 29 2020, 10:18 PM

Do we need a similar change in arm64 and i386?

Do we need a similar change in arm64 and i386?

For arm64 yes. For i386 there is no NOCTX.
But lets first confirm that it fixes reported issue without new regressions.

I was finally testing this today but couldn't remember which assertion I tripped over before or how to trigger it again. I believe I just had the fpu_kern_enter() unconditional, but I can't find something that panics and dies when the software crypto threads lose the PCB_KERNFPU flag.

sys/amd64/amd64/fpu.c
1131 ↗(On Diff #73888)

Alternatively I guess you could just add this here:

if (fpu_is_kern_thread(0)) {
    critcal_exit();
    return (0);
}

I'm not sure which is clearer. It's not really clear to me why we really have the 'KTHR' flag vs having fpu_kern_enter() with a valid ctx always assume the KTHR flag.

I believe it was fpu_kern_enter()/leave() pair in fpu_kern_thread that caused the grief, indeed.

sys/amd64/amd64/fpu.c
1131 ↗(On Diff #73888)

Even for fpu_kern_thread() I maintain the nesting of fpu save area. I.e. you can enter in fpu_kern_thread() and use a new allocated ctx. Similarly, PCB_FPUNOSAVE even for fpu_kern_thread() implies that code promised to not contaminate current save area.

I did figure out a way to test this. Netflix has some software crypto kernel modules that assume they run in an fpu_kern_thread thread context so they don't use fpu_kern_enter. Running one of those after using a driver that used FPU_KERN_NOCTX and thus cleared PCB_KERNFPU was enough to provoke a fpudna panic. This patch does fix the panic.

This revision was not accepted when it landed; it landed in state Needs Review.Oct 14 2020, 11:01 PM
This revision was automatically updated to reflect the committed changes.