Page MenuHomeFreeBSD

[POWERPC] Handles float point exception
ClosedPublic

Authored by alfredo on Feb 11 2020, 6:33 PM.
Referenced Files
F107281863: D23623.id79240.diff
Sat, Jan 11, 10:47 PM
Unknown Object (File)
Wed, Dec 25, 7:50 PM
Unknown Object (File)
Wed, Dec 25, 9:20 AM
Unknown Object (File)
Wed, Dec 25, 9:19 AM
Unknown Object (File)
Sat, Dec 21, 7:14 AM
Unknown Object (File)
Fri, Dec 13, 11:48 PM
Unknown Object (File)
Fri, Dec 13, 8:47 PM
Unknown Object (File)
Dec 11 2024, 3:41 AM

Details

Summary

Enables FPE on ppc by default. Note: user application still needs to set, with feenableexcept() , which exception will trigger a SIGFPE.

Test Plan

On both 32 and 64 bits environments, the last lines of test program bellow should be like:

Handler called 5 times. Caught signal 8
si_signo 8
si_code 3
si_addr 0x0x100110a0
Caught SIGFPE due to FPE_FLTDIV, as expected
Clearing exceptions
## handler called too many times, disabling FP exceptions ##
y = inf
Exited normally.

Test code:

#include <fenv.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define CNT_MAX 5
int cnt = 0;

void handler(int sig, siginfo_t *info, void *uap)
{
	cnt++;

        printf("Handler called %d times. Caught signal %d\n", cnt, sig);
        printf("si_signo %d\n", info->si_signo);
        printf("si_code %d\n", info->si_code);
        printf("si_addr 0x%p\n", info->si_addr);

        if (info->si_signo == SIGFPE  &&
                        info->si_code == FPE_FLTDIV) {
                printf("Caught SIGFPE due to FPE_FLTDIV, as expected\n");
        } else
                printf("Caught unexpected signal.\n");

	printf("Clearing exceptions\n");
	feclearexcept(FE_ALL_EXCEPT);

	if (cnt >= CNT_MAX) {
		printf("## handler called too many times, disabling FP exceptions ##\n");
		fedisableexcept(FE_INVALID | FE_DIVBYZERO | FE_OVERFLOW  | FE_UNDERFLOW);
		if (cnt > CNT_MAX + 5) {
			printf("ERROR!! Didn't stop triggering exceptions!\n");
			exit(1);
		}
	}


}

int main(void)
{
        struct sigaction sa;

        sa.sa_sigaction = handler;

        sigemptyset(&sa.sa_mask);
        sa.sa_flags = SA_SIGINFO; /* Restart functions if
                                 interrupted by handler */

        if (sigaction(SIGFPE, &sa, NULL) == -1)
        {
                printf("error\n");
        }

        int mask = feenableexcept(FE_INVALID   |
                   FE_DIVBYZERO |
                   FE_OVERFLOW  |
                   FE_UNDERFLOW);
        printf("mask: %x\n", mask);

        volatile double x = 0.0;
        volatile double y = 1.0 / x;
        printf("y = %f\n", y);
	if (cnt == CNT_MAX) {
		printf("Exited normally.\n");
	}
	else {
		printf("ERROR!!!!\n");
	}
        return y;
}

Diff Detail

Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 29468
Build 27342: arc lint + arc unit

Event Timeline

sys/powerpc/include/psl.h
49

This belongs in trap.hh, as EXC_PGM_FPE, or similar name, since it's part of SRR1, not MSR.

Actually, I think this really should belong in spr.h, as SRR1_PGM_FPE, and the other EXC_PGM* #defines moved there, too. But that's not how it's done now, so just put it in trap.h.

Looks good.
I just have some style related comments and 2 questions about the changes.

sys/powerpc/include/psl.h
91

Do we want to enable FP exceptions by default?

sys/powerpc/powerpc/trap.c
400โ€“424

This brace should be in the line above.
See style(9).

406

Variable declarations should be all in the beginning of a block, separated of other statements by a new line, and function call initializers should not be used (e.g. mfmsr()).
See style(9).

426

I may be missing something, but why is enable_fpu() needed here?

renato.riolino_eldorado.org.br added inline comments.
sys/powerpc/include/psl.h
91

It's still not "enabled". By default none of the exception enabled registers are enable. User must call feenableexcept() if he wants to receive a FP exception.

sys/powerpc/powerpc/trap.c
426

enable_fpu() loads the floating point registers from the PCB. Without it the application can't capture the SIGFPE.

Fixes pointed by jhibbits and luporl and a test of endianess when checking the type of fp exception.

sys/powerpc/powerpc/trap.c
402โ€“413

I think this whole union could be replaced by an uint64_t.
The only part where you use it as a double is in the inline assembly, which should work fine with uint64_t too.
Then you wouldn't need to worry about the endianness too.
(BTW, currently all kernel PPC code assume BE, but starting to take LE into account is nice).

Using a uint64_t instead of a union as suggested by luporl

sys/powerpc/powerpc/trap.c
413

This block won't work as you think. FPE_FLTDIV, et al, are integers, not bitfields. Only one reason is allowed.

sys/powerpc/powerpc/trap.c
424

Is there a paired save_fpu() anywhere? Otherwise we're overwriting the FPRs with whatever was last saved, which could be nothing.

renato.riolino_eldorado.org.br marked 2 inline comments as done.

Treating fpe exceptions as integers instead of bitfields and calling save_fpu() when detected the signal as been FPE.

Fixed order for calling save_fpu() and enable_fpu()

sys/powerpc/powerpc/trap.c
400โ€“424

I don't think this is quite right. We need to be able to pass up SIGFPE on emulated FPU as well.

renato.riolino_eldorado.org.br marked an inline comment as done.

Passing SIGFPE on emulated FPU as well.

sys/powerpc/powerpc/trap.c
404โ€“408

This will panic with an illegal instruction on cores with no FPU. Isn't the FPSCR already available?

I think this should also probably go into a FPU-specific header (machine/fpu.h, maybe even powerpc/fpu.c for implementation), so that it's easy to also use other FPU interfaces, such as the SPE for powerpcspe arch.

Moved exception type check from trap.c to fpu.c

sys/powerpc/powerpc/exec_machdep.c
1165

Shouldn't this be a save_fpu() instead?

sys/powerpc/powerpc/fpu.c
227 โ†—(On Diff #69111)

I think there's a misunderstanding here:

  • We need to support SIGFPE for both emulated and physical FPU. This check only handles physical.
  • 'sig' in exec_machdep.c is set by fpu_emulate(), otherwise it's set directly by ppc_instr_emulate(), and the FPU registers are already updated. Therefore
  • We already have in the PCB the fpscr from above. There's no need to extract it again here.

I'm going to continue with revision and tests. Thank you @renato.riolino_eldorado.org.br for the work investigating it and providing the patch!

alfredo marked 2 inline comments as done.
alfredo edited the test plan for this revision. (Show Details)

Updated after jhibbits comments.
I don't think this patch is ready. Test shows signal being delivered to application, but I cannot catch/handle it. Need more tests.

alfredo edited the test plan for this revision. (Show Details)

code refactory (still in progress)

alfredo added a subscriber: afscoelho_gmail.com.

Code refactored, FPE exception is now identified without instruction emulate
Cleared ssr1 bits FE0 and FE1 makes user signal handler work instead of trap with SIGILL , but I don't think it's definitive solution.

system instability doesn't occur on bare metal (confirmed by @afscoelho_gmail.com ). It might point to a bug in QEMU msr handling

I can tell you off the top of my head: Yes, qemu MSR handling *is* busted (at least in TCG) -- mtmsr(d) in qemu does not do anything remotely close to what the ISA describes it as. It just immediately whacks the entire value into the MSR, no bit masking no nothing.

I encountered this myself working on the LE stuff. As you can imagine, the endianness suddenly changing in between instructions because I was relying on some of the bits actually being ignored was... problematic to say the least.

I had to work around this on my end with hacks like https://github.com/freebsd/freebsd/commit/4a8fe4362a432ba8ddeb0f13148fe144a26e7b2a -- if there are any MSR bits you rely on that are only supposed to be updated during rfi(d), you will need to ensure not to change them prematurely with qemu.

alfredo edited the test plan for this revision. (Show Details)

Fix "random process" crash due to leaked FPSCR bits, code cleanup and updated test plan example.

Is there any more work needed on this?

Added cleanup of FPSCR after saving fpu state

Is there any more work needed on this?

I did one more change: in trap.c moved cleanup_fpscr() that was after get_fpu_exception() to save_fpu(), so kernel will continue with a clean FPSCR after this point.
Final behavior look the same, but any input is welcome.

alfredo edited the test plan for this revision. (Show Details)

It should be good now. User is able to change floating point stat inside handler if wish.
Updated test program to exercise trap reocorring a few times before disabling exceptions inside handler and letting program to finish "gracefully".

alfredo edited the test plan for this revision. (Show Details)

Add support for signal handling on 32 bits

This revision is now accepted and ready to land.Nov 6 2020, 5:13 AM
This revision was automatically updated to reflect the committed changes.