Page MenuHomeFreeBSD

riscv cpu_fork(), saving fpe state
ClosedPublic

Authored by jsihv_gmx.com on Nov 18 2025, 3:18 PM.
Tags
Referenced Files
Unknown Object (File)
Wed, Mar 18, 1:21 PM
Unknown Object (File)
Tue, Mar 17, 9:48 PM
Unknown Object (File)
Tue, Mar 17, 10:42 AM
Unknown Object (File)
Mon, Mar 9, 3:41 AM
Unknown Object (File)
Sat, Mar 7, 5:45 AM
Unknown Object (File)
Mon, Mar 2, 2:58 PM
Unknown Object (File)
Mon, Mar 2, 9:13 AM
Unknown Object (File)
Sun, Mar 1, 9:12 PM
Subscribers

Details

Summary

This is an attempt to replace TODO in riscv's cpu_fork(). The code updates the floating point state before it proceeds copying the pcb struct.

pcb_fp_started flag is used as a condition as in a corresponding arm64 code and eg. in riscv's exec_machdep.c:fill_fpregs(). It indicates that fp is in use. I have verified with a debug code that when PCB_FP_STARTED is off, pcb_x registers are always zero.

A corresponding function in arm64 has a condition (td1 == curthread) for saving the fp state (and amd64 has it in MPASS).
arm64 added this condition here: https://github.com/freebsd/freebsd-src/commit/2db317ca8577a0b208ecae38ecfa4a8a690a0fa8
However, so far I have not found out in which kind of situation td1 might not be curthread. I've put a such condition (td1 != curthread) to my debug code in order to see if this condition is ever true but after booting and making some forking as a user, the condition has not ever been true. I also asked about it on our IRC channel and nobody seemed to know what's the meaning of this condition. So I have not included this condition here.

critical_enter() and critical_exit() are used here similarly as arm64 and amd64 use it in corresponding places to prevent a thread being preempted when thread's state is being saved.

fpe_state_save() leads to an assembly code in swtch.S which contains instructions which are equivalent to fpe_enable() and fpe_disable() and these calls are therefore not used here.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

if td1 is always curthread, then a kassert is needed

looks like td comes from ecall_handler(), which is curthread

jsihv_gmx.com added a reviewer: riscv.

I added MPASS macro following the example of AMD64's code in a corresponding place. I verified that the execution always visits ecall_handler() before it arrives here.

This revision is now accepted and ready to land.Fri, Feb 27, 9:41 AM
This revision was automatically updated to reflect the committed changes.

I am sorry to speak up only now, after the review sat open for so long. But in looking over the change it seems to be incomplete.

sys/riscv/riscv/vm_machdep.c
177

Looking at other implementations, we need to save the FPE context here as well.

This one is invoked when creating a new thread in the same process, while above is invoked when forking/creating a new process.

sys/riscv/riscv/vm_machdep.c
177

AMD64 & PowerPC seem to be like what you say but it looks like ARM64 does not save FP registers in cpu_copy_thread()

ARM64 calls vfp_new_thread() in both functions but this function just makes some modifications to a new thread _after_ it has been copied. In the case of AMD64 these both functions cpu_copy_thread() and cpu_fork() share a lot of common code in copy_thread() which calls fpuexit() which saves (after some further calls) fp-registers and thus corresponds the new code in this patch. PowerPC seems to be similar to AMD64. Both functions call cpu_update_pcb() which saves fp-registers to pcb.

So a question can be asked, do AMD64 and PowerPC do this just because their functions share a common function call together or do these differences (AMD64 & PowerPC vs. ARM64) reflect more genuine differences between these archs.

sys/riscv/riscv/vm_machdep.c
177

So, comments in ARM64's vm_machdep.c and differential (2db317c, April 2015) explain that cpu_fork() updates floating point registers to pcb because if the userland changes float values and then forks, it's possible to proceed to cpu_fork() without first visiting cpu_switch (which stores the registers). So ARM64 developers probably think this issue does not matter in the case of cpu_copy_thread(). But it would be good to get it confirmed ...and also if there's differences between archs here.