Page MenuHomeFreeBSD

arm64: Fix kernel panic in get_arm64_sve during core dump
AcceptedPublic

Authored by wac_gmail.com on Mon, Jan 5, 6:17 PM.
Tags
None
Referenced Files
F142636013: D54532.id169936.diff
Wed, Jan 21, 5:33 PM
Unknown Object (File)
Tue, Jan 20, 3:28 AM
Unknown Object (File)
Sun, Jan 18, 3:55 AM
Unknown Object (File)
Sat, Jan 17, 5:35 PM
Unknown Object (File)
Sat, Jan 17, 12:19 PM
Unknown Object (File)
Sat, Jan 17, 8:00 AM
Unknown Object (File)
Sun, Jan 11, 10:26 PM
Unknown Object (File)
Sat, Jan 10, 5:21 AM
Subscribers

Details

Reviewers
andrew
manu
Summary

The coredump logic calls get_arm64_sve twice: once to get the note size, and once to get the data. The note size calculation depended on the volatile PCB_FP_SVEVALID flag. If this flag was cleared between the two calls (e.g., due to a context switch clearing the flag to comply with the ABI), the second call would expect a smaller buffer size than the first, triggering a KASSERT panic ("invalid size").

Fix this by ensuring the SVE state is saved to the PCB before we decide whether to use SVE or VFP.

PR: 292195

Test Plan

Triggered core dump repro from bug 292177 and observed a good core file and no kernel panic.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

Would it be enough to move the td == curthread .. check & call to vfp_save_state before checking for PCB_FP_SVEVALID in line 942?

I don't think moving vfp_save_state(td) is sufficient because of lazy SVE restoration. If the thread hasn't executed SVE instructions since a context switch, the trap to re-enable the unit won't have fired. In that state, vfp_save_state(td) sees the unit as disabled and won't set PCB_FP_SVEVALID and we'll still potentially see this panic. Checking pcb_svesaved != NULL avoids this by checking the backing store directly.

After the first call to vfp_save_state from get_arm64_sve with buf == NULL the PCB_FP_SVEVALID flag will be set correctly based on if pcb_svesaved contains valid data or not (it may be clear if called while in a syscall). If we then context switch then vfp_save_state_switch will exit early as VFP is disabled so PCB_FP_SVEVALID isn't changed & later calls to vfp_save_state will also not change the state.

It's ok to drop the SVE state if the thread is in a system call (other than sigreturn) as the ABI doesn't guarantee it will remain valid.

I tested a patch to move vfp_save_state & haven't hit the KASSERT, even when loading the system down to try cause more context switching.

Incorporated your suggested change just moving the vfp_save_state and tested successfully. Thank you for your patience on this.

Can you send ma a git patch with this change? e.g. the output of git format-patch

This revision is now accepted and ready to land.Mon, Jan 19, 12:28 PM