x86_msr_op: extend the KPI to allow MSR read and single-CPU operations
Diff Detail
- Repository: rG FreeBSD src repository
- Lint: Not Applicable
- Unit Tests: Not Applicable
Event Timeline
I have no objections if you prefer it this way, but to me this x86_msr_op() KPI looks over-engineered.
It grew according to the requirements of the callers, mostly from the mitigation department. Initially, the code used manual rendezvous invocations, similar to the coretemp instance. Once there were around ten cases, each slightly different, it became worth having a clumsy KPI that eliminated 20-30 lines of code from each invocation.
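To make the "20-30 lines per invocation" point concrete, here is a minimal before/after sketch of such a caller. The handler body is illustrative, and the x86_msr_op() flag and signature used here (MSR_OP_RENDEZVOUS_ALL | MSR_OP_OR, with a result pointer as the last argument) are my reading of the extended KPI, not verbatim from the patch:

```c
#include <sys/param.h>
#include <sys/smp.h>
#include <machine/cpufunc.h>
#include <machine/specialreg.h>
#include <x86/x86_var.h>

/*
 * Before: each mitigation toggle open-coded its own rendezvous
 * handler, each slightly different from the next.
 */
static void
spec_ctrl_set_ssbd(void *arg __unused)
{
	wrmsr(MSR_IA32_SPEC_CTRL,
	    rdmsr(MSR_IA32_SPEC_CTRL) | IA32_SPEC_CTRL_SSBD);
}

static void
enable_ssbd_open_coded(void)
{
	smp_rendezvous(NULL, spec_ctrl_set_ssbd, NULL, NULL);
}

/*
 * After: the same all-CPU bit-set collapses into a single call
 * to the KPI.
 */
static void
enable_ssbd_via_kpi(void)
{
	x86_msr_op(MSR_IA32_SPEC_CTRL,
	    MSR_OP_RENDEZVOUS_ALL | MSR_OP_OR, IA32_SPEC_CTRL_SSBD, NULL);
}
```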
sys/x86/x86/cpu_machdep.c:194–195
The pattern of binding to another CPU to access some per-CPU resource is often problematic, and we should discourage it, IMO. There is one user of MSR_OP_SCHED_ALL, and it's susceptible to the same problem as coretemp if one runs cpucontrol -e. Why does it not use IPIs?
I get that. And in its existing shape, the KPI looks fairly consistent: it applies the same bit-mask operation to the MSRs of all CPUs. I am just not sure (I have not looked closely) why it needs both the binding and the rendezvous mechanisms. Adding a single-CPU specification, and reading (which implies a CPU specification), looks alien there to me.
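For concreteness, the IPI-based alternative suggested above would look roughly like this; read_msr_one(), msr_read_action(), and struct msr_read_arg are hypothetical names for this sketch:

```c
#include <sys/param.h>
#include <sys/cpuset.h>
#include <sys/smp.h>
#include <machine/cpufunc.h>

struct msr_read_arg {
	u_int		msr;
	uint64_t	val;
};

/* Runs on the target CPU in rendezvous (IPI) context. */
static void
msr_read_action(void *xarg)
{
	struct msr_read_arg *arg = xarg;

	arg->val = rdmsr(arg->msr);
}

/* Read an MSR on one specific CPU without migrating the caller. */
static uint64_t
read_msr_one(u_int msr, int cpu)
{
	struct msr_read_arg arg = { .msr = msr, .val = 0 };
	cpuset_t set;

	CPU_SETOF(cpu, &set);
	smp_rendezvous_cpus(set, smp_no_rendezvous_barrier,
	    msr_read_action, smp_no_rendezvous_barrier, &arg);
	return (arg.val);
}
```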
sys/x86/x86/cpu_machdep.c:194–195
Binding is problematic under heavy load and high-frequency access, so it is reasonable to change coretemp to use rendezvous; but there are MSR accesses that require a threading context. I am not sure why SSBD was done with binding rather than rendezvous (probably it just happened that way), but I am against removing the binding interface. For instance, cpuctl(4) can and should be moved to x86_msr_op(), and safe MSR access does require a threading context. Let's postpone the SSBD conversion; I will examine it and change it to rendezvous if there was no reason to use binding, but that would be a separate change anyway.
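The threading-context requirement can be illustrated with the binding pattern. This is a sketch of a cpuctl(4)-style safe read, not the driver's actual code: as I understand it, rdmsr_safe() recovers from the #GP raised by a non-existent MSR via the pcb_onfault mechanism, which is why the access is made from a bound thread rather than from a rendezvous handler:

```c
#include <sys/param.h>
#include <sys/proc.h>
#include <sys/sched.h>
#include <machine/cpufunc.h>
#include <x86/x86_var.h>

/* Safely read an MSR on a given CPU from a sleepable thread context. */
static int
safe_read_msr_on_cpu(u_int msr, int cpu, uint64_t *val)
{
	int error;

	/* Migrate the current thread onto the target CPU. */
	thread_lock(curthread);
	sched_bind(curthread, cpu);
	thread_unlock(curthread);

	error = rdmsr_safe(msr, val);	/* nonzero if the read faulted */

	thread_lock(curthread);
	sched_unbind(curthread);
	thread_unlock(curthread);
	return (error);
}
```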
sys/x86/x86/cpu_machdep.c:225
For at least this case, I think you can correctly specify smp_no_rendezvous_barrier; otherwise the target will needlessly perform extra atomic increments. Maybe smp_rendezvous_cpus() should micro-optimize that case internally.
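In code, the suggestion amounts to the following; the action handler is a hypothetical stand-in for the KPI's internal one. smp_rendezvous_cpus() treats smp_no_rendezvous_barrier specially and skips the barrier for that slot, whereas passing NULL still makes the target synchronize:

```c
#include <sys/param.h>
#include <sys/cpuset.h>
#include <sys/smp.h>

/* Hypothetical stand-in for the KPI's internal action handler. */
static void
msr_op_action(void *arg)
{
	/* ... perform the MSR operation described by arg ... */
}

static void
msr_op_on_cpu(int cpu, void *arg)
{
	cpuset_t set;

	CPU_SETOF(cpu, &set);
	/*
	 * smp_no_rendezvous_barrier in the unused setup and teardown
	 * slots skips those barriers entirely; NULL would still make
	 * the target CPU perform the barrier's atomic increments with
	 * nothing to run.
	 */
	smp_rendezvous_cpus(set, smp_no_rendezvous_barrier, msr_op_action,
	    smp_no_rendezvous_barrier, arg);
}
```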