The modifier executes the operation using msr{read,write}_safe()
functions instead of plain msr reads and writes. Returns EFAULT if any
MSR access caused #GP.
x86_msr_op(9): consistently return the value read from MSR
x86_msr_op_one((9): execute SCHED ops in critical section
x86: provide extended description for x86_msr_op(9)Details
Diff Detail
- Repository
- rG FreeBSD src repository
- Lint
Lint Skipped - Unit
Tests Skipped
Event Timeline
Minor clarifications to the description.
I do not think that converting the comment into a man page is warranted.
I'm again not sure if this is the direction we want to go. As x86_msr_op() is about a single (read or read-modify-write) operation, it has a very short execution time. Not only using sched_bind() seems a big hammer for that (it has higher overhead), but also depending on the priority of the calling thread execution might be delayed a significant amount of time (with a real time thread already executing there, possibly indefinitely). To me, and IIRC one of our discussions for markj@ as well, the direction was more to deprecate using sched_bind() for this purpose, eventually retiring all MSR_OP_SCHED_* variants, whereas that patch *adds* new ones.
And, for a safe variant to work with MSR_OP_RENDEZVOUS*, then there's no other choice than to be able to take a fault on MSR access while interrupts are disabled, so we are back to D54996's discussion. I'm not sure why you mentioned GSBASE in the discussion; I read some of the interrupt handling assembly code long ago, and although I don't recall well I thought that the dances around this were already performed there, before we even reach trap(). And why faulting from controlled addresses, as is the case for rdmsr_safe() or wrmsr_safe(), even with interrupts disabled, would be wrong?
This discussion is really wider than just the MSR_OP_* topic, as we are already using *msr_safe() in callbacks triggered by smp_rendezvous() in at least hwpstate_amd(4).
In addition, for the potential use cases I'm seeing in hwpstate_amd(4) (but don't want to insist too much on this example, since we are using custom handlers which are probably not replaceable: We are rarely accessing a single MSR at a time, and when we modify a single one, we start from some cached value that we want to access from the target CPU; additionally, execution on some target CPU with interrupts disabled is also used as a mean to atomically modify the softc (a critical section would suffice though)), I'd like to:
- Use the rdmsr_safe() and wrmsr_safe() variants always.
- When specifying a read-modify-write operation, on failure, know if it's the read or only the write that failed (this may prove important for MSR that do not tolerate some values).
It's really a good idea that it is possible to retrieve back the MSR value before it is modified. It would be a requirement for hwpstate_amd(4).
Yes, IMO it is almost always wrong to use sched_bind() for short sections of code. I lost count of the number of times I had to debug problems caused by this pattern, but the first one was related to commit e42412a96695d0e9116de5c6ac75823d2711218d. For similar reasons, quiesce_all_cpus() should never be used either.
In D54996 there is a claim that FreeBSD is a realtime kernel, but this is not true in general. If a user thread binds to a specific CPU, there is no guarantee it will make progress. So if we want to have sysctl handlers which toggle unsafe MSRs, then either:
- we should tolerate #GP with interrupts disabled (which we already do I believe), or
- we need the scheduler to provide a hard guarantee that a thread bound to a specific CPU will be scheduled "soon", even if there is some high-priority realtime-class thread running on the CPU
| sys/x86/x86/cpu_machdep.c | ||
|---|---|---|
| 285 | I don't really understand--why not? Can't we handle #GP with interrupts disabled? | |