Details
- Reviewers
- markj 
- Commits
- rG2517b2085b58: kinst: use per-probe trampolines in riscv
Diff Detail
- Repository
- rG FreeBSD src repository
- Lint
- Lint Skipped 
- Unit
- Tests Skipped 
- Build Status
- Buildable 52633 - Build 49524: arc lint + arc unit 
Event Timeline
| sys/cddl/dev/kinst/riscv/kinst_isa.c | ||
|---|---|---|
| 17 | What is a "per-CPU probe"? | |
| 295 | This should be done using the DPCPU macros, same as intr_probe. | |
| 298–299 | What if the cached value of sstatus is 0? Then the code doesn't distinguish between two different cases. It would be better to have an explicit state machine. Try defining a per-CPU struct: struct kinst_cpu_state {
    enum {
        KINST_PROBE_ARMED,
        KINST_PROBE_FIRED,
    } state;
    struct kinst_probe *kp;
    register_t sstatus;
};When a probe is enabled, it enters state ARMED. When it fires, kinst_invop() calls dtrace_probe() and sets the state to FIRED. If single-stepping is done using a trampoline, then the second call to kinst_invop() will start in state FIRED, and there it will go back to ARMED. If single-stepping is done using emulation, then the state goes immediately from FIRED back to ARMED. In particular, kinst_invop() should be simpler. Now that each probe has its own trampoline, and trampolines are executed with interrupts disabled, kinst_invop() doesn't need to check SPIE, and I think t_kinst_curprobe isn't needed. | |
| 553 | Where does this trampoline get freed? | |
| sys/cddl/dev/kinst/riscv/kinst_isa.c | ||
|---|---|---|
| 17 | It is the per-CPU equivalent of t_kinst_curprobe. I noticed that the kernel was crashing if I tried to fetch | |
| 553 | In kinst_unload() when it calls kinst_trampoline_deinit(). | |
| sys/cddl/dev/kinst/riscv/kinst_isa.c | ||
|---|---|---|
| 298–299 | Since amd64 is going to eventually also use the double-breakpoint mechanism, and arm64 will need this as well, do you think we should add the struct definition to kinst.h and replace register_t sstatus with a generic register_t field? | |
| sys/cddl/dev/kinst/riscv/kinst_isa.c | ||
|---|---|---|
| 17 | But what exactly is the issue? If the trampoline is executed with interrupts disabled, there should be no way to re-enter kinst_invop() while executing a trampoline. | |
| 298–299 | I don't have a strong opinion on whether the implementation should be per-platform or generic. It's not a lot of code either way. I just find the current implementation hard to follow and believe it should be simpler. | |
| 553 | It should be freed when the probe is destroyed, if possible. Otherwise the memory will remain allocated even when the trampoline is unused. | |
This is looking good, just minor comments.
| sys/cddl/dev/kinst/riscv/kinst_isa.c | ||
|---|---|---|
| 297 | We can and should declare kp const. In particular, multiple CPUs might be sharing a reference to a single probe; we should make sure that kinst_invop() doesn't mutate any state in the probe itself. | |
| 329 | I don't see a point in updating the state only to update it again a few lines below. This can just go after the emulate check. | |
| 336 | We don't need to handle this case anymore, now that trampoline allocation is handled at the time that probes are enabled. | |
| 553 | You need to check kp_tramp == NULL and return ENOMEM if it happens. | |
| 573 | I think you can DPCPU_DEFINE a struct, there's no need for dynamic allocation. Then use DPCPU_PTR() to get a pointer to that struct. | |
| sys/cddl/dev/kinst/riscv/kinst_isa.c | ||
|---|---|---|
| 349 | Sorry, I don't understand - why do we need to populate the trampoline upon every call to kinst_invop()? It needs to be done only once, at the time the probe is created. | |
I'm happy with this change with the one comment addressed.
| sys/cddl/dev/kinst/riscv/kinst_isa.c | ||
|---|---|---|
| 573 | This comment still applies. My point is that you're introducing an unnecessary layer of indirection here. You should be able to write DPCPU_DEFINE_STATIC(struct kinst_cpu_state, kinst_state); then, any time you want to access the state, write ks = DPCPU_PTR(kinst_state);. Implementing it this way requires less code and is more efficient. | |