The inout instruction in AMD SVM requires CPU feature to deocde the
segment override prefix. However, without that feature, we are still
able to decode manually by fetch the instruction directly.
Details
- Reviewers
khng markj - Group Reviewers
bhyve - Commits
- rGc18c521c79b6: vmm: Support INOUT manual decode.
Diff Detail
- Repository
- rG FreeBSD src repository
- Lint
Lint Passed - Unit
No Test Coverage - Build Status
Buildable 65526 Build 62409: arc lint + arc unit
Event Timeline
sys/amd64/vmm/amd/svm.c | ||
---|---|---|
747 | Note that vmm_fetch_instruction() can return EFAULT, so the caller's test s >= 0 might ignore errors. It would be better to return the segment number using a separate parameter, and use the return value to signal an error or success (0). | |
750 | ||
755 | I'd prefer this to be an explicit panic(). |
Tested on AMD Opteron(tm) Processor 6136.
I am thinking about providing instruction buffer like uint8_t inst[VIE_INST_SIZE] only instead of the whole vie to vmm_fetch_instruction by force type casting since I only want the first byte of the instruction. Do you think it is safe? I have taken a look on the vmm_fetch_instruction function and think that it won't create any out of bound.
Some minor nits to the commit message:
- deocde -> decode
- Maybe we can directly call "DecodeAssist feature" instead of "CPU feature to decode"?
- by fetch -> by fetching
- My personal preference is using capital after the component name ("vmm" here), so the title could be "vmm: Support INOUT manual decode"
sys/amd64/vmm/amd/svm.c | ||
---|---|---|
752 | Why not use vmm_decode_instruction()? It will find the segment index for you, and also handles the rep prefix. From reading the OUTS instruction description, I think you have to handle the default prefix case too. |
You are right, the string can be with rep prefix. At the same time, I think we can decode the prefix only because all of other informations is contained inside the vmexit info.
sys/amd64/vmm/amd/svm.c | ||
---|---|---|
738 | Should we initialize fault to 0? I believe (albeit with invalid inputs) vm_copy_setup can return non-zero but not set *fault and it seems generally reasonable to ensure it's initialized. |
sys/amd64/vmm/amd/svm.c | ||
---|---|---|
738 | I thought
have already addressed that? |
sys/amd64/vmm/amd/svm.c | ||
---|---|---|
752 | I don't think the value vme->u.inst_emul.gla is valid. That gets set in svm_inout_str_seginfo(), but that's for a VMCB_EXIT_NPF exit. Actually, does this use of vmm_decode_instruction() work at all? outs is not listed in one_byte_opcodes[], so won't decode_opcode() fail? | |
756 | I think we want to also verify that the decoded instruction is an outs, since it's possible that the guest modified the instruction, but see my other comment above. |
sys/amd64/vmm/amd/svm.c | ||
---|---|---|
752 | I do check on the Opteron machine and the guest starts without problem. I also check the kernel without this patch and it shows the expected vmexit fault and the bhyve stop immediately. But you are right, it is strange that decode_opcode works since the instruction is not listed. vm exit[0] |
sys/amd64/vmm/amd/svm.c | ||
---|---|---|
752 | Ah, I think you are missing a vie_init() call? |
sys/amd64/vmm/amd/svm.c | ||
---|---|---|
752 | Emm, I don't think vie_init is necessary since it just copy the instruction buffer and set the length. In the previous codes do fetch + decode inside the I try to do some printf in vmm.ko and it shows the binary only execute INS without OUTS. That is the reason why it works since the original check in if (inout_string && !decode_assist()) return (UNHANDLED); is regardless of you are in or out. Also, the in direction cannot be overrides. A quick workaround now is that we can move the UNHANDLED check to svm_inout_str_seginfo and make it check the out direction only. |
sys/amd64/vmm/amd/svm.c | ||
---|---|---|
752 | However, what we need actually is the prefix only. Maybe we can make decode_prefix global then call decode_prefix only? |
sys/amd64/vmm/amd/svm.c | ||
---|---|---|
752 |
Well, it also zeroes the structure. In particular, vie.op is zeroed, so vie->op.op_type == VIE_OP_TYPE_NONE. I think the test in decode_opcode() is returning early because it's testing uninitialized memory.
There, the vie_init() happens in vmexit_inst_emul(), before we exit the inner loop. After exiting the loop, vm_handle_inst_emul() uses it (possibly more than once).
So, let's try handling only INS then? Assuming that that's sufficient, anyway. Then we can hard-code ES. BTW, did you verify that other info1 fields are correct, even without DecodeAssist?
I prefer not to take such shortcuts. If we want to handle OUTS as well, then we should 1) fetch and decode the full instruction, 2) verify that it is indeed an OUTS, 3) extract the segment register. |
sys/amd64/vmm/amd/svm.c | ||
---|---|---|
751 | This definitely needs a comment explaining what we're doing here. | |
sys/amd64/vmm/vmm_ioport.c | ||
172 | This is too magical. I prefer something like the following:
| |
182 | This should still verify that the decoded instruction is an outs. |
sys/amd64/vmm/vmm_ioport.c | ||
---|---|---|
172 |
Checked by _Static_assert |
This is looking good, my comments are minor.
sys/amd64/vmm/amd/svm.c | ||
---|---|---|
740 | ||
753 | ||
754 | ||
755 | ||
756 | ||
762 | Now it is much easier to see that error != 0 is impossible here. So, I think the explicit panic() is no longer necessary. Let's make svm_get_cs_info() return void, and internally it can assert error == 0, similar to VMX inout_str_seginfo(). | |
830 | Please initialize these fields in VMX inout_str_seginfo() too. | |
sys/amd64/vmm/vmm_ioport.c | ||
170 | It is ok to move the enum to vmm_instruction_emul.h (just be careful to verify that bhyve still compiles, since vmm_instruction_emul.* is compiled in userspace as well). Or, you can add some predicate function vie_instruction_is_outs() to vmm_instruction_emul.c. Either option is fine IMO. |