
bhyve(8): For prototyping, reattempt decode in userspace

Authored by cem on Apr 17 2020, 4:47 AM.



If userspace has a newer bhyve than the kernel, it may be able to decode
and emulate some instructions vmm.ko is unaware of. This is more for
prototyping than anything else.
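
The fallback described above can be pictured with a minimal sketch. Every name here (`demo_exit`, `demo_handle_inst_emul`, `demo_decode_vex`) is hypothetical and invented for illustration; none of it is the bhyve or vmm.ko API. It assumes the kernel hands userspace the raw instruction bytes together with a flag saying whether its own decode succeeded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of an instruction-emulation exit. */
struct demo_exit {
	bool	decoded;	/* true if the kernel decoded the instruction */
	uint8_t	inst[15];	/* raw instruction bytes fetched by the kernel */
	uint8_t	num_valid;	/* number of valid bytes in inst[] */
};

/* A userspace decoder: returns 0 on success, nonzero on failure. */
typedef int (*demo_decode_fn)(const uint8_t *inst, size_t len);

/*
 * If the kernel could not decode the instruction, give the (possibly
 * newer) userspace decoder a chance before giving up.
 * Returns 0 if the instruction is decoded, -1 otherwise.
 */
static int
demo_handle_inst_emul(struct demo_exit *ex, demo_decode_fn user_decode)
{
	assert(ex != NULL && user_decode != NULL);

	if (!ex->decoded) {
		if (user_decode(ex->inst, ex->num_valid) != 0)
			return (-1);
		ex->decoded = true;
	}
	return (0);
}

/* Toy decoder that only recognizes the three-byte VEX prefix (0xC4). */
static int
demo_decode_vex(const uint8_t *inst, size_t len)
{
	return ((len > 0 && inst[0] == 0xC4) ? 0 : -1);
}
```

With this shape, an exit the kernel already decoded passes straight through, and only the undecoded case pays for the slower userspace path.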

Diff Detail

rS FreeBSD src repository - subversion
Automatic diff as part of commit; lint not applicable.
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

Nice change. GCE has moved instruction emulation out of kernel KVM, and it would be interesting for bhyve to also have that ability: this is a step towards that.

This revision is now accepted and ready to land. Apr 17 2020, 5:17 AM

Well, it may be somewhat broken. I can’t tell if it’s just vmexit for every APIC poll which is bog-slow, or if emulation behavior of bextr is incorrect, or what, but my VM with the VEX instruction mostly hangs when it gets here. How do you usually debug this kind of code?

(I have logic not shown here to print out userspace-decoded vies, but after the initial implementation that print became way too slow to run the VM.)

One rough edge is that the !decoded vie from the kernel vmexit may contain partial decode state from the kernel. It isn’t a problem for (valid) VEX instructions, as those have a unique prefix byte we never handled before. But we may need something like vie_init() that doesn’t trash the thread-specific register state.

Debugging the APIC is a slog. Start with KTR compiled in and single-vCPU guests. See what accesses are being done with BEXTR and what the device emulation is returning. Use FreeBSD as a guest and ddb to put in guest breakpoints so the KTR log can be examined. You may have to spend a lot more time with SDM Vol. 3, Ch. 10 than you would care for.

Turns out it was just missing these kernel-emulated devices in our userspace memory map. I haven't tested D24525 yet, but barring bugs, I think that should resolve this problem.

Actually, I guess there is no dependency relationship here. We can fall back to userspace decode just fine for memory regions userspace has knowledge of. Just my bad luck that the first missing instruction happened to be in a PIC access rather than PCI MMIO.

cem planned changes to this revision. Apr 22 2020, 9:10 PM

(I intend to reset any partial vie decoding prior to re-attempting decode.)

Reset decoding state before attempting decode in userspace.

This is significant because a partial decode could advance at least the vie
cursor (num_processed). (Maybe other state relies on the initial zero values,
but this one definitely does.) Be careful not to discard the instruction
contents during restart.
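
As a rough sketch of the reset described above: clear the decode cursor and any derived state, but keep the fetched instruction bytes and their count. The struct and function names (`demo_vie`, `demo_vie_restart`) are hypothetical stand-ins and the field layout is an assumption for illustration, not the actual `struct vie`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_INST_SIZE	15	/* x86 instructions are at most 15 bytes */

/* Hypothetical, simplified stand-in for the real decode state. */
struct demo_vie {
	uint8_t	inst[DEMO_INST_SIZE];	/* raw instruction bytes */
	uint8_t	num_valid;		/* valid bytes in inst[] */
	uint8_t	num_processed;		/* decode cursor */
	uint8_t	decoded;		/* nonzero once decode succeeded */
	uint8_t	opcode;			/* example of derived decode state */
};

/*
 * Zero all decode state so a fresh decode starts from byte 0, while
 * preserving the instruction contents and length.
 */
static void
demo_vie_restart(struct demo_vie *vie)
{
	uint8_t saved[DEMO_INST_SIZE];
	uint8_t nvalid = vie->num_valid;

	assert(nvalid <= DEMO_INST_SIZE);
	memcpy(saved, vie->inst, sizeof(saved));
	memset(vie, 0, sizeof(*vie));
	memcpy(vie->inst, saved, sizeof(saved));
	vie->num_valid = nvalid;
}
```

Without the reset, a second decode pass would start from wherever the partial kernel decode left `num_processed`, and so would skip the leading bytes of the instruction.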

Sorry about the long delay! Finally clearing some backlog.

Minor update: In the seemingly impossible case where inst_length is zero, don't
leave trash in the vie->num_valid field.
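
One way to read this update, sketched with hypothetical names (`demo_vie`, `demo_vie_set_bytes`) rather than the real code: write the length field unconditionally, so a zero-length instruction still overwrites whatever value was there before, and only skip the byte copy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_INST_SIZE	15	/* x86 instructions are at most 15 bytes */

/* Hypothetical, simplified stand-in for the real decode state. */
struct demo_vie {
	uint8_t	inst[DEMO_INST_SIZE];	/* raw instruction bytes */
	uint8_t	num_valid;		/* valid bytes in inst[] */
};

/*
 * Install instruction bytes. The copy is skipped when inst_length is
 * zero, but num_valid is always assigned, so no stale value survives.
 */
static void
demo_vie_set_bytes(struct demo_vie *vie, const uint8_t *bytes,
    uint8_t inst_length)
{
	assert(inst_length <= DEMO_INST_SIZE);

	memset(vie->inst, 0, sizeof(vie->inst));
	if (inst_length != 0)
		memcpy(vie->inst, bytes, inst_length);
	vie->num_valid = inst_length;
}
```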

This revision is now accepted and ready to land. Jun 23 2020, 11:44 PM

Need to figure out a way to get the userspace build to observe the new kernel headers. The <machine/foo> symlink headers aren't pointing at the ones in SRCTOP. I guess I could put __FreeBSD_version conditional dummy prototypes in a userspace header and bump that?

Or change the build to point at kernel source headers directly, something like this: Thoughts?