Page MenuHomeFreeBSD

x86: for wrmsr_early_safe(), catch all exceptions, not only #GP
ClosedPublic

Authored by kib on Wed, May 27, 8:25 AM.
Tags
None
Referenced Files
Unknown Object (File)
Fri, Jun 19, 9:53 PM
Unknown Object (File)
Wed, Jun 17, 2:32 AM
Unknown Object (File)
Wed, Jun 17, 2:31 AM
Unknown Object (File)
Wed, Jun 17, 1:52 AM
Unknown Object (File)
Wed, Jun 17, 12:20 AM
Unknown Object (File)
Tue, Jun 16, 2:49 PM
Unknown Object (File)
Tue, Jun 16, 10:17 AM
Unknown Object (File)
Tue, Jun 9, 10:07 AM

Details

Summary

Only #GP and #UD are documented for WRMSR by SDM, but idea is that it might be possible that something else happens.

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

kib requested review of this revision.Wed, May 27, 8:25 AM

Since wrmsr_early_safe_gp_handler() pops the hardware error code, isn't that going to botch upper frames of the stack for exceptions not pushing one?

Since wrmsr_early_safe_gp_handler() pops the hardware error code, isn't that going to botch upper frames of the stack for exceptions not pushing one?

This is exactly what the comment above the loop says. But the expectation is that such exceptions should not be raised by WRMSR at all, like divide fault or debug exception. I initially though of catching the page fault, but then decided to catch everything. I believe that prot fault, page fault, and possibly stack fault are the most important.

Correctly handle exceptions without error codes.

In D57264#1312294, @kib wrote:

Simplify greatly.

Indeed. Was going to ask about the stack's computations in the previous version (to have some comment describing it; it looks correct to me according to the SDM, but I'm not an expert).

My only remaining worry now is that the SDM repeats in a number of places that IRETQ must be used to return from handlers (don't know what would be the practical consequences of not doing it; from other parts of the SDM, it looks like that doesn't really matter for regular operation; not sure for shadow stacks, but we do not support them anyway?).

My only remaining worry now is that the SDM repeats in a number of places that IRETQ must be used to return from handlers (don't know what would be the practical consequences of not doing it; from other parts of the SDM, it looks like that doesn't really matter for regular operation; not sure for shadow stacks, but we do not support them anyway?).

Even if we would support shadow stacks, we definitely do not need/enable them at the boot stage where wrmsr_early_safe() is used.
IRET only matters for NMIs, there if we get NMI we already dead. If we are not, NMIs will be enabled by next IRET.

Thanks for the explanations.

This revision is now accepted and ready to land.Wed, May 27, 12:47 PM

Any chance we can get a version of this which records which other exception is happening? I don't think we want that in production but if we can get someone who is experiencing the issue to run that it would be good to make sure we fully understand the problem.

(I'm more concerned about "full understanding" now than I would normally be since I might want to land this in a 15.1-RC2 tomorrow...)

Any chance we can get a version of this which records which other exception is happening? I don't think we want that in production but if we can get someone who is experiencing the issue to run that it would be good to make sure we fully understand the problem.

Such version would require much more efforts, and I currently do not think it is justified. In reported cases where the patch 'helps', it was said that ucode was updated. This means that no exception is generated.
My guess right now that we have an issue with the layout.

(I'm more concerned about "full understanding" now than I would normally be since I might want to land this in a 15.1-RC2 tomorrow...)

I do not know what happens, at all.