Page MenuHomeFreeBSD

x86: Halt non-BSP CPUs on panic IPI_STOP
ClosedPublic

Authored by cem on Apr 23 2019, 12:34 AM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Nov 26, 2:29 PM
Unknown Object (File)
Mon, Nov 25, 6:57 AM
Unknown Object (File)
Mon, Nov 18, 4:28 PM
Unknown Object (File)
Mon, Nov 18, 4:24 PM
Unknown Object (File)
Sat, Nov 16, 11:56 AM
Unknown Object (File)
Thu, Nov 14, 5:13 PM
Unknown Object (File)
Thu, Nov 14, 4:51 PM
Unknown Object (File)
Tue, Nov 12, 4:44 PM
Subscribers

Details

Summary

We may need the BSP to reboot, but we don't need any AP CPU that isn't the
panic thread. Any CPU landing in this routine during panic isn't the panic
thread, so we can just detect !BSP && panic and shut down the logical core.

Test Plan

For some reason I thought we had already done something like this, but I'm not
seeing it.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

The hlt instruction there is a route in one way: only NMI or INIT IPI could unblock the core. Once I had a patch which allowed migrating the active thread in ddb from current cpu to arbitrary core, e.g. to look at the local state like APIC or MSRs not visible outside the CPU.

sys/x86/x86/mp_x86.c
1419 ↗(On Diff #56519)

Operator is usually kept at the line which is broken.

I think the expression can be simplified as !IS_BSP(), on x86 boot cpu has cpuid 0.

In D20019#430395, @kib wrote:

The hlt instruction there is a route in one way: only NMI or INIT IPI could unblock the core. Once I had a patch which allowed migrating the active thread in ddb from current cpu to arbitrary core, e.g. to look at the local state like APIC or MSRs not visible outside the CPU.

We could add a tunable to disable this behavior for such a patch, if you like.

sys/x86/x86/mp_x86.c
1419 ↗(On Diff #56519)

Thanks, that does look better.

In D20019#430453, @cem wrote:
In D20019#430395, @kib wrote:

The hlt instruction there is a route in one way: only NMI or INIT IPI could unblock the core. Once I had a patch which allowed migrating the active thread in ddb from current cpu to arbitrary core, e.g. to look at the local state like APIC or MSRs not visible outside the CPU.

We could add a tunable to disable this behavior for such a patch, if you like.

I do not think it is needed for now, it would only clog the patch.

In D20019#430480, @kib wrote:
In D20019#430453, @cem wrote:

We could add a tunable to disable this behavior for such a patch, if you like.

I do not think it is needed for now, it would only clog the patch.

Ok, I will leave it out but fix the IS_BSP condition.

Use more straightforward IS_BSP macro

cem marked an inline comment as done.Apr 23 2019, 11:36 PM
This revision is now accepted and ready to land.Apr 24 2019, 8:04 AM
sys/x86/x86/mp_x86.c
1400 ↗(On Diff #56559)

You do not need this variable anymore.

sys/x86/x86/mp_x86.c
1400 ↗(On Diff #56559)

Ah, thanks for catching that. Will fix before commit.

This revision was automatically updated to reflect the committed changes.

A belated note that it is possible use monitor/mwait mechanism to get the power saving and inter-processor signaling at the same time.
I once had it implemented, but that code is quite out of date now. Also, it was interleaved with another, unrelated change.
Some bits:

In D20019#431068, @avg wrote:

A belated note that it is possible use monitor/mwait mechanism to get the power saving and inter-processor signaling at the same time.
I once had it implemented, but that code is quite out of date now. Also, it was interleaved with another, unrelated change.

Yes, I'm all for a version that uses that in preference (conditional on the monitor/mwait cpuid bit). It could even be used for the spin, albeit maybe increasing wakeup latency for fast IPI_STOPs (rendezvous operations? No, I guess that's a different IPI). Maybe there is no such issue as IPI_STOP needing low latency wakeup.

In D20019#431206, @cem wrote:
In D20019#431068, @avg wrote:

A belated note that it is possible use monitor/mwait mechanism to get the power saving and inter-processor signaling at the same time.
I once had it implemented, but that code is quite out of date now. Also, it was interleaved with another, unrelated change.

Yes, I'm all for a version that uses that in preference (conditional on the monitor/mwait cpuid bit). It could even be used for the spin, albeit maybe increasing wakeup latency for fast IPI_STOPs (rendezvous operations? No, I guess that's a different IPI). Maybe there is no such issue as IPI_STOP needing low latency wakeup.

I thought Anton implemented that in OneFS...?

In D20019#431206, @cem wrote:

Yes, I'm all for a version that uses that in preference (conditional on the monitor/mwait cpuid bit). It could even be used for the spin...

I thought Anton implemented that in OneFS...?

Yes, there is a version there. It doesn’t have the fallback behavior. I’d like to upstream it or at least start discussion. I had forgotten about it when I submitted this reiteration of the same basic idea — I was clearing out an old git tree from 2016 or 2015 to free some disk space and had forgotten the context.