Page MenuHomeFreeBSD

Execute mca_init() under the ap_boot_mtx spinlock.
ClosedPublic

Authored by kib on Apr 22 2018, 5:05 PM.
Tags
None
Referenced Files
Unknown Object (File)
Tue, Nov 19, 3:05 PM
Unknown Object (File)
Tue, Nov 19, 1:43 AM
Unknown Object (File)
Sun, Nov 17, 2:14 PM
Unknown Object (File)
Sun, Nov 17, 11:04 AM
Unknown Object (File)
Thu, Nov 14, 2:55 AM
Unknown Object (File)
Wed, Nov 13, 7:15 AM
Unknown Object (File)
Tue, Nov 5, 4:04 AM
Unknown Object (File)
Tue, Nov 5, 4:01 AM
Subscribers

Details

Summary

Otherwise, under bootverbose, the lapic_enable_cmc() banner 'lapicX: CMCI unmasked' is printed by several CPUs in parallel, causing garbled output for the LAPIC dumps.

Diff Detail

Lint
Lint Skipped
Unit
Tests Skipped

Event Timeline

Humm, that's not just a cosmetic error. cmci_monitor() assumes it is not being run concurrently on multiple CPUs so that if a given MC# bank is shared, only one CPU monitors it. It seems this is a very old bug of mine from the original CMCI commit. :( Arguably mca_init() should perhaps lock the mca_lock, but that is a bit trickier perhaps. The lock isn't initialized when mca_init() is called for the BSP. While I would prefer to do locking in mca.c for clarity reasons it would be a bit messier. This is ok, but perhaps add a comment to mca_init() saying we rely on the ap_boot_mtx to serialize it (just the boot-time mca_init(), not the resume mca_resume()).

Thanks for looking into this!

In D15157#319773, @jhb wrote:

Humm, that's not just a cosmetic error. cmci_monitor() assumes it is not being run concurrently on multiple CPUs so that if a given MC# bank is shared, only one CPU monitors it. It seems this is a very old bug of mine from the original CMCI commit. :( Arguably mca_init() should perhaps lock the mca_lock, but that is a bit trickier perhaps. The lock isn't initialized when mca_init() is called for the BSP. While I would prefer to do locking in mca.c for clarity reasons it would be a bit messier. This is ok, but perhaps add a comment to mca_init() saying we rely on the ap_boot_mtx to serialize it (just the boot-time mca_init(), not the resume mca_resume()).

I'm on vacaticions until the 28th, so I've got limited laptop/Internet access and no way to test changes, but maybe you could just pick the lock in _mca_init for the APs (PCPU_GET(cpuid) != 0), and leave the BSP path as-is?

IMO the issue with the cmci_monitor() is unrelated to this one. ap_boot_mtx provided the serialization for the banners, and mca_lock is for mca/mce. I updated the patch to fix both issues, but will split it for the commits, after the review.

Fix issue explained by John.

sys/x86/x86/mca.c
1106

It would be good to be consistent with testing for the BSP by either fixing other tests here to use IS_BSP. That said, there's a chance CPUID_MCE might be set but not MCA (e.g. Pentium) and in this case the mutex would not be initialized where it is locked now. I would perhaps move it inside the MCA block just around the for loop iterating over the banks. In that case you could also remove the BSP check to simplify the code and just use 'if (boot)' (which avoids the IS_BSP inconsistency thing).

sys/x86/x86/mca.c
1106

Can you explain why BSP check is not needed inside the loop ?

Move the locks inside the MCA block, keeping the IS_BSP check for now.

sys/x86/x86/mca.c
1129

It doesn't hurt to have the check, but by this point the lock will always be initialized so it's safe to acquire it anyway on the BSP even if it's kind of pointless merely to make the code simpler. Always acquiring the lock also means you probably can use a simpler comment of just 'Configuring CMCI monitoring must be single-threaded.' or some such.

Remove IS_BSP() check, simplify the comment.

This revision is now accepted and ready to land.Apr 24 2018, 6:23 PM
This revision was automatically updated to reflect the committed changes.