Page MenuHomeFreeBSD

Free MCA entries after logging
ClosedPublic

Authored by jtl on May 31 2019, 5:22 PM.
Tags
None
Referenced Files
F108515333: D20482.diff
Sat, Jan 25, 7:45 PM
Unknown Object (File)
Sun, Jan 19, 7:28 AM
Unknown Object (File)
Tue, Jan 14, 1:31 PM
Unknown Object (File)
Sat, Jan 11, 7:54 AM
Unknown Object (File)
Sat, Jan 11, 6:43 AM
Unknown Object (File)
Wed, Jan 8, 10:15 AM
Unknown Object (File)
Dec 23 2024, 6:59 AM
Unknown Object (File)
Dec 21 2024, 4:14 PM

Details

Summary

Currently, MCA entries remain on an every-growing linked list. This means that it becomes increasingly expensive to process a steady stream of correctable errors.

After logging an entry, free it to the free list. Expand the free list maintenance routine to free or allocate entries to keep the free list in an acceptable range.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 24746
Build 23507: arc lint + arc unit

Event Timeline

So I have had tools in the past that parsed the list from the kernel. See https://github.com/freebsd/freebsd/compare/master...bsdjhb:libsmbios_ecc. At the very least I think there should perhaps be a tunable/sysctl to control the behavior.

In D20482#442176, @jhb wrote:

So I have had tools in the past that parsed the list from the kernel. See https://github.com/freebsd/freebsd/compare/master...bsdjhb:libsmbios_ecc. At the very least I think there should perhaps be a tunable/sysctl to control the behavior.

Yeah, I wondered if there was a reason like this for the current behavior.

Would you be open to the following sort of (perhaps over-engineered) solution?

  • Split mca_records into two lists: those waiting to be logged and those which have already been logged. (This avoids the CPU spinning through an every-increasing list every time there is a new error.)
  • Create an upper bound on the number of records on the "has been logged" list, which can default to the current behavior of "keep everything".

I think having two lists with a cap on already-logged is fine, sure.

Try 2:
Maintain the last N records in the mca_records list. N is user-configurable and defaults to -1 (unlimited; the current behavior).

When the mca_records list grows beyond N records, free the oldest ones.

This revision is now accepted and ready to land.May 31 2019, 8:55 PM
sys/x86/x86/mca.c
125

Extra blank line?

561

This should perhaps be renamed to 'mca_resize_freelist' since it can now shrink as well as grow it.

563–565

I have a slight preference for 'desired_max, desired_min' instead of 'low' and 'high'.

569

Maybe shorten a bit by saying something like:

* Ensure we have at least one record fore each bank and
* one record per CPU, but no more than twice that amount.
* Free records first in case a concurrent thread uses records
* while this thread is freeing them.

One thing you might consider is to do a batched free to avoid banging on the lock as much. Something like this:

STAILQ_HEAD(, mca_internal) tofree;
struct mca_internal *rec, *next;
int desired_min, desired_max;

desired_min = ...
desired_max = ...
mtx_lock_spin(&mca_lock);
if (mca_freecount > desired_max) {
     STAILQ_INIT(&tofree);
     while (mca_freecount > desired_max) {
          rec = STAILQ_FIRST(&mca_freelist);
          STAILQ_REMOVE_HEAD(&mca_freelist, link);
          STAILQ_INSERT_TAIL(&tofree, rec, link);
          mca_freecount--;
     }
     mtx_unlock_spin(&mca_lock);
     STAILQ_FOREACH_SAFE(rec, &tofree, link, next) {
          free(rec, M_MCA);
     }
     mtx_lock_spin(&mca_lock);
}
while (mca_freecount < desired_min) {
     ...
}

Incorporate feedback from @jhb.

This revision now requires review to proceed.May 31 2019, 11:25 PM
jtl marked 4 inline comments as done.May 31 2019, 11:27 PM
jtl added inline comments.
sys/x86/x86/mca.c
125

Removed.

561

Done. I also changed other "refill" things to "resize".

563–565

Changed. (Doh! count is now not in alphabetical order. I'll try to remember to fix that before committing.)

569

I mostly did this, but implemented it slightly differently.

jhb added inline comments.
sys/x86/x86/mca.c
591–592

For this I think you can use STAILQ_CONCAT to join the lists as a single O(1) operation.

    mtx_lock_spin(&mca_lock);
    STAILQ_CONCAT(&mca_freelist, &tmplist);
}
629–630

I feel like we can maybe drop this now since you are doing this in the interrupt handler that calls this as well. In particular, this might queue the task even though the interrupt handler is going to free the record after logging it.

This revision is now accepted and ready to land.Jun 6 2019, 12:14 AM
jtl marked 4 inline comments as done.Jun 6 2019, 12:43 AM
jtl added inline comments.
sys/x86/x86/mca.c
591–592

Good point! I'll make the change.

629–630

The interrupt handler will only queue the task if it actually does free a record. By default (hw.mca.maxcount == -1), it won't free any records, so it won't queue the task. If a user changes hw.mca.maxcount to a non-negative number, we will probably queue the task more than necessary. AFAICT, the only reliable way around that is to only queue the task in the interrupt handler after logging. However, that may delay refilling the free list longer than may be desirable. Or, is that not an issue and we should just unconditionally queue the task at the end of the interrupt handler?

sys/x86/x86/mca.c
629–630

It probably is fine to just wait for the end of the interrupt handler and do it if the scan found anything. (So just 'if (!cold)' instead of needing the 'doresize' check.) It will be simpler all around I think.

Changes based on review:

  • Switched to using STAILQ_CONCAT to refill the free list.
  • Dropped redundant calls to resize the free list.
  • Centralized the logging logic to reduce code duplication.
This revision now requires review to proceed.Jun 7 2019, 9:49 AM
jtl marked 3 inline comments as done.Jun 7 2019, 10:15 AM

In my local tree, I added functions to generate fake machine check records and exercise the logic now found in mca_process_records(). It appears to work correctly. I *think* this is now ready to land.

This revision is now accepted and ready to land.Jun 7 2019, 6:36 PM
This revision was automatically updated to reflect the committed changes.