Always print the first 50 messages of each type. After that, optionally rate-limit the messages. This provides a way to limit the overhead of processing excessive messages without suppressing the first few of any type.
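The policy above can be sketched in user-space C roughly as follows. This is only an illustration, not the code in this commit: the names (msg_limiter, should_print, MSG_TYPE_COUNT) and the one-message-per-interval rule are assumptions made for the example. Only the threshold of 50 comes from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSG_TYPE_COUNT	4	/* hypothetical number of message types */
#define ALWAYS_PRINT	50	/* the first 50 of each type always print */

/* Per-type state; the layout is an assumption made for this sketch. */
struct msg_limiter {
	uint64_t seen;		/* messages of this type seen so far */
	uint64_t last_print;	/* time (in seconds) of the last print */
};

static struct msg_limiter limiters[MSG_TYPE_COUNT];

/*
 * Decide whether a message of the given type should be printed.
 * An interval_secs of 0 means rate limiting is disabled.
 */
static bool
should_print(unsigned type, uint64_t now_secs, uint64_t interval_secs)
{
	struct msg_limiter *ml;

	assert(type < MSG_TYPE_COUNT);
	ml = &limiters[type];
	ml->seen++;

	/* Never suppress the first ALWAYS_PRINT messages of a type. */
	if (ml->seen <= ALWAYS_PRINT || interval_secs == 0) {
		ml->last_print = now_secs;
		return (true);
	}

	/* Afterwards, allow at most one message per interval. */
	if (now_secs - ml->last_print >= interval_secs) {
		ml->last_print = now_secs;
		return (true);
	}
	return (false);
}
```

Because the state is kept per type, a flood of one message type does not consume the 50-message allowance of another type.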
As part of this change, we are switching from direct printf() calls to collecting data in an sbuf(9). In POLLED mode (run from a task queue), we dynamically allocate the buffer. In the other modes (which are likely called from a hardware interrupt), we use a buffer allocated from the BSS segment and guarded by a lock. In normal operation, most calls to mca_log() should come from the POLLED mode, so there should be no contention for the new lock. If there is an interrupt storm which exceeds the capacity of the free list, there will be new contention for this lock; however, overall lock contention should still be lower than it was prior to e770e32aa3a0, when the mca_lock was held for the entirety of the mca_log() call.
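A rough user-space analogue of the buffer selection described above is shown here. It is a sketch under stated assumptions, not the kernel code: it uses a plain char buffer with snprintf() in place of sbuf(9), malloc() in place of the kernel allocator, and a C11 atomic_flag spin lock in place of the kernel lock. The names log_record, log_mode, static_buf and LOG_BUF_SIZE are hypothetical.

```c
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

#define LOG_BUF_SIZE	512	/* hypothetical buffer size */

enum log_mode { MODE_POLLED, MODE_INTERRUPT };	/* hypothetical modes */

/* Static (BSS) buffer shared by the non-polled callers. */
static char static_buf[LOG_BUF_SIZE];
static atomic_flag static_buf_lock = ATOMIC_FLAG_INIT;

/*
 * Format one record into a buffer and emit it with a single write.
 * Returns the number of characters emitted, or -1 on failure.
 */
static int
log_record(enum log_mode mode, FILE *out, int bank, unsigned long status)
{
	char *buf;
	int len;

	if (mode == MODE_POLLED) {
		/* Polled mode may allocate: no shared lock is taken. */
		buf = malloc(LOG_BUF_SIZE);
		if (buf == NULL)
			return (-1);
	} else {
		/* Other modes cannot allocate: spin for the shared buffer. */
		while (atomic_flag_test_and_set_explicit(&static_buf_lock,
		    memory_order_acquire))
			;
		buf = static_buf;
	}

	len = snprintf(buf, LOG_BUF_SIZE, "bank %d status 0x%lx\n",
	    bank, status);
	if (len >= LOG_BUF_SIZE)
		len = LOG_BUF_SIZE - 1;	/* output was truncated */
	if (len > 0)
		fputs(buf, out);

	if (mode == MODE_POLLED)
		free(buf);
	else
		atomic_flag_clear_explicit(&static_buf_lock,
		    memory_order_release);
	return (len);
}
```

In this sketch only the non-polled path touches the lock, which mirrors the expectation that the common polled path sees no contention.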
This commit is partly based on a patch proposed by Loic Prylli <lprylli@netflix.com>.