amd_intr() does not account for the offset (0x200) in the counter MSR addresses. As a result, reading counter values past the 4th counter accesses invalid MSRs (0xC001000[8,9,..]), and the handler erroneously updates the counter values for counters [1-4].
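The addressing problem can be illustrated with a minimal sketch. The function names and the constants other than the 0x200 offset are assumptions made for this example (in particular the stride of 2 and the "+ 1" counter slot), not the driver's actual definitions; the real layout should be checked against the processor documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define MSR_BASE          0xC0010000u
#define LEGACY_PERFCTR_0  (MSR_BASE + 0x004u)  /* assumed legacy counter 0 */
#define CORE_EXT_OFFSET   0x200u               /* offset named in the text */
#define CORE_EXT_STRIDE   2u                   /* assumed: select/counter pairs */
#define CORE_EXT_MAX      16u                  /* hypothetical upper bound */

/* Naive addressing: base + index. Only meaningful for indices 0..3. */
static uint32_t
naive_perfctr_msr(unsigned idx)
{
	return (LEGACY_PERFCTR_0 + idx);
}

/* Offset-aware addressing for the extended core counters (assumed layout). */
static uint32_t
core_perfctr_msr(unsigned idx)
{
	assert(idx < CORE_EXT_MAX);
	return (MSR_BASE + CORE_EXT_OFFSET + CORE_EXT_STRIDE * idx + 1u);
}
```

With the naive helper, index 4 lands on 0xC0010008, which is the first of the invalid addresses mentioned above. A handler that takes the address from a per-counter descriptor, or computes it as in the second helper, avoids that.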
amd_intr() should only check core PMCs for interrupts, since other types of PMCs (L3, DF) cannot generate interrupts.
Newer AMD processors have an NMI latency, due to which PMC NMIs are ignored in certain scenarios, such as:
Stopping a PMC (pcd_stop_pmc() (hwpmc_mod.c)) followed by marking the PMC as free (pcd_config_pmc( , , NULL)). In this scenario we receive an NMI after the PMC has been marked free, and the interrupt ends up being ignored.
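The race can be modeled with a small sketch. Everything here is hypothetical (the struct, the helper names, the grace period and its units); it shows one possible mitigation, a short grace window after a PMC is freed, and is not necessarily the mechanism this change uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NPMC        6u    /* hypothetical number of core counters */
#define GRACE_TICKS 100u  /* hypothetical grace period, arbitrary units */

struct cpu_pmc_state {
	const void *owner[NPMC];       /* NULL: slot is free (unconfigured) */
	bool        overflowed[NPMC];  /* models the hardware overflow state */
	uint64_t    grace_until;       /* late NMIs are claimed before this time */
};

/* Stop counter ri and mark it free; remember that an NMI may be in flight. */
static void
stop_and_free(struct cpu_pmc_state *cs, unsigned ri, uint64_t now)
{
	assert(ri < NPMC);
	cs->owner[ri] = NULL;
	cs->grace_until = now + GRACE_TICKS;
}

/* Returns true when the NMI is claimed by the PMC handler. */
static bool
handle_nmi(struct cpu_pmc_state *cs, uint64_t now, bool use_grace)
{
	bool claimed = false;

	for (unsigned i = 0; i < NPMC; i++) {
		if (cs->owner[i] == NULL || !cs->overflowed[i])
			continue;	/* free slots are skipped */
		cs->overflowed[i] = false;
		claimed = true;
	}
	if (!claimed && use_grace && now < cs->grace_until)
		claimed = true;	/* attribute the NMI to a just-freed PMC */
	return (claimed);
}
```

Without the grace window, an NMI that arrives after stop_and_free() finds no configured PMC and goes unclaimed. With it, the late NMI is claimed as long as it arrives before the window expires. The trade-off is that an unrelated NMI arriving inside the window would also be claimed.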
A similar fix for Linux was done here:
PMCs interrupting at the same time are collapsed into one interrupt. With the current interrupt handler, most samples are missing for one of the events when two such events are used (unhalted_cpu_cycles and retired_instructions). Of the two, the one allocated first reports all the samples, while the other reports a negligible number.
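The effect of collapsed interrupts can be sketched as follows. This is a simplified model with hypothetical names, not the driver code: the overflow state of the core counters is passed in as a bitmask, and the handler either stops at the first overflowed counter or services all of them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NCORE 6u  /* hypothetical number of core counters */

/*
 * Record one sample for each overflowed core counter in overflow_mask.
 * With stop_at_first set, only the lowest-numbered overflowed counter is
 * serviced per interrupt. Returns the number of counters serviced.
 */
static unsigned
service_overflows(uint32_t overflow_mask, bool stop_at_first,
    unsigned samples[NCORE])
{
	unsigned handled = 0;

	assert(samples != NULL);
	for (unsigned i = 0; i < NCORE; i++) {
		if ((overflow_mask & (1u << i)) == 0)
			continue;
		samples[i]++;
		handled++;
		if (stop_at_first)
			break;
	}
	return (handled);
}
```

When counters 0 and 1 overflow together (mask 0x3), the stop-at-first variant credits only counter 0, which matches the symptom of the first-allocated event collecting all the samples. Scanning every core counter on each interrupt credits both.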