Page MenuHomeFreeBSD

hdac_intr_handler: keep working until global interrupt status clears
ClosedPublic

Authored by avg on Jun 4 2020, 7:00 AM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, Dec 25, 11:59 PM
Unknown Object (File)
Sat, Dec 14, 3:49 PM
Unknown Object (File)
Dec 8 2024, 11:21 PM
Unknown Object (File)
Dec 4 2024, 7:28 AM
Unknown Object (File)
Sep 13 2024, 11:10 PM
Unknown Object (File)
Sep 5 2024, 10:51 PM
Unknown Object (File)
Sep 5 2024, 9:44 PM
Unknown Object (File)
Aug 26 2024, 8:04 AM

Details

Summary

It is plausible that hardware interrupts a host only when GIS goes from zero
to one. GIS is formed by OR-ing multiple hardware statuses, so it's
possible that a previously cleared status gets set again while another
status has not been cleared yet. Thus, there will be no new interrupt as
GIS always stayed set. If we don't re-examine GIS then we can leave it set
and never get an interrupt again.

Without this change I frequently saw a problem where snd_hda would stop
working. Setting dev.hdac.1.polling=1 would bring it back to life and
afterwards I could set polling back to zero. Sometimes the problem started
right after a boot, sometimes it happened after resuming from S3, frequently
it would occur when sound output and input are active concurrently (such as
during conferencing).
I looked at HDAC_INTSTS while the sound was not working and I saw that both
HDAC_INTSTS_GIS and HDAC_INTSTS_CIS were set, but there were no interrupts.

I have collected some statistics over a period of several days about how
many loops (calls to hdac_one_intr) the new code did for a single interrupt:

LoopsTimes Happened
0301
112857746
2280
32
4+0

I believe that previously the sound would get stuck each time we had to loop
more than once.

The hardware in question:

hdac1: <AMD (0x15e3) HDA Controller> mem 0xfe680000-0xfe687fff at device 0.6 on pci4
hdacc1: <Realtek ALC269 HDA CODEC> at cad 0 on hdac1
Test Plan

See the summary.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Passed
Unit
No Test Coverage
Build Status
Buildable 31479
Build 29089: arc lint + arc unit

Event Timeline

avg requested review of this revision.Jun 4 2020, 7:00 AM

Make sure you test both MSI and legacy interrupts, because they may work very different. I don't have objections for the patch if it helps some broken hardware, but comparing to AHCI, the last generates another interrupt when interrupt status register is cleared while there is some more work left, avoiding another expensive register read. Second read should not be a huge deal for sound, but still it is not free.

sys/dev/sound/pci/hda/hdac.c
344

What happened to this write? CIS and SIS bits there are RW1C, so must be written after the interrupt is processed.

sys/dev/sound/pci/hda/hdac.c
344

According to the original Intel HDA specification they are RO.
On my hardware they also seem to behave like RO.
Also, it seems that Linux never writes them too.

sys/dev/sound/pci/hda/hdac.c
344

The original specification is from 2004, and there they are RW1C. Not sure what they were thinking.

sys/dev/sound/pci/hda/hdac.c
344

Oh, I didn't even know about that original original specification.
I guess now I know from where AMD copied their wording.

This revision was not accepted when it landed; it landed in state Needs Review.Jun 18 2020, 9:20 AM
This revision was automatically updated to reflect the committed changes.