As suggested by jhb@.
Details
I will give it a reboot and see if interrupts work :).
Diff Detail
- Repository
- rS FreeBSD src repository - subversion
- Lint
Lint Not Applicable - Unit
Tests Not Applicable
Event Timeline
First make sure it works. :) If it does, you might just want to do the save/restore unconditionally rather than only for these parts as it wouldn't surprise me if other devices might reset other PCI registers on reset? I'm happy to defer to more well-informed comments on how isolated this is to the specific model (BDXDE) however.
I've tested it now. It does appear to work.
If it does, you might just want to do the save/restore unconditionally rather than only for these parts as it wouldn't surprise me if other devices might reset other PCI registers on reset?
The word from Intel is that only these PCIIDs do it, and they intend to fix the behavior in future hardware revisions.
I'm happy to defer to more well-informed comments on how isolated this is to the specific model (BDXDE) however.
From Jim Harris,
BDX-DE CB-DMA is based on a different underlying DMA unit than the ones in the mainline Xeon processors. This DMA unit effectively does a PCI function level reset when you reset the DMA unit. This means you have to reset the IRQs after the reset operation. It is not ideal, but the Linux driver has a workaround routine that checks for this type of CB-DMA and does the reinitialization if necessary.
From Dave Jiang at Intel (Linux ioat maintainer),
Yep this sounds like what it's doing. The Linux driver re-init the MSIX interrupts after a chan reset to workaround the issue. You can key off the PCI dev id to only do that for those devices on BDX-DE. We are hoping that will be fixed next gen. This is not an issue for the CBDMA v3.2 devices that are on Xeon EP CPUs. So far all affected devices are CBDMA v3.3 based.
sys/dev/ioat/ioat.c | ||
---|---|---|
989–990 ↗ | (On Diff #8431) | Would this case need to be handled too? |
Seems good to me then, thanks!
Regarding @ngie's question, I'm not sure what the right thing to do is if the reset fails. In theory it doesn't hurt to do a restore as it should be a no-op if nothing happened, but it seems dubious to be messing with the hardware at all if it fails to reset.
Technically the BWD (stands for Briarwood - an Atom-based SOC) dev IDs should be treated the same as BDX (Broadwell) DE (aka Xeon D). I have no Briarwood hardware to test on though. But should not hurt to just add those device IDs anyways.