On the fully IO-coherent systems the bus_dmamap_sync must guarantee that all memory access will be finished after the sync function is called. On armv8, where the map is null, the instruction reordering was observed, but this is not the only architecture prone to that issue. This patch fixes the behavior.
Details
Diff Detail
- Repository
- rS FreeBSD src repository - subversion
- Lint
Lint Not Applicable - Unit
Tests Not Applicable
Event Timeline
sys/sys/bus_dma.h | ||
---|---|---|
295 ↗ | (On Diff #6749) | Is there a performance advantage to using a finer-grained version of mb() based on the value of 'op' ? |
This is simultaneously excessive and not enough on x86. Arguably, on x86 we do need to flush the store buffers, but we could need the flush on all CPUs and not only the current one. On the other hand, mb() does much more, in particular, it does full serialization for the current CPU, which is not needed.
I would prefer to have this change MD.
Sure. However, I'd like to use a negative logic here. I mean, to use mb as a default and disable it for a known list of architectures (afaik only Intel/amd64 don't need a barrier here).
Since I'm no x86 expert, do you think the '#if !defined(i386) && !defined(amd64)' is sufficient in this case?
For instance, I think that powerpc mb(), which is (hw)sync, is also not applicable there. I believe that it needs sync+eieio.
This is IMO still an argument to only do this on the architectures where you know what it does, exactly.
Made it MD.
Regarding mb(), I was expecting a discussion rather to use stronger one, not finer-grained. Here we need to guarantee that mb is valid for every domain (inner+outter shareable) , thus use a full-system one. I was wondering, however, if the dsb() is not more appropriate. Mostly because of scenario what if we're syncing a buffer containing executable code which we jumps to. In that case mb might not be sufficient. I did not find any such place in the kernel, so decided to use less impacting mb instead.
Whatever you decide about the location of SYNC_DEFAULT, I am fine with the change.
sys/sys/bus_dma.h | ||
---|---|---|
292 ↗ | (On Diff #6772) | Could the BUS_DMAMAP_SYNC_DEFAULT macro moved to the arch/include/bus_dma.h ? |
If we need to use a full system barrier on ARM we should use it explicitly. I'm planning on changing mb to not be a full system barrier.