Page MenuHomeFreeBSD

Implement workaround for Armada 38X family HW issue between CPU and devices
ClosedPublic

Authored by zbb on Mar 31 2017, 2:25 PM.

Details

Summary

There is a hardware problem between Cortex-A9 CPUs and on-chip devices
in Armada 38X SoCs that may cause hang on heavy load. This can be
however worked around by mapping all registers and PCI IO
as strongly ordered instead of device memory.

Submitted by: Zbigniew Bodek <zbb@semihalf.com>
Obtained from: Semihalf
Sponsored by: Stormshield

Diff Detail

Repository
rS FreeBSD src repository
Lint
Automatic diff as part of commit; lint not applicable.
Unit
Automatic diff as part of commit; unit tests not applicable.

Event Timeline

andrew requested changes to this revision.Mar 31 2017, 3:27 PM

This breaks a GENERIC kernel, it should be runtime detected.

This revision now requires changes to proceed.Mar 31 2017, 3:27 PM

Does GENERIC kernel support Armada 38x SoC? If not, is this really an issue here?

loos added a subscriber: loos.Mar 31 2017, 6:26 PM
mmel edited edge metadata.Apr 1 2017, 7:29 AM

Please tell me, do you really think that hellish "#ifdef SOC_xxx" style used in Marvell subdirectory is the right one?
For me, using "#ifdef SOC_xxx" is unacceptable in common code, and is deprecated in vendor subdirs.

And, in this case, right solution for this problem is pretty easy, something like:
https://github.com/strejda/tegra/commit/3b5138751ee5643992b20fcb21b280fab433bb20
(untested, you can simply put "pmap_remap_vm_attr(VM_MEMATTR_DEVICE, VM_MEMATTR_SO);" to platform_devmap_init() or so...)

Ok, agree, the ifdef is not nice. Thank you for the hint, I'll try it.

skra added a subscriber: skra.Apr 1 2017, 5:34 PM
In D10218#211540, @meloun-miracle-cz wrote:

Please tell me, do you really think that hellish "#ifdef SOC_xxx" style used in Marvell subdirectory is the right one?
For me, using "#ifdef SOC_xxx" is unacceptable in common code, and is deprecated in vendor subdirs.
And, in this case, right solution for this problem is pretty easy, something like:
https://github.com/strejda/tegra/commit/3b5138751ee5643992b20fcb21b280fab433bb20
(untested, you can simply put "pmap_remap_vm_attr(VM_MEMATTR_DEVICE, VM_MEMATTR_SO);" to platform_devmap_init() or so...)

IMO, such function will need quite good comment about when it's safe to use it and why it's safe only at that time.

andrew added a comment.Apr 1 2017, 6:25 PM
In D10218#211540, @meloun-miracle-cz wrote:

Please tell me, do you really think that hellish "#ifdef SOC_xxx" style used in Marvell subdirectory is the right one?
For me, using "#ifdef SOC_xxx" is unacceptable in common code, and is deprecated in vendor subdirs.

The SOC_XXX checks are correct when it's to enable code that will only be run on a single SoC, e.g. the platform code. Unfortunately the Marvell code uses it to change definitions and I would like to move away from that style.

And, in this case, right solution for this problem is pretty easy, something like:
https://github.com/strejda/tegra/commit/3b5138751ee5643992b20fcb21b280fab433bb20
(untested, you can simply put "pmap_remap_vm_attr(VM_MEMATTR_DEVICE, VM_MEMATTR_SO);" to platform_devmap_init() or so...)

Is the issue that PCIe expects a stronger memory model than ARM provides? It might be enough to add pmap_change_attr calls to the PCIe driver to remap the device memory with a stronger memory ordering leaving the other device memory with the weaker memory model.

mmel added a comment.Apr 2 2017, 6:47 AM
In D10218#211540, @meloun-miracle-cz wrote:

Please tell me, do you really think that hellish "#ifdef SOC_xxx" style used in Marvell subdirectory is the right one?
For me, using "#ifdef SOC_xxx" is unacceptable in common code, and is deprecated in vendor subdirs.

The SOC_XXX checks are correct when it's to enable code that will only be run on a single SoC, e.g. the platform code. Unfortunately the Marvell code uses it to change definitions and I would like to move away from that style.

Yep, I agree. Thanks for clarification.

And, in this case, right solution for this problem is pretty easy, something like:
https://github.com/strejda/tegra/commit/3b5138751ee5643992b20fcb21b280fab433bb20
(untested, you can simply put "pmap_remap_vm_attr(VM_MEMATTR_DEVICE, VM_MEMATTR_SO);" to platform_devmap_init() or so...)

Is the issue that PCIe expects a stronger memory model than ARM provides? It might be enough to add pmap_change_attr calls to the PCIe driver to remap the device memory with a stronger memory ordering leaving the other device memory with the weaker memory model.

Do you have pointer to any documentation about this?
I only very vaguely remember that I read some "hints type" paper, targeting aarch64, mentioning this problem. But I can't find nothing in the official documentation and the root of the problem is not clear to me.

andrew added a comment.Apr 2 2017, 5:46 PM
In D10218#211635, @meloun-miracle-cz wrote:
In D10218#211540, @meloun-miracle-cz wrote:

And, in this case, right solution for this problem is pretty easy, something like:
https://github.com/strejda/tegra/commit/3b5138751ee5643992b20fcb21b280fab433bb20
(untested, you can simply put "pmap_remap_vm_attr(VM_MEMATTR_DEVICE, VM_MEMATTR_SO);" to platform_devmap_init() or so...)

Is the issue that PCIe expects a stronger memory model than ARM provides? It might be enough to add pmap_change_attr calls to the PCIe driver to remap the device memory with a stronger memory ordering leaving the other device memory with the weaker memory model.

Do you have pointer to any documentation about this?
I only very vaguely remember that I read some "hints type" paper, targeting aarch64, mentioning this problem. But I can't find nothing in the official documentation and the root of the problem is not clear to me.

In the ARMv8 ARM there is the recommendation to use the no early write acknowledgement (nE) attribute with PCIe memory. ARM also documents Device-nGnRnE as "Equivalent to the Strongly-ordered memory type in earlier versions of the architecture". This is the only memory type with the nE property, device memory ordering cannot be any stronger on ARM.

zbb added a subscriber: zbb.Apr 5 2017, 10:54 AM

@meloun-miracle-cz your solution presented here https://github.com/strejda/tegra/commit/3b5138751ee5643992b20fcb21b280fab433bb20 looks very good.
It would be nice if you could commit this to the HEAD.

Hi @meloun-miracle-cz

Would it be possible that you commit change pointed by zbb?
https://github.com/strejda/tegra/commit/3b5138751ee5643992b20fcb21b280fab433bb20

mmel added a comment.May 9 2017, 11:07 AM

Committed as r318021. Sorry for huge delay...

zbb edited edge metadata.
zbb edited the summary of this revision. (Show Details)
zbb commandeered this revision.
zbb updated this revision to Diff 28246.

Reworked to use a platform call instead of a hard-coded value.

zbb added a comment.May 11 2017, 4:00 PM
In D10218#220936, @meloun-miracle-cz wrote:

Committed as r318021. Sorry for huge delay...

Thanks. Please take a look at the code now.

andrew added inline comments.May 11 2017, 5:47 PM
sys/arm/mv/mv_machdep.c
256 ↗(On Diff #28246)

Do you need to run this on all CPUs? All it does is update tex_class, then call pmap_set_tex, the latter is already run on secondary cores in init_secondary.

skra added a comment.May 11 2017, 10:03 PM

Note that pmap_remap_vm_attr() calls pmap_set_tex() which assumes that all caches are disabled (see last two lines in this function). Caches are enabled in reinit_mmu() which is called from pmap_bootstrap_prepare() and init_secondary(). It's unfortunate that Michal did not add some comment about this to pmap_remap_vm_attr(). However, it's just a note for now as use of platform_cpu_init() is not clear so far.

zbb added a comment.May 12 2017, 10:20 AM
In D10218#221537, @skra wrote:

Note that pmap_remap_vm_attr() calls pmap_set_tex() which assumes that all caches are disabled (see last two lines in this function). Caches are enabled in reinit_mmu() which is called from pmap_bootstrap_prepare() and init_secondary(). It's unfortunate that Michal did not add some comment about this to pmap_remap_vm_attr(). However, it's just a note for now as use of platform_cpu_init() is not clear so far.

Yes. This must be called where it is being called now. Unfortunately platform_ code seems to be unusable for our case.

BTW. What is unclear for you regarding the use of platform_cpu_init()? There was a need to modify the default, hard-coded settings of the mappings as well as CPUs auxiliary control register. Use of another platform callback was suggested instead of ugly ifdef or another hard-coded setting of ACTLR (that was undesirable on some Cortex revisions). That is why we created this function and we use it in the related patches (please see dependencies) to modify default CPU/MMU settings.

sys/arm/mv/mv_machdep.c
256 ↗(On Diff #28246)

No this could be run on the boot cpu only. Will change it in the next round.

In D10218#221537, @skra wrote:

Note that pmap_remap_vm_attr() calls pmap_set_tex() which assumes that all caches are disabled (see last two lines in this function).

pmap_remap_vm_attr isn't very useful if we can only call it with the cache disabled. I would expect we don't need to handle the cache for remapping device memory so this should be safe, however am not as familiar with the 32 bit ARM code as arm64.

skra added a comment.May 12 2017, 12:46 PM
In D10218#221615, @zbb wrote:
In D10218#221537, @skra wrote:

Note that pmap_remap_vm_attr() calls pmap_set_tex() which assumes that all caches are disabled (see last two lines in this function). Caches are enabled in reinit_mmu() which is called from pmap_bootstrap_prepare() and init_secondary(). It's unfortunate that Michal did not add some comment about this to pmap_remap_vm_attr(). However, it's just a note for now as use of platform_cpu_init() is not clear so far.

Yes. This must be called where it is being called now. Unfortunately platform_ code seems to be unusable for our case.
BTW. What is unclear for you regarding the use of platform_cpu_init()? There was a need to modify the default, hard-coded settings of the mappings as well as CPUs auxiliary control register. Use of another platform callback was suggested instead of ugly ifdef or another hard-coded setting of ACTLR (that was undesirable on some Cortex revisions). That is why we created this function and we use it in the related patches (please see dependencies) to modify default CPU/MMU settings.

Just wanted to say that pmap_remap_vm_attr() must be called before pmap_bootstrap_prepare(). And even if platform cpu_init() method is introduced in another place (D10682) where platform_cpu_init() is called before pmap_bootstrap_prepare(), it's used before platform code is initialized and it's wrong. So, it's unclear for now if this patch will satisfy what was noted by me.

zbb added a comment.May 12 2017, 1:20 PM
In D10218#221634, @skra wrote:
In D10218#221615, @zbb wrote:
In D10218#221537, @skra wrote:

Note that pmap_remap_vm_attr() calls pmap_set_tex() which assumes that all caches are disabled (see last two lines in this function). Caches are enabled in reinit_mmu() which is called from pmap_bootstrap_prepare() and init_secondary(). It's unfortunate that Michal did not add some comment about this to pmap_remap_vm_attr(). However, it's just a note for now as use of platform_cpu_init() is not clear so far.

Yes. This must be called where it is being called now. Unfortunately platform_ code seems to be unusable for our case.
BTW. What is unclear for you regarding the use of platform_cpu_init()? There was a need to modify the default, hard-coded settings of the mappings as well as CPUs auxiliary control register. Use of another platform callback was suggested instead of ugly ifdef or another hard-coded setting of ACTLR (that was undesirable on some Cortex revisions). That is why we created this function and we use it in the related patches (please see dependencies) to modify default CPU/MMU settings.

Just wanted to say that pmap_remap_vm_attr() must be called before pmap_bootstrap_prepare(). And even if platform cpu_init() method is introduced in another place (D10682) where platform_cpu_init() is called before pmap_bootstrap_prepare(), it's used before platform code is initialized and it's wrong. So, it's unclear for now if this patch will satisfy what was noted by me.

OK. Now I understand what you meant. Yes, I admit this is a mistake to use platform code so early. It will not work for the GENERIC kernel.

zbb added a comment.May 12 2017, 6:55 PM

@skra Why are we not able to pmap_set_tex() while caches are enabled? We are merely modifying two registers. TLB invalidation should be the only thing that is required to be done. Could you elaborate this constraint a little? It seems that this is the only thing that is holding us back from the acceptable solution so I would like to understand the issue.
Thanks in advance for your help.

skra added a comment.May 13 2017, 10:28 AM
In D10218#221730, @zbb wrote:

@skra Why are we not able to pmap_set_tex() while caches are enabled? We are merely modifying two registers. TLB invalidation should be the only thing that is required to be done. Could you elaborate this constraint a little? It seems that this is the only thing that is holding us back from the acceptable solution so I would like to understand the issue.
Thanks in advance for your help.

My note was about two aspects of pmap_remap_vm_attr().

(1) Generally, when cache is enabled, changing cache mode of memory in active use (i.e. any memory which is mapped (keep in mind speculative reads)) is a tricky thing if possible at all. It's a problem of atomicity of such action. It's also a thing of how cache itself and cache maintainance functions work. Imagine that a page is remapped from cached mode to uncached one. All cachelines associated with this page must be removed from cache. To do this while the page is mapped as cached is not enough because of speculative reads. To do this after the page is remapped will not work as the mapping is uncached and cache maintainance function may do nothing for uncached memory. Note that access to uncached memory may still go thru cache.

Some cache modes may be change to others without problem, but it's not true for every combination of cache modes. Maybe, under very strict circumstances, it could be possible to change cache mode in some memory region.

However, pmap_remap_vm_attr() works globally for every TEX class combination. It touches all memory mapped with TEXs being changed. So, my concern was to comment such function properly to specify when it's safe to use it.

(2) Code consistency. Either some explanation should have been done why pmap_set_tex() can be used in pmap_remap_vm_attr() or comment in pmap_set_tex() should have been changed.

Well, it looks that r318251 answered my note.

zbb added a comment.May 15 2017, 11:38 AM
In D10218#221808, @skra wrote:
In D10218#221730, @zbb wrote:

@skra Why are we not able to pmap_set_tex() while caches are enabled? We are merely modifying two registers. TLB invalidation should be the only thing that is required to be done. Could you elaborate this constraint a little? It seems that this is the only thing that is holding us back from the acceptable solution so I would like to understand the issue.
Thanks in advance for your help.

My note was about two aspects of pmap_remap_vm_attr().
(1) Generally, when cache is enabled, changing cache mode of memory in active use (i.e. any memory which is mapped (keep in mind speculative reads)) is a tricky thing if possible at all. It's a problem of atomicity of such action. It's also a thing of how cache itself and cache maintainance functions work. Imagine that a page is remapped from cached mode to uncached one. All cachelines associated with this page must be removed from cache. To do this while the page is mapped as cached is not enough because of speculative reads. To do this after the page is remapped will not work as the mapping is uncached and cache maintainance function may do nothing for uncached memory. Note that access to uncached memory may still go thru cache.
Some cache modes may be change to others without problem, but it's not true for every combination of cache modes. Maybe, under very strict circumstances, it could be possible to change cache mode in some memory region.
However, pmap_remap_vm_attr() works globally for every TEX class combination. It touches all memory mapped with TEXs being changed. So, my concern was to comment such function properly to specify when it's safe to use it.
(2) Code consistency. Either some explanation should have been done why pmap_set_tex() can be used in pmap_remap_vm_attr() or comment in pmap_set_tex() should have been changed.
Well, it looks that r318251 answered my note.

Thanks, this is very informative. So we can proceed with our platform change (next round).

zbb updated this revision to Diff 29715.Jun 16 2017, 1:49 PM

Use pmap_remap_vm_attr() function.

mw_semihalf.com accepted this revision.Jun 16 2017, 4:11 PM

Tested on Armada 388 Clearfog

mmel accepted this revision.Jun 17 2017, 4:11 AM
This revision was automatically updated to reflect the committed changes.