- User Since
- Jan 7 2019, 7:21 PM (31 w, 5 d)
Mon, Aug 12
Thu, Aug 8
@jhibbits, this last change addresses your last comment (moving VSX code to _vsx.S files) and also adds ifunc support.
[PPC64] Optimize bcopy/memcpy/memmove
Mon, Aug 5
Commandeering to address issues and use ifunc to decide whether VSX should be used.
- Use ifunc to choose best implementation based on running system
Taking over to add ifunc support, which will make it possible to have the optimized strcpy version and avoid breaking POWER5 and earlier.
Fri, Aug 2
- [PPC64] strncpy - fix rtld crash
Thu, Aug 1
Wed, Jul 31
- [PPC64] strncpy - fix rtld-libc build issue
This last change fixes the previous "dst not always zeroed" issue and adds ifunc support to strncpy, so that the optimized version is selected on ISAs >= 2.05 while others can use the fallback implementation in C.
- [PPC64] strncpy - fix 'dst' not zeroed issue
- Initial ifunc capable strncpy implementation
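The selection described above (optimized routine on ISA >= 2.05, C fallback otherwise) can be illustrated with a portable sketch. This is not the ifunc mechanism itself, where the runtime linker calls a resolver; it is a plain function pointer resolved on first call, which follows the same "decide once, based on the running CPU" idea. All names here (`cpu_supports_isa_2_05`, `strncpy_dispatch`, and so on) are hypothetical, the capability probe is a stub, and the "optimized" routine is only a placeholder for the assembly version.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef char *(*strncpy_fn)(char *, const char *, size_t);

/* Hypothetical stub: a real probe would query the CPU feature bits. */
static int
cpu_supports_isa_2_05(void)
{
	return (0);	/* assumption: pretend we run on an older CPU */
}

/* Generic C fallback, safe on every CPU. */
static char *
strncpy_fallback(char *dst, const char *src, size_t n)
{
	return (strncpy(dst, src, n));
}

/* Placeholder standing in for the optimized assembly routine. */
static char *
strncpy_optimized(char *dst, const char *src, size_t n)
{
	return (strncpy(dst, src, n));
}

/* Resolver: picks one implementation based on the running system. */
static strncpy_fn
strncpy_resolve(void)
{
	return (cpu_supports_isa_2_05() ? strncpy_optimized : strncpy_fallback);
}

/* Entry point: resolves on first use, then always calls the same routine. */
char *
strncpy_dispatch(char *dst, const char *src, size_t n)
{
	static strncpy_fn impl;

	if (impl == NULL)
		impl = strncpy_resolve();
	assert(impl != NULL);
	return (impl(dst, src, n));
}
```

Unlike a real ifunc, this sketch pays for a pointer check on every call and its lazy initialization is not thread-safe; with ifunc the resolver runs once, when the symbol is bound.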
Tue, Jul 30
- Add missing space
Now the merge result should match the behavior seen in upstream.
- Match upstream behavior
Mon, Jul 29
I guess it is ok now.
- Fix merge issues
Thu, Jul 25
Wed, Jul 24
- Addressed jhibbits' comments
Tue, Jul 23
Fri, Jul 19
- Fix cross build detection
- Refactored LIB32LD detection code
- Avoid using the short form of -fuse-ld; this avoids issues with native builds and with picking up a newer ld in PATH
- Using COMPILER_TYPE instead of LINKER_TYPE in stand/defs.mk, as an installed /usr/local/bin/ld pointing to ld.bfd can confuse LINKER_TYPE detection in this case.
- Aborting if LIB32LD is not defined when building stand. This will typically happen when trying to build stand by itself, without performing a 'make buildenv' first.
Jul 17 2019
I posted a previous diff against an already patched file, which temporarily messed up the patch/history.
- [PPC64] Implement CAS
- Skip CAS if PSL_HV bit is set on MSR
- Fix inverted tab/newline in asm
Jul 15 2019
I consider the strncpy implementation complete now.
Adding ifunc to select the optimized implementation, however, is still not possible, as rtld doesn't support it yet.
- Handled misalignment issues
- Improved implementation
- Now using memset() for faster zeroing of dst buffer
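As a rough C illustration of the memset() point above: strncpy must pad the remainder of dst with zero bytes, and that padding can be handed to memset() in a single call instead of a byte-by-byte loop. This is a minimal sketch under that assumption, not the code from the review; the function name `strncpy_memset_sketch` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: copy at most n bytes, then zero-fill with memset(). */
char *
strncpy_memset_sketch(char *dst, const char *src, size_t n)
{
	size_t len;

	assert(dst != NULL && src != NULL);

	/* Length of src, capped at n so no more than n bytes are read. */
	for (len = 0; len < n && src[len] != '\0'; len++)
		;

	/* Copy the string bytes (without the terminator). */
	memcpy(dst, src, len);

	/* Zero whatever is left of dst in one call; a no-op when len == n. */
	memset(dst + len, 0, n - len);

	return (dst);
}
```

When src is at least n bytes long, nothing is zeroed and dst is left unterminated, which matches standard strncpy semantics.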
Jul 11 2019
Jul 10 2019
But I think MOEA64_STATS should become a kernel config option, to make it easier to enable the statistics and to have this option documented.
Jul 8 2019
Changed register convention.
Still need to investigate and fix the misaligned access issue.
- Use %r* instead of raw numbers for registers
- Addressed jhibbits' comments
Jul 4 2019
Jul 3 2019
- Addressed jhibbits' comments
Jul 1 2019
Jun 28 2019
Jun 27 2019
Jun 19 2019
Jun 17 2019
Jun 14 2019
Jun 11 2019
Looks good, thanks for fixing this.
I've tested this on QEMU, both with and without huge pages and it worked fine.
Jun 7 2019
Jun 5 2019
May 24 2019
This is pretty old and it seems the issue that this change was supposed to fix was already fixed in a proper way quite some time ago.
The change itself LGTM.
May 22 2019
May 21 2019
LGTM. Tested on ppcdevref.
May 20 2019
@git_bdragon.rtk0.net, I haven't tested your last revision, but Alfredo and I have been using your previous revision for more than a month, in both ELFv1 and ELFv2 environments, with no issues so far. It is essential for the ELFv2 transition on PPC.