Page MenuHomeFreeBSD

PR231965: make localedef output locale data in target endian order
ClosedPublic

Authored by yuripv on Oct 18 2018, 12:13 AM.
Tags
None
Referenced Files
Unknown Object (File)
Wed, May 8, 4:15 PM
Unknown Object (File)
Wed, May 8, 1:44 AM
Unknown Object (File)
Tue, Apr 30, 6:51 PM
Unknown Object (File)
Tue, Apr 30, 1:22 AM
Unknown Object (File)
Apr 14 2024, 10:58 PM
Unknown Object (File)
Apr 14 2024, 10:58 PM
Unknown Object (File)
Apr 14 2024, 10:58 PM
Unknown Object (File)
Apr 12 2024, 9:58 PM

Details

Summary

PR231695 describes the issue with locale data built on LE system (amd64) when used on BE system (powerpc).

This essentially reverts rS308170 (but it made the task of finding the fields to be converted much easier, thanks!) and:

  • add -b/-l flags to localedef to specify big/little endian order for output (looked up in cap_mkdb and modeled after it)
  • make share/ctypedef and share/colldef use the new flag
  • add localedef to targets/pseudo/userland/Makefile.depend so that we use the binary we built during boostrap, and not the one from the build system itself
Test Plan

Locale data for en_US.UTF-8 and libc from cross-compile on amd64 system was tested on ref12-ppc64 to run tmux (it does work now).

Locale data built natively on powerpc64 matches the amd64 cross-built one.

Diff Detail

Repository
rS FreeBSD src repository - subversion
Lint
Lint Not Applicable
Unit
Tests Not Applicable

Event Timeline

yuripv retitled this revision from PR231965: embed endian info in locale data files magic to PR231965: make localedef output locale data in target endian order.
yuripv edited the summary of this revision. (Show Details)
yuripv edited the test plan for this revision. (Show Details)
yuripv added a reviewer: bapt.
yuripv edited subscribers, added: kib, br; removed: Contributor Reviews (src).

complete rework

I'll apply this to the fbsd cluster build.

The build host is amd64 and builds multiple targets, aarch64, i386, powerpc64 and amd64. What information would you like from the builds?

I'll apply this to the fbsd cluster build.

The build host is amd64 and builds multiple targets, aarch64, i386, powerpc64 and amd64. What information would you like from the builds?

I'd really like to compare results from native EB (e.g. powerpc64) build for /usr/share/locale with what amd64 cross-build produces.

I'll apply this to the fbsd cluster build.

The build host is amd64 and builds multiple targets, aarch64, i386, powerpc64 and amd64. What information would you like from the builds?

I'd really like to compare results from native EB (e.g. powerpc64) build for /usr/share/locale with what amd64 cross-build produces.

Ok. I'll update ref12-ppc64 to this review (product of cross build) and then you can natively do a buildworld there to see what it looks like?

I'll apply this to the fbsd cluster build.

The build host is amd64 and builds multiple targets, aarch64, i386, powerpc64 and amd64. What information would you like from the builds?

I'd really like to compare results from native EB (e.g. powerpc64) build for /usr/share/locale with what amd64 cross-build produces.

Ok. I'll update ref12-ppc64 to this review (product of cross build) and then you can natively do a buildworld there to see what it looks like?

Thank you!

I'll apply this to the fbsd cluster build.

The build host is amd64 and builds multiple targets, aarch64, i386, powerpc64 and amd64. What information would you like from the builds?

I'd really like to compare results from native EB (e.g. powerpc64) build for /usr/share/locale with what amd64 cross-build produces.

Ok. I'll update ref12-ppc64 to this review (product of cross build) and then you can natively do a buildworld there to see what it looks like?

Thank you!

Ok, I've only updated the ref12-ppc64 jail and I zfs snapshotted it as well in case we want to to rollback. Take a look around and see if it seems "fine"

I'll apply this to the fbsd cluster build.

The build host is amd64 and builds multiple targets, aarch64, i386, powerpc64 and amd64. What information would you like from the builds?

I'd really like to compare results from native EB (e.g. powerpc64) build for /usr/share/locale with what amd64 cross-build produces.

Ok. I'll update ref12-ppc64 to this review (product of cross build) and then you can natively do a buildworld there to see what it looks like?

Thank you!

Ok, I've only updated the ref12-ppc64 jail and I zfs snapshotted it as well in case we want to to rollback. Take a look around and see if it seems "fine"

Yes, it seems to be fine (tmux works now, as well as basic collation test program).

I'll apply this to the fbsd cluster build.

The build host is amd64 and builds multiple targets, aarch64, i386, powerpc64 and amd64. What information would you like from the builds?

I'd really like to compare results from native EB (e.g. powerpc64) build for /usr/share/locale with what amd64 cross-build produces.

Ok. I'll update ref12-ppc64 to this review (product of cross build) and then you can natively do a buildworld there to see what it looks like?

Thank you!

Ok, I've only updated the ref12-ppc64 jail and I zfs snapshotted it as well in case we want to to rollback. Take a look around and see if it seems "fine"

Yes, it seems to be fine (tmux works now, as well as basic collation test program).

Natively built locale data shows some differences with cross-built one -- some of the collation files do differ, so this needs more work.

yuripv edited the test plan for this revision. (Show Details)

issue fixed (wchar_t, that is (unsigned) int, is *4* bytes).

Now the amd64 cross-built locale data matches the one built natively on powerpc64.

0mp added a subscriber: 0mp.

OK from manpages.

This revision is now accepted and ready to land.Oct 19 2018, 8:40 AM

issue fixed (wchar_t, that is (unsigned) int, is *4* bytes).

Now the amd64 cross-built locale data matches the one built natively on powerpc64.

Updating ref12-ppc64 for verification.

issue fixed (wchar_t, that is (unsigned) int, is *4* bytes).

Now the amd64 cross-built locale data matches the one built natively on powerpc64.

Updating ref12-ppc64 for verification.

Allrighty, updated ref12-ppc64. Take a look around and see if the cross builded env I've installed looks good. tmux works, so it looks good to me.

FWIW I have a small preference for maintaining the approach taken in rS308170 but extending/fixing it for all archs - have the on-disk data in a consistent endianness. See the progression of pwd_mkdb ending with commits rS332902, rS333133 (D15144).

FWIW I have a small preference for maintaining the approach taken in rS308170 but extending/fixing it for all archs - have the on-disk data in a consistent endianness. See the progression of pwd_mkdb ending with commits rS332902, rS333133 (D15144).

Yes, that was one of the options I considered, though there are some differences with password database here which IMO make it less useful for locale data:

  1. unlike the password databases that can be copied to different endian system if needed, locale data can be considered static, built one time and the need for this specific data files to be used on other system is highly unlikely
  2. performance concerns raised on hackers@ -- I did NOT do any performance testing here (mostly as I switched to current implementation because of point 1), but locale data is big and is read A LOT.

For the moment, I wanted to fix current process without any real changes to libc (revert does not count :) ). If needed, I can skip documenting the -b/-l flags so we could remove them later if we decide to use consistent-endian locale data. In any case, localedef changes done here will be required for that option as well.

In any case, I'm open to hearing out opinions and change requests.

@sbruno asked on IRC for me to clarify, so - I don't object to this change, it is definitely a fix and I'd be happy to see this fixed vs status quo. Also, we have strong arguments for being able to share pwd files across machines (potentially of different endianness) that might not apply to locale data.

But as a general rule I'd prefer we use a common on-disk endianness for binary data files. @br perhaps you can weigh in on the design choice?

Our comments crossed paths.

For the moment, I wanted to fix current process without any real changes to libc (revert does not count :) ).

Ok, your argument is sufficiently convincing to me.

If needed, I can skip documenting the -b/-l flags so we could remove them later if we decide to use consistent-endian locale data.

No, go ahead and document them. If we do migrate to consistent-endian data we can follow the pwd_mkdb path to remove them.

This revision was automatically updated to reflect the committed changes.