sort: test against all month formats in month-sort

Description

The CLDR specification [1] defines three possible month formats:

  • Abbreviation (e.g., Jan, Ιαν)
  • Full (e.g., January, Ιανουαρίου)
  • Standalone (e.g., January, Ιανουάριος)

Many languages use different case endings depending on whether the month
is referenced as a standalone word (nominative case) or in a date context
(genitive, partitive, etc.). sort(1)'s -M option currently sorts months
by testing input against only the abbreviation format, which is
essentially a substring of the full format. This works for languages
like English, which have no grammatical cases, but it is not sufficient
for languages where the case ending differs between the
abbreviation/full and standalone formats.

For example, in Greek, "May" can take the following forms:

Abbreviation: Μαΐ (genitive case)
Full: Μαΐου (genitive case)
Standalone: Μάιος (nominative case)

If we use the standalone format in Greek, sort(1) will not be able to
match "Μαΐ" to "Μάιος", and the sort will fail.
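
The mismatch can be reproduced in a few lines of Python. This is only an
illustration, not the sort(1) code: it assumes the abbreviation-only check
behaves like a prefix comparison, and the function name is hypothetical.

```python
# Illustrative only: models the abbreviation-only match as a prefix test.
# The three strings are the Greek forms of "May" quoted above.
ABBREV_MAY = "Μαΐ"         # abbreviation (genitive)
FULL_MAY = "Μαΐου"         # full format (genitive)
STANDALONE_MAY = "Μάιος"   # standalone format (nominative)


def matches_abbreviation(text: str) -> bool:
    """Return True if text begins with the abbreviated month name."""
    return text.startswith(ABBREV_MAY)


print(matches_abbreviation(FULL_MAY))        # True
print(matches_abbreviation(STANDALONE_MAY))  # False
```

The full form shares its first three characters with the abbreviation, so
it matches. The standalone form moves the accent to the second letter and
no longer shares that prefix.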

This change makes sort(1) test against all three formats. It also works
when the input contains mixed formats.
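
As a rough sketch of the idea (not the actual sort(1) implementation), the
Python below scores a line by testing it against all three forms of each
month. The MONTHS table and month_score() are hypothetical names; the table
holds only the two Greek months quoted in this description, and giving
unmatched lines a score of 0 is an assumption of the sketch. The real code
takes the names from the locale and ignores case, which this sketch does not.

```python
# Hypothetical table: (abbreviation, full, standalone) per month number,
# limited to the Greek forms quoted in this description.
MONTHS = {
    1: ("Ιαν", "Ιανουαρίου", "Ιανουάριος"),
    5: ("Μαΐ", "Μαΐου", "Μάιος"),
}


def month_score(text: str) -> int:
    """Return the month number if text starts with any known form, else 0."""
    candidate = text.strip()
    for number, forms in MONTHS.items():
        if any(candidate.startswith(form) for form in forms):
            return number
    return 0


# Mixed formats in one input: standalone, full and abbreviated forms.
lines = ["Μάιος", "Ιανουαρίου", "Μαΐ", "Ιανουάριος", "unknown"]
ordered = sorted(lines, key=month_score)
print(ordered)
```

Because every form of a month maps to the same score, lines that mix
formats still group by month, and Python's stable sort keeps their input
order within each month.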

[1] https://cldr.unicode.org/translation/date-time/date-time-patterns

Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42847

(cherry picked from commit 3d44dce90a6946e2ef2ab30ffbf8e2930acf888b)

Details

Provenance
christos authored on Dec 1 2023, 12:30 AM
Reviewer
markj
Differential Revision
D42847: sort: test against all month formats in month-sort
Parents
rG9e6e28bb8ea8: Add IBM TS1170 density codes and specs.