HomeFreeBSD

Use UnicodeData.txt to create UTF-8 ctype map.

Description

Use UnicodeData.txt to create UTF-8 ctype map.

This should provide more complete coverage of currently defined Unicode
characters as compared to manually assembled one we use currently.

Comparison of original and new UTF-8 ctype maps by character class:

TYPE ORIG NEW
alnum 94229 126029
alpha 93557 125419
blank 4 2
cntrl 73 137685
digit 469 622
graph 109615 137203
lower 1478 2145
print 109641 137222
punct 3428 797
rune 110481 274907
space 33 24
upper 983 1781
xdigit 469 622

Large number of added cntrl definitions is due to the fact that private-use
planes are currently defined as such, this can change in the future.

Discussed with: bapt
Approved by: kib (mentor, implicit)
MFC after: 1 month
Differential revision: https://reviews.freebsd.org/D17842

Details

Provenance
yuripvAuthored on
Differential Revision
D17842: use ctype data from UnicodeData.txt
Parents
rS340490: Fix stray tab.
Branches
Unknown
Tags
Unknown