MFC r340491, r340492:

Description

Use UnicodeData.txt to create UTF-8 ctype map.

This should provide more complete coverage of the defined Unicode
characters than the manually assembled map we currently use.
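As a rough illustration of deriving character classes from UnicodeData.txt, the sketch below parses a few records in that file's semicolon-separated format and counts code points per class. It is not the tool used in this commit: the `CATEGORY_TO_CLASS` mapping and the `count_classes` helper are hypothetical, and the sample records are a tiny hand-picked subset of the real file.

```python
from collections import Counter

# Hypothetical mapping from Unicode general category to a ctype class.
# The mapping actually used by the commit may differ.
CATEGORY_TO_CLASS = {
    "Lu": "upper",
    "Ll": "lower",
    "Nd": "digit",
    "Zs": "space",
    "Cc": "cntrl",
    "Co": "cntrl",  # private use, counted as cntrl per the commit message
}


def count_classes(lines):
    """Count code points per class from UnicodeData.txt-style records."""
    counts = Counter()
    range_start = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(";")
        code = int(fields[0], 16)
        name, category = fields[1], fields[2]
        # Large blocks are stored as a "<..., First>" / "<..., Last>" pair.
        if name.endswith(", First>"):
            range_start = code
            continue
        if name.endswith(", Last>") and range_start is not None:
            size = code - range_start + 1
            range_start = None
        else:
            size = 1
        cls = CATEGORY_TO_CLASS.get(category)
        if cls is not None:
            counts[cls] += size
    return counts


# A few records in UnicodeData.txt format (subset chosen for illustration).
SAMPLE = [
    "0009;<control>;Cc;0;S;;;;;N;CHARACTER TABULATION;;;;",
    "0020;SPACE;Zs;0;WS;;;;;N;;;;;",
    "0030;DIGIT ZERO;Nd;0;EN;;0;0;0;N;;;;;",
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041",
    "F0000;<Plane 15 Private Use, First>;Co;0;L;;;;;N;;;;;",
    "FFFFD;<Plane 15 Private Use, Last>;Co;0;L;;;;;N;;;;;",
]

counts = count_classes(SAMPLE)
for cls in sorted(counts):
    print(cls, counts[cls])
```

Expanding the First/Last pairs matters here: a single pair of records can stand for tens of thousands of code points.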

Comparison of original and new UTF-8 ctype maps by character class:

TYPE        ORIG      NEW
alnum      94229   126029
alpha      93557   125419
blank          4        2
cntrl         73   137685
digit        469      622
graph     109615   137203
lower       1478     2145
print     109641   137222
punct       3428      797
rune      110481   274907
space         33       24
upper        983     1781
xdigit       469      622

The large number of added cntrl definitions is due to the private-use
planes currently being defined as cntrl; this can change in the future.
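A quick size check of the private-use ranges shows how they could account for most of the cntrl growth. The ranges below are the ones defined by Unicode; whether the BMP Private Use Area (U+E000..U+F8FF) is also counted as cntrl in the new map is an assumption here, as the commit message names only the planes.

```python
# Private-use ranges defined by Unicode (inclusive bounds).
BMP_PRIVATE_USE = (0xE000, 0xF8FF)      # Private Use Area in the BMP
PLANE_15 = (0xF0000, 0xFFFFD)           # Supplementary Private Use Area-A
PLANE_16 = (0x100000, 0x10FFFD)         # Supplementary Private Use Area-B


def size(bounds):
    """Number of code points in an inclusive range."""
    first, last = bounds
    return last - first + 1


planes_only = size(PLANE_15) + size(PLANE_16)
with_bmp_area = planes_only + size(BMP_PRIVATE_USE)

NEW_CNTRL = 137685  # "NEW" cntrl count from the table above

print(planes_only)                # 131068
print(with_bmp_area)              # 137468
print(NEW_CNTRL - with_bmp_area)  # 217
```

Under that assumption, private-use code points would make up all but 217 of the new cntrl entries.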

Discussed with: bapt
Differential revision: https://reviews.freebsd.org/D17842

Details

Committed
yuripv, Dec 6 2018, 10:53 AM
Differential Revision
D17842: use ctype data from UnicodeData.txt
Parents
rS341628: MFC r340204:
Branches
Unknown
Tags
Unknown