Commit Graph

3 Commits

Author SHA1 Message Date
yuripv
b78b1e2544 Mark "private use area" characters as printable.
At least some of the characters in E000-F8FF range are used by Powerline
fonts, and having no attributes for these ranges in UnicodeData.txt
other than "Other, Private Use" it should be safe to mark all of them as
printable.  Some actually were before r340491, so this fixes the
regression introduced there as well.

PR:		240911
Reviewed by:	bapt
Tested by:	Daniel Ponte <amigan@gmail.com>
Differential Revision:	https://reviews.freebsd.org/D21850
2019-10-05 22:17:54 +00:00
yuripv
c6e4d24106 Use UnicodeData.txt to create UTF-8 ctype map.
This should provide more complete coverage of currently defined Unicode
characters as compared to manually assembled one we use currently.

Comparison of original and new UTF-8 ctype maps by character class:

TYPE    ORIG    NEW
alnum   94229   126029
alpha   93557   125419
blank   4       2
cntrl   73      137685
digit   469     622
graph   109615  137203
lower   1478    2145
print   109641  137222
punct   3428    797
rune    110481  274907
space   33      24
upper   983     1781
xdigit  469     622

Large number of added cntrl definitions is due to the fact that private-use
planes are currently defined as such, this can change in the future.

Discussed with:	bapt
Approved by:	kib (mentor, implicit)
MFC after:	1 month
Differential revision:	https://reviews.freebsd.org/D17842
2018-11-17 10:36:00 +00:00
yuripv
b6fca3ee80 Add hybrid C.UTF-8 locale being identical to default C locale except
that it uses the same ctype maps and functions as other UTF-8 locales.

Reviewed by:	bapt, cem, eadler
Approved by:	kib (mentor, implicit)
Differential Revision:	https://reviews.freebsd.org/D17833
2018-11-04 22:13:22 +00:00