freebsd-dev/share/i18n/csmapper/APPLE/HEBREW%UCS.src
Gabor Kovesdan ad30f8e79b Add the BSD-licensed Citrus iconv to the base system with default off
setting. It can be built by setting the WITH_ICONV knob. While this
knob is unset, the library part, the binaries, the header file and
the metadata files will not be built or installed so it makes no impact
on the system if left turned off.

This work is based on the iconv implementation in NetBSD but a great
number of improvements and feature additions have been included:

- Some utilities have been added. There is a conversion table generator,
  which can compare conversion tables to reference data generated by
  GNU libiconv. This helps ensure conversion compatibility.
- UTF-16 surrogate support and some endianness issues have been fixed.
- The rather chaotic Makefiles to build metadata have been refactored
  and cleaned up; they are now easy to read, and it is also easier to
  add support for new encodings.
- A bunch of new encodings and encoding aliases have been added.
- Support has been added for 1->2, 1->3 and 1->4 mappings, which are
  needed for transliterating with flying accents as GNU does, like "u.
- Lots of warnings have been fixed, the major part of the code is
  now WARNS=6 clean.
- New section 1 and section 5 manual pages have been added.
- Some GNU-specific calls have been implemented:
  iconvlist(), iconvctl(), iconv_canonicalize(), iconv_open_into()
- Support for GNU's //IGNORE suffix has been added.
- The "-" argument for stdin is now recognized in iconv(1) as per POSIX.
- The Big5 conversion module has been fixed.
- The iconv.h header file is supposed to be compatible with the
  GNU version, i.e. sources should build with both the base iconv.h
  and GNU libiconv. It also includes some macro magic to deal with the
  char ** and const char ** incompatibility.
- GNU compatibility: "" or "char" means the encoding of the current
  locale.
- Various cleanups and style(9) fixes.

Approved by:	delphij (mentor)
Obtained from:	The NetBSD Project
Sponsored by:	Google Summer of Code 2009
2011-02-25 00:04:39 +00:00


# $FreeBSD$
TYPE ROWCOL
NAME HEBREW/UCS
SRC_ZONE 0x00-0xFF
OOB_MODE ILSEQ
DST_ILSEQ 0xFFFE
DST_UNIT_BITS 16
BEGIN_MAP
#=======================================================================
# File name: HEBREW.TXT
#
# Contents: Map (external version) from Mac OS Hebrew
# character set to Unicode 2.1 and later.
#
# Copyright: (c) 1995-2002, 2005 by Apple Computer, Inc., all rights
# reserved.
#
# Contact: charsets@apple.com
#
# Changes:
#
# c02   2005-Apr-05  Update header comments; add section on
#                    roundtrip considerations. Matches internal
#                    xml <c1.4> and Text Encoding Converter 2.0.
# b3,c1 2002-Dec-19  Don't require left-right context for digits
#                    0x30-0x39. Change mapping of 0x81 to use
#                    decomposition. Reverse the mappings of 0xA8,
#                    0xA9. Update URLs, notes. Matches internal
#                    utom<b7>.
# b02   1999-Sep-22  Update contact e-mail address. Matches
#                    internal utom<b1>, ufrm<b1>, and Text
#                    Encoding Converter version 1.5.
# n03   1998-Feb-05  Show required Unicode character
#                    directionality in a different way. Update
#                    mappings for 0xC0 and 0xDE to use
#                    transcoding hints; matches internal utom<n6>,
#                    ufrm<n20>, and Text Encoding Converter
#                    version 1.3. Rewrite header comments.
# n01   1995-Nov-15  First version. Matches internal ufrm<n8>.
#
# Standard header:
# ----------------
#
# Apple, the Apple logo, and Macintosh are trademarks of Apple
# Computer, Inc., registered in the United States and other countries.
# Unicode is a trademark of Unicode Inc. For the sake of brevity,
# throughout this document, "Macintosh" can be used to refer to
# Macintosh computers and "Unicode" can be used to refer to the
# Unicode standard.
#
# Apple Computer, Inc. ("Apple") makes no warranty or representation,
# either express or implied, with respect to this document and the
# included data, its quality, accuracy, or fitness for a particular
# purpose. In no event will Apple be liable for direct, indirect,
# special, incidental, or consequential damages resulting from any
# defect or inaccuracy in this document or the included data.
#
# These mapping tables and character lists are subject to change.
# The latest tables should be available from the following:
#
# <http://www.unicode.org/Public/MAPPINGS/VENDORS/APPLE/>
#
# For general information about Mac OS encodings and these mapping
# tables, see the file "README.TXT".
#
# Format:
# -------
#
# Three tab-separated columns;
# '#' begins a comment which continues to the end of the line.
# Column #1 is the Mac OS Hebrew code (in hex as 0xNN).
# Column #2 is the corresponding Unicode or Unicode sequence (in
# hex as 0xNNNN, 0xNNNN+0xNNNN, etc.). Sequences of up to 3
# Unicode characters are used here. A single Unicode character
# may be preceded by a tag indicating required directionality
# (i.e. <LR>+0xNNNN or <RL>+0xNNNN).
# Column #3 is a comment containing the Unicode name.
#
# The entries are in Mac OS Hebrew code order.
#
# Some of these mappings require the use of corporate characters.
# See the file "CORPCHAR.TXT" and notes below.
#
# Control character mappings are not shown in this table, following
# the conventions of the standard UTC mapping tables. However, the
# Mac OS Hebrew character set uses the standard control characters at
# 0x00-0x1F and 0x7F.
#
# Notes on Mac OS Hebrew:
# -----------------------
#
# This is a legacy Mac OS encoding; in the Mac OS X Carbon and Cocoa
# environments, it is only supported via transcoding to and from
# Unicode.
#
# 1. General
#
# The Mac OS Hebrew character set supports the Hebrew and Yiddish
# languages. It incorporates the Hebrew letter repertoire of
# ISO 8859-8, and uses the same code points for them, 0xE0-0xFA.
# It also incorporates the ASCII character set. In addition, the
# Mac OS Hebrew character set includes the following:
#
# - Hebrew points (nikud marks) at 0xC6, 0xCB-0xCF and 0xD8-0xDF.
# These are non-spacing combining marks. Note that the RAFE point
# at 0xD8 is not displayed correctly in some fonts, and cannot be
# typed using the keyboard layouts in the current Hebrew localized
# systems. Also note: The character given in Unicode as QAMATS
# (U+05B8) actually refers to two different sounds, depending on
# context. For example, when ALEF is followed by QAMATS, the QAMATS
# can actually refer to two different sounds depending on the
# following letters. The Mac OS Hebrew character set separately
# encodes these two sounds for the same graphic shape, as "qamats"
# (0xCB) and "qamats qatan" (0xDE). The "qamats" character is more
# common, so it is mapped to the Unicode QAMATS; "qamats qatan" can
# only be used with a limited number of characters, and it is
# mapped using a corporate-zone variant tag (see below).
#
# - Various Hebrew ligatures at 0x81, 0xC0, 0xC7, 0xC8, 0xD6, and
# 0xD7. Also note that the Yiddish YOD YOD PATAH ligature at 0x81
# is missing in some fonts.
#
# - The NEW SHEQEL SIGN at 0xA6.
#
# - Latin characters with diacritics at 0x80 and 0x82-0x9F. However,
# most of these cannot be typed using the keyboard layouts in the
# Hebrew localized systems.
#
# - Right-left versions of certain ASCII punctuation, symbols and
# digits: 0xA0-0xA5, 0xA7-0xBF, 0xFB-0xFF. See below.
#
# - Miscellaneous additional punctuation at 0xC1, 0xC9, 0xCA, and
# 0xD0-0xD5. There is a variant of the Hebrew encoding in which
# the LEFT SINGLE QUOTATION MARK at 0xD4 is replaced by FIGURE
# SPACE. The glyphs for some of the other punctuation characters
# are missing in some fonts.
#
# - Four obsolete characters at 0xC2-0xC5 known as canorals (not to
# be confused with cantillation marks!). These were used for
# manual positioning of nikud marks before System 7.1 (at which
# point nikud positioning became automatic with WorldScript).
#
# 2. Directional characters and roundtrip fidelity
#
# The Mac OS Hebrew character set was developed around 1987. At that
# time the bidirectional line layout algorithm used in the Mac OS
# Hebrew system was fairly simple; it used only a few direction
# classes (instead of the 19 now used in the Unicode bidirectional
# algorithm). In order to permit users to handle some tricky layout
# problems, certain punctuation, symbol, and digit characters have
# duplicate code points, one with a left-right direction attribute and
# the other with a right-left direction attribute.
#
# For example, plus sign is encoded at 0x2B with a left-right
# attribute, and at 0xAB with a right-left attribute. However, there
# is only one PLUS SIGN character in Unicode. This leads to some
# interesting problems when mapping between Mac OS Hebrew and Unicode;
# see below.
#
# A related problem is that even when a particular character is
# encoded only once in Mac OS Hebrew, it may have a different
# direction attribute than the corresponding Unicode character.
#
# For example, the Mac OS Hebrew character at 0xC9 is HORIZONTAL
# ELLIPSIS with strong right-left direction. However, the Unicode
# character HORIZONTAL ELLIPSIS has direction class neutral.
#
# 3. Font variants
#
# The table in this file gives the Unicode mappings for the standard
# Mac OS Hebrew encoding. This encoding is supported by many of the
# Apple fonts (including all of the fonts in the Hebrew Language Kit),
# and is the encoding supported by the text processing utilities.
# However, some TrueType fonts provided with the localized Hebrew
# system implement a slightly different encoding; the difference is
# only in one code point, 0xD4. For the standard variant, this is:
# 0xD4 -> 0x2018 LEFT SINGLE QUOTATION MARK, right-left
#
# The TrueType variant is used by the following TrueType fonts from
# the localized system: Caesarea, Carmel Book, Gilboa, Ramat Sharon,
# and Sinai Book. For these, 0xD4 is as follows:
# 0xD4 -> 0x2007 FIGURE SPACE, right-left
#
# Unicode mapping issues and notes:
# ---------------------------------
#
# 1. Matching the direction of Mac OS Hebrew characters
#
# When Mac OS Hebrew encodes a character twice but with different
# direction attributes for the two code points - as in the case of
# plus sign mentioned above - we need a way to map both Mac OS Hebrew
# code points to Unicode and back again without loss of information.
# With the plus sign, for example, mapping one of the Mac OS Hebrew
# characters to a code in the Unicode corporate use zone is
# undesirable, since both of the plus sign characters are likely to
# be used in text that is interchanged.
#
# The problem is solved with the use of direction override characters
# and direction-dependent mappings. When mapping from Mac OS Hebrew
# to Unicode, we use direction overrides as necessary to force the
# direction of the resulting Unicode characters.
#
# The required direction is indicated by a direction tag in the
# mappings. A tag of <LR> means the corresponding Unicode character
# must have a strong left-right context, and a tag of <RL> indicates
# a right-left context.
#
# For example, the mapping of 0x2B is given as <LR>+0x002B; the
# mapping of 0xAB is given as <RL>+0x002B. If we map an isolated
# instance of 0x2B to Unicode, it should be mapped as follows (LRO
# indicates LEFT-RIGHT OVERRIDE, PDF indicates POP DIRECTION
# FORMATTING):
#
# 0x2B -> 0x202D (LRO) + 0x002B (PLUS SIGN) + 0x202C (PDF)
#
# When mapping several characters in a row that require direction
# forcing, the overrides need only be used at the beginning and end.
# For example:
#
# 0x24 0x20 0x28 0x29 -> 0x202D 0x0024 0x0020 0x0028 0x0029 0x202C
#
# If neutral characters that require direction forcing are already
# between strong-direction characters with matching directionality,
# then direction overrides need not be used. Direction overrides are
# always needed to map the right-left digits at 0xB0-0xB9.
#
# When mapping from Unicode to Mac OS Hebrew, the Unicode
# bidirectional algorithm should be used to determine resolved
# direction of the Unicode characters. The mapping from Unicode to
# Mac OS Hebrew can then be disambiguated by the use of the resolved
# direction:
#
# Unicode 0x002B -> Mac OS Hebrew 0x2B (if L) or 0xAB (if R)
#
# However, this also means the direction override characters should
# be discarded when mapping from Unicode to Mac OS Hebrew (after
# they have been used to determine resolved direction), since the
# direction override information is carried by the code point itself.
#
# Even when direction overrides are not needed for roundtrip
# fidelity, they are sometimes used when mapping Mac OS Hebrew
# characters to Unicode in order to achieve similar text layout with
# the resulting Unicode text. For example, the single Mac OS Hebrew
# ellipsis character has direction class right-left, and there is no
# left-right version. However, the Unicode HORIZONTAL ELLIPSIS
# character has direction class neutral (which means it may end up
# with a resolved direction of left-right if surrounded by left-right
# characters). When mapping the Mac OS Hebrew ellipsis to Unicode, it
# is surrounded with a direction override to help preserve proper
# text layout. The resolved direction is not needed or used when
# mapping the Unicode HORIZONTAL ELLIPSIS back to Mac OS Hebrew.
#
# 2. Use of corporate-zone Unicodes
#
# The goals in the mappings provided here are:
# - Ensure roundtrip mapping from every character in the Mac OS
# Hebrew character set to Unicode and back
# - Use standard Unicode characters as much as possible, to
# maximize interchangeability of the resulting Unicode text.
# Whenever possible, avoid having content carried by private-use
# characters.
#
# Some of the characters in the Mac OS Hebrew character set do not
# correspond to distinct, single Unicode characters. To map these
# and satisfy both goals above, we employ various strategies.
#
# a) If possible, use private use characters in combination with
# standard Unicode characters to mark variants of the standard
# Unicode character.
#
# Apple has defined a block of 32 corporate characters as "transcoding
# hints." These are used in combination with standard Unicode characters
# to force them to be treated in a special way for mapping to other
# encodings; they have no other effect. Sixteen of these transcoding
# hints are "grouping hints" - they indicate that the next 2-4 Unicode
# characters should be treated as a single entity for transcoding. The
# other sixteen transcoding hints are "variant tags" - they are like
# combining characters, and can follow a standard Unicode (or a sequence
# consisting of a base character and other combining characters) to
# cause it to be treated in a special way for transcoding. These always
# terminate a combining-character sequence.
#
# Two transcoding hints are used in this mapping table: a grouping hint
# and a variant tag:
#   0xF86A  group next 2 characters, right-left directionality
#   0xF87F  variant tag
#
# In Mac OS Hebrew, 0xC0 is a ligature for lamed holam. This can also
# be represented in Mac OS Hebrew as 0xEC+0xDD, using separate
# characters for lamed and holam. The latter sequence is mapped to
# Unicode as 0x05DC+0x05B9, i.e. as the sequence HEBREW LETTER LAMED +
# HEBREW POINT HOLAM. We want to map the ligature 0xC0 using the same
# standard Unicode characters, but for round-trip fidelity we need to
# distinguish it from the mapping of the sequence 0xEC+0xDD. Thus for
# 0xC0 we use a grouping hint, and map as follows:
#
# 0xC0 -> 0xF86A+0x05DC+0x05B9
#
# The variant tag is used for "qamats qatan" to mark it as an alternate
# for HEBREW POINT QAMATS, as follows:
#
# 0xDE -> 0x05B8+0xF87F
#
# b) Otherwise, use private use characters by themselves to map Mac OS
# Hebrew characters which have no relationship to any standard Unicode
# character.
#
# The following additional corporate zone Unicode characters are used
# for this purpose here (to map the obsolete "canorals", see above):
#
# 0xF89B Hebrew canoral 1
# 0xF89C Hebrew canoral 2
# 0xF89D Hebrew canoral 3
# 0xF89E Hebrew canoral 4
#
# 3. Roundtrip considerations when mapping to decomposed Unicode
#
# Both Mac OS Hebrew and Unicode provide multiple ways of representing
# certain letter-and-point combinations. For example, HEBREW LETTER
# VAV WITH HOLAM can be represented in Unicode as the single character
# 0xFB4B or as the sequence 0x05D5 0x05B9; similarly, it can be
# represented in Mac OS Hebrew as 0xC7 or as the sequence 0xE5 0xDD.
# This leads to some roundtrip problems. First note that we have the
# following mappings without such problems:
#
# Mac   standard                            decomp. of     reverse map
# OS    Unicode mapping                     std. mapping   of decomp.
# ----  ----------------------------------  -------------  -----------
# 0xC6  0x05BC ... POINT DAGESH OR MAPIQ    0x05BC (same)  0xC6
# 0xE5  0x05D5 ... LETTER VAV               0x05D5 (same)  0xE5
# 0xDD  0x05B9 ... POINT HOLAM              0x05B9 (same)  0xDD
#
# However, those mappings above cause roundtrip problems for the
# following mappings if they are decomposed:
#
# Mac   standard                            decomp. of     reverse map
# OS    Unicode mapping                     std. mapping   of decomp.
# ----  ----------------------------------  -------------  -----------
# 0xC7  0xFB4B ... LETTER VAV WITH HOLAM    0x05D5 0x05B9  0xE5 0xDD
# 0xC8  0xFB35 ... LETTER VAV WITH DAGESH   0x05D5 0x05BC  0xE5 0xC6
#
# One solution is to use a grouping transcoding hint with the two
# decompositions above to mark the decomposed sequence for special
# treatment in transcoding. This yields the following mappings to
# decomposed Unicode:
#
# Mac   decomposed
# OS    Unicode mapping
# ----  --------------------
# 0xC7  0xF86A 0x05D5 0x05B9
# 0xC8  0xF86A 0x05D5 0x05BC
#
# Details of mapping changes in each version:
# -------------------------------------------
#
# Changes from version b02 to version b03/c01:
#
# - Stop specifying left-right context for digits 0x30-0x39, since the
# corresponding Unicodes 0x0030-0x0039 already have left-right
# directionality.
#
# - Change mapping of 0x81 from 0xFB1F HEBREW LIGATURE YIDDISH YOD YOD
# PATAH to its canonical decomposition 0x05F2+0x05B7 to improve
# cross-platform compatibility (Windows doesn't handle 0xFB1F)
#
# - Interchange the mappings of 0xA8 and 0xA9 to obtain the correct
# open/close behavior; they work differently than in Mac Arabic.
# The old mapping was
# 0xA8 0x0028 # LEFT PARENTHESIS, right-left
# 0xA9 0x0029 # RIGHT PARENTHESIS, right-left
# and the new mapping is
# 0xA8 0x0029 # RIGHT PARENTHESIS, right-left
# 0xA9 0x0028 # LEFT PARENTHESIS, right-left
#
# Changes from version n01 to version n03:
#
# - Change mapping for 0xC0 from single corporate character to
# grouping hint plus standard Unicodes
#
# - Change mapping for 0xDE from single corporate character to
# standard Unicode plus variant tag
#
##################
0x00 - 0x7F = 0x0000 -
0x80 = 0x00C4 # LATIN CAPITAL LETTER A WITH DIAERESIS
0x81 = 0xFB1F # 0x05F2+0x05B7 # HEBREW LIGATURE YIDDISH YOD YOD PATAH
0x82 = 0x00C7 # LATIN CAPITAL LETTER C WITH CEDILLA
0x83 = 0x00C9 # LATIN CAPITAL LETTER E WITH ACUTE
0x84 = 0x00D1 # LATIN CAPITAL LETTER N WITH TILDE
0x85 = 0x00D6 # LATIN CAPITAL LETTER O WITH DIAERESIS
0x86 = 0x00DC # LATIN CAPITAL LETTER U WITH DIAERESIS
0x87 = 0x00E1 # LATIN SMALL LETTER A WITH ACUTE
0x88 = 0x00E0 # LATIN SMALL LETTER A WITH GRAVE
0x89 = 0x00E2 # LATIN SMALL LETTER A WITH CIRCUMFLEX
0x8A = 0x00E4 # LATIN SMALL LETTER A WITH DIAERESIS
0x8B = 0x00E3 # LATIN SMALL LETTER A WITH TILDE
0x8C = 0x00E5 # LATIN SMALL LETTER A WITH RING ABOVE
0x8D = 0x00E7 # LATIN SMALL LETTER C WITH CEDILLA
0x8E = 0x00E9 # LATIN SMALL LETTER E WITH ACUTE
0x8F = 0x00E8 # LATIN SMALL LETTER E WITH GRAVE
0x90 = 0x00EA # LATIN SMALL LETTER E WITH CIRCUMFLEX
0x91 = 0x00EB # LATIN SMALL LETTER E WITH DIAERESIS
0x92 = 0x00ED # LATIN SMALL LETTER I WITH ACUTE
0x93 = 0x00EC # LATIN SMALL LETTER I WITH GRAVE
0x94 = 0x00EE # LATIN SMALL LETTER I WITH CIRCUMFLEX
0x95 = 0x00EF # LATIN SMALL LETTER I WITH DIAERESIS
0x96 = 0x00F1 # LATIN SMALL LETTER N WITH TILDE
0x97 = 0x00F3 # LATIN SMALL LETTER O WITH ACUTE
0x98 = 0x00F2 # LATIN SMALL LETTER O WITH GRAVE
0x99 = 0x00F4 # LATIN SMALL LETTER O WITH CIRCUMFLEX
0x9A = 0x00F6 # LATIN SMALL LETTER O WITH DIAERESIS
0x9B = 0x00F5 # LATIN SMALL LETTER O WITH TILDE
0x9C = 0x00FA # LATIN SMALL LETTER U WITH ACUTE
0x9D = 0x00F9 # LATIN SMALL LETTER U WITH GRAVE
0x9E = 0x00FB # LATIN SMALL LETTER U WITH CIRCUMFLEX
0x9F = 0x00FC # LATIN SMALL LETTER U WITH DIAERESIS
0xA0 = 0x0020 # SPACE, right-left
0xA1 = 0x0021 # EXCLAMATION MARK, right-left
0xA2 = 0x0022 # QUOTATION MARK, right-left
0xA3 = 0x0023 # NUMBER SIGN, right-left
0xA4 = 0x0024 # DOLLAR SIGN, right-left
0xA5 = 0x0025 # PERCENT SIGN, right-left
0xA6 = 0x20AA # NEW SHEQEL SIGN
0xA7 = 0x0027 # APOSTROPHE, right-left
0xA8 = 0x0029 # RIGHT PARENTHESIS, right-left # close parenthesis
0xA9 = 0x0028 # LEFT PARENTHESIS, right-left # open parenthesis
0xAA = 0x002A # ASTERISK, right-left
0xAB = 0x002B # PLUS SIGN, right-left
0xAC = 0x002C # COMMA, right-left
0xAD = 0x002D # HYPHEN-MINUS, right-left
0xAE = 0x002E # FULL STOP, right-left
0xAF = 0x002F # SOLIDUS, right-left
0xB0 = 0x0030 # DIGIT ZERO, right-left (need override)
0xB1 = 0x0031 # DIGIT ONE, right-left (need override)
0xB2 = 0x0032 # DIGIT TWO, right-left (need override)
0xB3 = 0x0033 # DIGIT THREE, right-left (need override)
0xB4 = 0x0034 # DIGIT FOUR, right-left (need override)
0xB5 = 0x0035 # DIGIT FIVE, right-left (need override)
0xB6 = 0x0036 # DIGIT SIX, right-left (need override)
0xB7 = 0x0037 # DIGIT SEVEN, right-left (need override)
0xB8 = 0x0038 # DIGIT EIGHT, right-left (need override)
0xB9 = 0x0039 # DIGIT NINE, right-left (need override)
0xBA = 0x003A # COLON, right-left
0xBB = 0x003B # SEMICOLON, right-left
0xBC = 0x003C # LESS-THAN SIGN, right-left
0xBD = 0x003D # EQUALS SIGN, right-left
0xBE = 0x003E # GREATER-THAN SIGN, right-left
0xBF = 0x003F # QUESTION MARK, right-left
0xC0 = 0x05B9 # 0xF86A+0x05DC+0x05B9 # Hebrew ligature lamed holam
0xC1 = 0x201E # DOUBLE LOW-9 QUOTATION MARK, right-left
0xC2 = 0xF89B # Hebrew canoral 1
0xC3 = 0xF89C # Hebrew canoral 2
0xC4 = 0xF89D # Hebrew canoral 3
0xC5 = 0xF89E # Hebrew canoral 4
0xC6 = 0x05BC # HEBREW POINT DAGESH OR MAPIQ
0xC7 = 0xFB4B # HEBREW LETTER VAV WITH HOLAM
0xC8 = 0xFB35 # HEBREW LETTER VAV WITH DAGESH
0xC9 = 0x2026 # HORIZONTAL ELLIPSIS, right-left
0xCA = 0x00A0 # NO-BREAK SPACE, right-left
0xCB = 0x05B8 # HEBREW POINT QAMATS
0xCC = 0x05B7 # HEBREW POINT PATAH
0xCD = 0x05B5 # HEBREW POINT TSERE
0xCE = 0x05B6 # HEBREW POINT SEGOL
0xCF = 0x05B4 # HEBREW POINT HIRIQ
0xD0 = 0x2013 # EN DASH, right-left
0xD1 = 0x2014 # EM DASH, right-left
0xD2 = 0x201C # LEFT DOUBLE QUOTATION MARK, right-left
0xD3 = 0x201D # RIGHT DOUBLE QUOTATION MARK, right-left
0xD4 = 0x2018 # LEFT SINGLE QUOTATION MARK, right-left
0xD5 = 0x2019 # RIGHT SINGLE QUOTATION MARK, right-left
0xD6 = 0xFB2A # HEBREW LETTER SHIN WITH SHIN DOT
0xD7 = 0xFB2B # HEBREW LETTER SHIN WITH SIN DOT
0xD8 = 0x05BF # HEBREW POINT RAFE
0xD9 = 0x05B0 # HEBREW POINT SHEVA
0xDA = 0x05B2 # HEBREW POINT HATAF PATAH
0xDB = 0x05B1 # HEBREW POINT HATAF SEGOL
0xDC = 0x05BB # HEBREW POINT QUBUTS
0xDD = 0x05B9 # HEBREW POINT HOLAM
0xDE = 0xF87F # 0x05B8+0xF87F # HEBREW POINT QAMATS, alternate form "qamats qatan"
0xDF = 0x05B3 # HEBREW POINT HATAF QAMATS
0xE0 = 0x05D0 # HEBREW LETTER ALEF
0xE1 = 0x05D1 # HEBREW LETTER BET
0xE2 = 0x05D2 # HEBREW LETTER GIMEL
0xE3 = 0x05D3 # HEBREW LETTER DALET
0xE4 = 0x05D4 # HEBREW LETTER HE
0xE5 = 0x05D5 # HEBREW LETTER VAV
0xE6 = 0x05D6 # HEBREW LETTER ZAYIN
0xE7 = 0x05D7 # HEBREW LETTER HET
0xE8 = 0x05D8 # HEBREW LETTER TET
0xE9 = 0x05D9 # HEBREW LETTER YOD
0xEA = 0x05DA # HEBREW LETTER FINAL KAF
0xEB = 0x05DB # HEBREW LETTER KAF
0xEC = 0x05DC # HEBREW LETTER LAMED
0xED = 0x05DD # HEBREW LETTER FINAL MEM
0xEE = 0x05DE # HEBREW LETTER MEM
0xEF = 0x05DF # HEBREW LETTER FINAL NUN
0xF0 = 0x05E0 # HEBREW LETTER NUN
0xF1 = 0x05E1 # HEBREW LETTER SAMEKH
0xF2 = 0x05E2 # HEBREW LETTER AYIN
0xF3 = 0x05E3 # HEBREW LETTER FINAL PE
0xF4 = 0x05E4 # HEBREW LETTER PE
0xF5 = 0x05E5 # HEBREW LETTER FINAL TSADI
0xF6 = 0x05E6 # HEBREW LETTER TSADI
0xF7 = 0x05E7 # HEBREW LETTER QOF
0xF8 = 0x05E8 # HEBREW LETTER RESH
0xF9 = 0x05E9 # HEBREW LETTER SHIN
0xFA = 0x05EA # HEBREW LETTER TAV
0xFB = 0x007D # RIGHT CURLY BRACKET, right-left
0xFC = 0x005D # RIGHT SQUARE BRACKET, right-left
0xFD = 0x007B # LEFT CURLY BRACKET, right-left
0xFE = 0x005B # LEFT SQUARE BRACKET, right-left
0xFF = 0x007C # VERTICAL LINE, right-left
END_MAP
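
The direction-override scheme described in the header comments (note 1 under "Unicode mapping issues and notes") can be illustrated with a short Python sketch. It is not part of the csmapper data and not how Citrus iconv applies this table. The `SUBSET` dictionary and the `to_unicode` function are hypothetical names, and the direction tags are reconstructed from the header's own examples (0x2B, 0xAB, and the run 0x24 0x20 0x28 0x29). The sketch always emits overrides around tagged runs, so it ignores the optimization the header mentions for neutrals that already sit between strong characters of matching direction.

```python
# Direction formatting characters from the Unicode standard.
LRO = "\u202D"  # LEFT-TO-RIGHT OVERRIDE
RLO = "\u202E"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202C"  # POP DIRECTIONAL FORMATTING

# Hypothetical subset of the table: Mac OS Hebrew byte -> (tag, code point).
# Tags follow the header's examples; None means no direction forcing.
SUBSET = {
    0x20: ("LR", 0x0020),  # SPACE, left-right
    0x24: ("LR", 0x0024),  # DOLLAR SIGN, left-right
    0x28: ("LR", 0x0028),  # LEFT PARENTHESIS, left-right
    0x29: ("LR", 0x0029),  # RIGHT PARENTHESIS, left-right
    0x2B: ("LR", 0x002B),  # PLUS SIGN, left-right
    0xAB: ("RL", 0x002B),  # PLUS SIGN, right-left
    0xE0: (None, 0x05D0),  # HEBREW LETTER ALEF
}


def to_unicode(data: bytes) -> str:
    """Map Mac OS Hebrew bytes to Unicode, wrapping each run of
    same-direction tagged characters in one override ... PDF pair."""
    out = []
    current = None  # tag of the override that is currently open, if any
    for byte in data:
        tag, code_point = SUBSET[byte]
        if tag != current:
            if current is not None:
                out.append(PDF)  # close the previous run
            if tag is not None:
                out.append(LRO if tag == "LR" else RLO)
            current = tag
        out.append(chr(code_point))
    if current is not None:
        out.append(PDF)  # close a run left open at end of input
    return "".join(out)
```

Under these assumptions, `to_unicode(bytes([0x24, 0x20, 0x28, 0x29]))` yields the six-character sequence given in the header: the override is emitted once at the start of the run and popped once at the end. The reverse direction would need the Unicode bidirectional algorithm to resolve each character's direction before choosing between, for example, 0x2B and 0xAB, which this sketch does not attempt.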