Update to Unicode 11.0.0
parent 50aa69657e
commit 937617f343
@@ -108,6 +108,10 @@ to an incorrect "lookbehind assertion is not fixed length" error.
 23. The VERSION condition test was reading fractional PCRE2 version numbers
 such as the 04 in 10.04 incorrectly and hence giving wrong results.

+24. Updated to Unicode version 11.0.0. As well as the usual addition of new
+scripts and characters, this involved re-jigging the grapheme break property
+algorithm because Unicode has changed the way emojis are handled.
+

 Version 10.31 12-February-2018
 ------------------------------
@@ -789,6 +789,7 @@ Cypriot,
 Cyrillic,
 Deseret,
 Devanagari,
+Dogra,
 Duployan,
 Egyptian_Hieroglyphs,
 Elbasan,
@@ -799,9 +800,11 @@ Gothic,
 Grantha,
 Greek,
 Gujarati,
+Gunjala_Gondi,
 Gurmukhi,
 Han,
 Hangul,
+Hanifi_Rohingya,
 Hanunoo,
 Hatran,
 Hebrew,
@@ -829,11 +832,13 @@ Lisu,
 Lycian,
 Lydian,
 Mahajani,
+Makasar,
 Malayalam,
 Mandaic,
 Manichaean,
 Marchen,
 Masaram_Gondi,
+Medefaidrin,
 Meetei_Mayek,
 Mende_Kikakui,
 Meroitic_Cursive,
@@ -856,6 +861,7 @@ Old_Italic,
 Old_North_Arabian,
 Old_Permic,
 Old_Persian,
+Old_Sogdian,
 Old_South_Arabian,
 Old_Turkic,
 Oriya,
@@ -876,6 +882,7 @@ Shavian,
 Siddham,
 SignWriting,
 Sinhala,
+Sogdian,
 Sora_Sompeng,
 Soyombo,
 Sundanese,
@@ -1006,7 +1013,10 @@ grapheme cluster", and treats the sequence as an atomic group
 Unicode supports various kinds of composite character by giving each character
 a grapheme breaking property, and having rules that use these properties to
 define the boundaries of extended grapheme clusters. The rules are defined in
-Unicode Standard Annex 29, "Unicode Text Segmentation".
+Unicode Standard Annex 29, "Unicode Text Segmentation". Unicode 11.0.0
+abandoned the use of some previous properties that had been used for emojis.
+Instead it introduced various emoji-specific properties. PCRE2 uses only the
+Extended Pictographic property.
 </P>
 <P>
 \X always matches at least one character. Then it decides whether to add
@@ -1026,27 +1036,24 @@ character; an LVT or T character may be follwed only by a T character.
 </P>
 <P>
 4. Do not end before extending characters or spacing marks or the "zero-width
-joiner" characters. Characters with the "mark" property always have the
+joiner" character. Characters with the "mark" property always have the
 "extend" grapheme breaking property.
 </P>
 <P>
 5. Do not end after prepend characters.
 </P>
 <P>
-6. Do not break within emoji modifier sequences (a base character followed by a
-modifier). Extending characters are allowed before the modifier.
+6. Do not break within emoji modifier sequences or emoji zwj sequences. That
+is, do not break between characters with the Extended_Pictographic property.
+Extend and ZWJ characters are allowed between the characters.
 </P>
 <P>
-7. Do not break within emoji zwj sequences (zero-width joiner followed by
-"glue after ZWJ" or "base glue after ZWJ").
-</P>
-<P>
-8. Do not break within emoji flag sequences. That is, do not break between
+7. Do not break within emoji flag sequences. That is, do not break between
 regional indicator (RI) characters if there are an odd number of RI characters
 before the break point.
 </P>
 <P>
-6. Otherwise, end the cluster.
+8. Otherwise, end the cluster.
 <a name="extraprops"></a></P>
 <br><b>
 PCRE2's additional properties
@@ -3490,7 +3497,7 @@ Cambridge, England.
 </P>
 <br><a name="SEC30" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 30 June 2018
+Last updated: 07 July 2018
 <br>
 Copyright © 1997-2018 University of Cambridge.
 <br>
@ -188,6 +188,7 @@ at release 5.18.
|
|||
</P>
|
||||
<br><a name="SEC7" href="#TOC1">SCRIPT NAMES FOR \p AND \P</a><br>
|
||||
<P>
|
||||
Adlam,
|
||||
Ahom,
|
||||
Anatolian_Hieroglyphs,
|
||||
Arabic,
|
||||
|
@ -198,6 +199,7 @@ Bamum,
|
|||
Bassa_Vah,
|
||||
Batak,
|
||||
Bengali,
|
||||
Bhaiksuki,
|
||||
Bopomofo,
|
||||
Brahmi,
|
||||
Braille,
|
||||
|
@ -216,6 +218,7 @@ Cypriot,
|
|||
Cyrillic,
|
||||
Deseret,
|
||||
Devanagari,
|
||||
Dogra,
|
||||
Duployan,
|
||||
Egyptian_Hieroglyphs,
|
||||
Elbasan,
|
||||
|
@ -226,9 +229,11 @@ Gothic,
|
|||
Grantha,
|
||||
Greek,
|
||||
Gujarati,
|
||||
Gunjala_Gondi,
|
||||
Gurmukhi,
|
||||
Han,
|
||||
Hangul,
|
||||
Hanifi_Rohingya,
|
||||
Hanunoo,
|
||||
Hatran,
|
||||
Hebrew,
|
||||
|
@ -256,9 +261,13 @@ Lisu,
|
|||
Lycian,
|
||||
Lydian,
|
||||
Mahajani,
|
||||
Makasar,
|
||||
Malayalam,
|
||||
Mandaic,
|
||||
Manichaean,
|
||||
Marchen,
|
||||
Masaram_Gondi,
|
||||
Medefaidrin,
|
||||
Meetei_Mayek,
|
||||
Mende_Kikakui,
|
||||
Meroitic_Cursive,
|
||||
|
@ -271,7 +280,9 @@ Multani,
|
|||
Myanmar,
|
||||
Nabataean,
|
||||
New_Tai_Lue,
|
||||
Newa,
|
||||
Nko,
|
||||
Nushu,
|
||||
Ogham,
|
||||
Ol_Chiki,
|
||||
Old_Hungarian,
|
||||
|
@ -279,9 +290,11 @@ Old_Italic,
|
|||
Old_North_Arabian,
|
||||
Old_Permic,
|
||||
Old_Persian,
|
||||
Old_Sogdian,
|
||||
Old_South_Arabian,
|
||||
Old_Turkic,
|
||||
Oriya,
|
||||
Osage,
|
||||
Osmanya,
|
||||
Pahawh_Hmong,
|
||||
Palmyrene,
|
||||
|
@ -298,7 +311,9 @@ Shavian,
|
|||
Siddham,
|
||||
SignWriting,
|
||||
Sinhala,
|
||||
Sogdian,
|
||||
Sora_Sompeng,
|
||||
Soyombo,
|
||||
Sundanese,
|
||||
Syloti_Nagri,
|
||||
Syriac,
|
||||
|
@ -309,6 +324,7 @@ Tai_Tham,
|
|||
Tai_Viet,
|
||||
Takri,
|
||||
Tamil,
|
||||
Tangut,
|
||||
Telugu,
|
||||
Thaana,
|
||||
Thai,
|
||||
|
@ -318,7 +334,8 @@ Tirhuta,
|
|||
Ugaritic,
|
||||
Vai,
|
||||
Warang_Citi,
|
||||
Yi.
|
||||
Yi,
|
||||
Zanabazar_Square.
|
||||
</P>
|
||||
<br><a name="SEC8" href="#TOC1">CHARACTER CLASSES</a><br>
|
||||
<P>
|
||||
|
@ -600,7 +617,7 @@ Cambridge, England.
|
|||
</P>
|
||||
<br><a name="SEC27" href="#TOC1">REVISION</a><br>
|
||||
<P>
|
||||
Last updated: 28 June 2018
|
||||
Last updated: 07 July 2018
|
||||
<br>
|
||||
Copyright © 1997-2018 University of Cambridge.
|
||||
<br>
|
||||
|
|
183
doc/pcre2.txt
|
@ -6483,34 +6483,35 @@ BACKSLASH
|
|||
nese, Bamum, Bassa_Vah, Batak, Bengali, Bhaiksuki, Bopomofo, Brahmi,
|
||||
Braille, Buginese, Buhid, Canadian_Aboriginal, Carian, Caucasian_Alba-
|
||||
nian, Chakma, Cham, Cherokee, Common, Coptic, Cuneiform, Cypriot,
|
||||
Cyrillic, Deseret, Devanagari, Duployan, Egyptian_Hieroglyphs, Elbasan,
|
||||
Ethiopic, Georgian, Glagolitic, Gothic, Grantha, Greek, Gujarati, Gur-
|
||||
mukhi, Han, Hangul, Hanunoo, Hatran, Hebrew, Hiragana, Imperial_Ara-
|
||||
maic, Inherited, Inscriptional_Pahlavi, Inscriptional_Parthian,
|
||||
Javanese, Kaithi, Kannada, Katakana, Kayah_Li, Kharoshthi, Khmer, Kho-
|
||||
jki, Khudawadi, Lao, Latin, Lepcha, Limbu, Linear_A, Linear_B, Lisu,
|
||||
Lycian, Lydian, Mahajani, Malayalam, Mandaic, Manichaean, Marchen,
|
||||
Masaram_Gondi, Meetei_Mayek, Mende_Kikakui, Meroitic_Cursive,
|
||||
Meroitic_Hieroglyphs, Miao, Modi, Mongolian, Mro, Multani, Myanmar,
|
||||
Nabataean, New_Tai_Lue, Newa, Nko, Nushu, Ogham, Ol_Chiki, Old_Hungar-
|
||||
ian, Old_Italic, Old_North_Arabian, Old_Permic, Old_Persian,
|
||||
Old_South_Arabian, Old_Turkic, Oriya, Osage, Osmanya, Pahawh_Hmong,
|
||||
Palmyrene, Pau_Cin_Hau, Phags_Pa, Phoenician, Psalter_Pahlavi, Rejang,
|
||||
Runic, Samaritan, Saurashtra, Sharada, Shavian, Siddham, SignWriting,
|
||||
Sinhala, Sora_Sompeng, Soyombo, Sundanese, Syloti_Nagri, Syriac, Taga-
|
||||
log, Tagbanwa, Tai_Le, Tai_Tham, Tai_Viet, Takri, Tamil, Tangut, Tel-
|
||||
ugu, Thaana, Thai, Tibetan, Tifinagh, Tirhuta, Ugaritic, Vai,
|
||||
Warang_Citi, Yi, Zanabazar_Square.
|
||||
Cyrillic, Deseret, Devanagari, Dogra, Duployan, Egyptian_Hieroglyphs,
|
||||
Elbasan, Ethiopic, Georgian, Glagolitic, Gothic, Grantha, Greek,
|
||||
Gujarati, Gunjala_Gondi, Gurmukhi, Han, Hangul, Hanifi_Rohingya,
|
||||
Hanunoo, Hatran, Hebrew, Hiragana, Imperial_Aramaic, Inherited,
|
||||
Inscriptional_Pahlavi, Inscriptional_Parthian, Javanese, Kaithi, Kan-
|
||||
nada, Katakana, Kayah_Li, Kharoshthi, Khmer, Khojki, Khudawadi, Lao,
|
||||
Latin, Lepcha, Limbu, Linear_A, Linear_B, Lisu, Lycian, Lydian, Maha-
|
||||
jani, Makasar, Malayalam, Mandaic, Manichaean, Marchen, Masaram_Gondi,
|
||||
Medefaidrin, Meetei_Mayek, Mende_Kikakui, Meroitic_Cursive,
|
||||
Meroitic_Hieroglyphs, Miao, Modi, Mongolian, Mro, Multani, Myanmar,
|
||||
Nabataean, New_Tai_Lue, Newa, Nko, Nushu, Ogham, Ol_Chiki, Old_Hungar-
|
||||
ian, Old_Italic, Old_North_Arabian, Old_Permic, Old_Persian, Old_Sog-
|
||||
dian, Old_South_Arabian, Old_Turkic, Oriya, Osage, Osmanya,
|
||||
Pahawh_Hmong, Palmyrene, Pau_Cin_Hau, Phags_Pa, Phoenician,
|
||||
Psalter_Pahlavi, Rejang, Runic, Samaritan, Saurashtra, Sharada, Sha-
|
||||
vian, Siddham, SignWriting, Sinhala, Sogdian, Sora_Sompeng, Soyombo,
|
||||
Sundanese, Syloti_Nagri, Syriac, Tagalog, Tagbanwa, Tai_Le, Tai_Tham,
|
||||
Tai_Viet, Takri, Tamil, Tangut, Telugu, Thaana, Thai, Tibetan, Tifi-
|
||||
nagh, Tirhuta, Ugaritic, Vai, Warang_Citi, Yi, Zanabazar_Square.
|
||||
|
||||
Each character has exactly one Unicode general category property, spec-
|
||||
ified by a two-letter abbreviation. For compatibility with Perl, nega-
|
||||
tion can be specified by including a circumflex between the opening
|
||||
brace and the property name. For example, \p{^Lu} is the same as
|
||||
ified by a two-letter abbreviation. For compatibility with Perl, nega-
|
||||
tion can be specified by including a circumflex between the opening
|
||||
brace and the property name. For example, \p{^Lu} is the same as
|
||||
\P{Lu}.
|
||||
|
||||
If only one letter is specified with \p or \P, it includes all the gen-
|
||||
eral category properties that start with that letter. In this case, in
|
||||
the absence of negation, the curly brackets in the escape sequence are
|
||||
eral category properties that start with that letter. In this case, in
|
||||
the absence of negation, the curly brackets in the escape sequence are
|
||||
optional; these two examples have the same effect:
|
||||
|
||||
\p{L}
|
||||
|
@ -6562,44 +6563,47 @@ BACKSLASH
|
|||
Zp Paragraph separator
|
||||
Zs Space separator
|
||||
|
||||
The special property L& is also supported: it matches a character that
|
||||
has the Lu, Ll, or Lt property, in other words, a letter that is not
|
||||
The special property L& is also supported: it matches a character that
|
||||
has the Lu, Ll, or Lt property, in other words, a letter that is not
|
||||
classified as a modifier or "other".
|
||||
|
||||
The Cs (Surrogate) property applies only to characters in the range
|
||||
U+D800 to U+DFFF. Such characters are not valid in Unicode strings and
|
||||
so cannot be tested by PCRE2, unless UTF validity checking has been
|
||||
turned off (see the discussion of PCRE2_NO_UTF_CHECK in the pcre2api
|
||||
The Cs (Surrogate) property applies only to characters in the range
|
||||
U+D800 to U+DFFF. Such characters are not valid in Unicode strings and
|
||||
so cannot be tested by PCRE2, unless UTF validity checking has been
|
||||
turned off (see the discussion of PCRE2_NO_UTF_CHECK in the pcre2api
|
||||
page). Perl does not support the Cs property.
|
||||
|
||||
The long synonyms for property names that Perl supports (such as
|
||||
\p{Letter}) are not supported by PCRE2, nor is it permitted to prefix
|
||||
The long synonyms for property names that Perl supports (such as
|
||||
\p{Letter}) are not supported by PCRE2, nor is it permitted to prefix
|
||||
any of these properties with "Is".
|
||||
|
||||
No character that is in the Unicode table has the Cn (unassigned) prop-
|
||||
erty. Instead, this property is assumed for any code point that is not
|
||||
in the Unicode table.
|
||||
|
||||
Specifying caseless matching does not affect these escape sequences.
|
||||
For example, \p{Lu} always matches only upper case letters. This is
|
||||
Specifying caseless matching does not affect these escape sequences.
|
||||
For example, \p{Lu} always matches only upper case letters. This is
|
||||
different from the behaviour of current versions of Perl.
|
||||
|
||||
Matching characters by Unicode property is not fast, because PCRE2 has
|
||||
to do a multistage table lookup in order to find a character's prop-
|
||||
Matching characters by Unicode property is not fast, because PCRE2 has
|
||||
to do a multistage table lookup in order to find a character's prop-
|
||||
erty. That is why the traditional escape sequences such as \d and \w do
|
||||
not use Unicode properties in PCRE2 by default, though you can make
|
||||
them do so by setting the PCRE2_UCP option or by starting the pattern
|
||||
not use Unicode properties in PCRE2 by default, though you can make
|
||||
them do so by setting the PCRE2_UCP option or by starting the pattern
|
||||
with (*UCP).
|
||||
|
||||
Extended grapheme clusters
|
||||
|
||||
The \X escape matches any number of Unicode characters that form an
|
||||
The \X escape matches any number of Unicode characters that form an
|
||||
"extended grapheme cluster", and treats the sequence as an atomic group
|
||||
(see below). Unicode supports various kinds of composite character by
|
||||
giving each character a grapheme breaking property, and having rules
|
||||
(see below). Unicode supports various kinds of composite character by
|
||||
giving each character a grapheme breaking property, and having rules
|
||||
that use these properties to define the boundaries of extended grapheme
|
||||
clusters. The rules are defined in Unicode Standard Annex 29, "Unicode
|
||||
Text Segmentation".
|
||||
clusters. The rules are defined in Unicode Standard Annex 29, "Unicode
|
||||
Text Segmentation". Unicode 11.0.0 abandoned the use of some previous
|
||||
properties that had been used for emojis. Instead it introduced vari-
|
||||
ous emoji-specific properties. PCRE2 uses only the Extended Picto-
|
||||
graphic property.
|
||||
|
||||
\X always matches at least one character. Then it decides whether to
|
||||
add additional characters according to the following rules for ending a
|
||||
|
@@ -6617,23 +6621,21 @@ BACKSLASH
 only by a T character.

 4. Do not end before extending characters or spacing marks or the
-"zero-width joiner" characters. Characters with the "mark" property
+"zero-width joiner" character. Characters with the "mark" property
 always have the "extend" grapheme breaking property.

 5. Do not end after prepend characters.

-6. Do not break within emoji modifier sequences (a base character fol-
-lowed by a modifier). Extending characters are allowed before the modi-
-fier.
+6. Do not break within emoji modifier sequences or emoji zwj sequences.
+That is, do not break between characters with the Extended_Pictographic
+property. Extend and ZWJ characters are allowed between the charac-
+ters.

-7. Do not break within emoji zwj sequences (zero-width joiner followed
-by "glue after ZWJ" or "base glue after ZWJ").
-
-8. Do not break within emoji flag sequences. That is, do not break
+7. Do not break within emoji flag sequences. That is, do not break
 between regional indicator (RI) characters if there are an odd number
 of RI characters before the break point.

-6. Otherwise, end the cluster.
+8. Otherwise, end the cluster.

 PCRE2's additional properties

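The flag rule in the hunk above (no break between regional indicator characters when an odd number of them precedes the break point) can be illustrated with a small Python sketch. It is not PCRE2 code: the helper names are hypothetical, it applies this one rule in isolation and ignores all the other cluster rules, and it assumes only that regional indicators occupy U+1F1E6..U+1F1FF.

```python
# Regional indicator (RI) symbols occupy U+1F1E6..U+1F1FF.
RI_FIRST, RI_LAST = 0x1F1E6, 0x1F1FF


def is_ri(ch):
    """True if ch is a regional indicator character."""
    return RI_FIRST <= ord(ch) <= RI_LAST


def flag_rule_allows_break(text, pos):
    """Apply only the flag rule to the boundary just before text[pos].

    Returns False when the boundary lies between two RI characters and an
    odd number of RI characters immediately precedes it, True otherwise.
    Hypothetical helper: every other cluster rule is ignored.
    """
    if not (0 < pos < len(text)):
        return True
    if not (is_ri(text[pos - 1]) and is_ri(text[pos])):
        return True
    count = 0
    i = pos - 1
    while i >= 0 and is_ri(text[i]):
        count += 1
        i -= 1
    return count % 2 == 0


def split_flags(text):
    """Split a run of RI characters into clusters using the flag rule alone."""
    clusters, start = [], 0
    for pos in range(1, len(text)):
        if flag_rule_allows_break(text, pos):
            clusters.append(text[start:pos])
            start = pos
    if text:
        clusters.append(text[start:])
    return clusters


# Four RI characters (F, R, D, E), i.e. two flag pairs.
sample = "\U0001F1EB\U0001F1F7\U0001F1E9\U0001F1EA"
print([len(cluster) for cluster in split_flags(sample)])
```

Counting backwards from the break point is what makes the RI characters pair up from the start of the run, so a fifth RI character would form a cluster on its own.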
@ -8941,7 +8943,7 @@ AUTHOR
|
|||
|
||||
REVISION
|
||||
|
||||
Last updated: 30 June 2018
|
||||
Last updated: 07 July 2018
|
||||
Copyright (c) 1997-2018 University of Cambridge.
|
||||
------------------------------------------------------------------------------
|
||||
|
||||
|
@ -9915,26 +9917,29 @@ PCRE2 SPECIAL CATEGORY PROPERTIES FOR \p and \P
|
|||
|
||||
SCRIPT NAMES FOR \p AND \P
|
||||
|
||||
Ahom, Anatolian_Hieroglyphs, Arabic, Armenian, Avestan, Balinese,
|
||||
Bamum, Bassa_Vah, Batak, Bengali, Bopomofo, Brahmi, Braille, Buginese,
|
||||
Buhid, Canadian_Aboriginal, Carian, Caucasian_Albanian, Chakma, Cham,
|
||||
Cherokee, Common, Coptic, Cuneiform, Cypriot, Cyrillic, Deseret,
|
||||
Devanagari, Duployan, Egyptian_Hieroglyphs, Elbasan, Ethiopic, Geor-
|
||||
gian, Glagolitic, Gothic, Grantha, Greek, Gujarati, Gurmukhi, Han,
|
||||
Hangul, Hanunoo, Hatran, Hebrew, Hiragana, Imperial_Aramaic, Inherited,
|
||||
Inscriptional_Pahlavi, Inscriptional_Parthian, Javanese, Kaithi, Kan-
|
||||
nada, Katakana, Kayah_Li, Kharoshthi, Khmer, Khojki, Khudawadi, Lao,
|
||||
Latin, Lepcha, Limbu, Linear_A, Linear_B, Lisu, Lycian, Lydian, Maha-
|
||||
jani, Malayalam, Mandaic, Manichaean, Meetei_Mayek, Mende_Kikakui,
|
||||
Meroitic_Cursive, Meroitic_Hieroglyphs, Miao, Modi, Mongolian, Mro,
|
||||
Multani, Myanmar, Nabataean, New_Tai_Lue, Nko, Ogham, Ol_Chiki,
|
||||
Old_Hungarian, Old_Italic, Old_North_Arabian, Old_Permic, Old_Persian,
|
||||
Old_South_Arabian, Old_Turkic, Oriya, Osmanya, Pahawh_Hmong, Palmyrene,
|
||||
Pau_Cin_Hau, Phags_Pa, Phoenician, Psalter_Pahlavi, Rejang, Runic,
|
||||
Samaritan, Saurashtra, Sharada, Shavian, Siddham, SignWriting, Sinhala,
|
||||
Sora_Sompeng, Sundanese, Syloti_Nagri, Syriac, Tagalog, Tagbanwa,
|
||||
Tai_Le, Tai_Tham, Tai_Viet, Takri, Tamil, Telugu, Thaana, Thai,
|
||||
Tibetan, Tifinagh, Tirhuta, Ugaritic, Vai, Warang_Citi, Yi.
|
||||
Adlam, Ahom, Anatolian_Hieroglyphs, Arabic, Armenian, Avestan, Bali-
|
||||
nese, Bamum, Bassa_Vah, Batak, Bengali, Bhaiksuki, Bopomofo, Brahmi,
|
||||
Braille, Buginese, Buhid, Canadian_Aboriginal, Carian, Caucasian_Alba-
|
||||
nian, Chakma, Cham, Cherokee, Common, Coptic, Cuneiform, Cypriot,
|
||||
Cyrillic, Deseret, Devanagari, Dogra, Duployan, Egyptian_Hieroglyphs,
|
||||
Elbasan, Ethiopic, Georgian, Glagolitic, Gothic, Grantha, Greek,
|
||||
Gujarati, Gunjala_Gondi, Gurmukhi, Han, Hangul, Hanifi_Rohingya,
|
||||
Hanunoo, Hatran, Hebrew, Hiragana, Imperial_Aramaic, Inherited,
|
||||
Inscriptional_Pahlavi, Inscriptional_Parthian, Javanese, Kaithi, Kan-
|
||||
nada, Katakana, Kayah_Li, Kharoshthi, Khmer, Khojki, Khudawadi, Lao,
|
||||
Latin, Lepcha, Limbu, Linear_A, Linear_B, Lisu, Lycian, Lydian, Maha-
|
||||
jani, Makasar, Malayalam, Mandaic, Manichaean, Marchen, Masaram_Gondi,
|
||||
Medefaidrin, Meetei_Mayek, Mende_Kikakui, Meroitic_Cursive,
|
||||
Meroitic_Hieroglyphs, Miao, Modi, Mongolian, Mro, Multani, Myanmar,
|
||||
Nabataean, New_Tai_Lue, Newa, Nko, Nushu, Ogham, Ol_Chiki, Old_Hungar-
|
||||
ian, Old_Italic, Old_North_Arabian, Old_Permic, Old_Persian, Old_Sog-
|
||||
dian, Old_South_Arabian, Old_Turkic, Oriya, Osage, Osmanya,
|
||||
Pahawh_Hmong, Palmyrene, Pau_Cin_Hau, Phags_Pa, Phoenician,
|
||||
Psalter_Pahlavi, Rejang, Runic, Samaritan, Saurashtra, Sharada, Sha-
|
||||
vian, Siddham, SignWriting, Sinhala, Sogdian, Sora_Sompeng, Soyombo,
|
||||
Sundanese, Syloti_Nagri, Syriac, Tagalog, Tagbanwa, Tai_Le, Tai_Tham,
|
||||
Tai_Viet, Takri, Tamil, Tangut, Telugu, Thaana, Thai, Tibetan, Tifi-
|
||||
nagh, Tirhuta, Ugaritic, Vai, Warang_Citi, Yi, Zanabazar_Square.
|
||||
|
||||
|
||||
CHARACTER CLASSES
|
||||
|
@ -9960,8 +9965,8 @@ CHARACTER CLASSES
|
|||
word same as \w
|
||||
xdigit hexadecimal digit
|
||||
|
||||
In PCRE2, POSIX character set names recognize only ASCII characters by
|
||||
default, but some of them use Unicode properties if PCRE2_UCP is set.
|
||||
In PCRE2, POSIX character set names recognize only ASCII characters by
|
||||
default, but some of them use Unicode properties if PCRE2_UCP is set.
|
||||
You can use \Q...\E inside a character class.
|
||||
|
||||
|
||||
|
@ -10047,8 +10052,8 @@ OPTION SETTING
|
|||
(?xx) as (?x) but also ignore space and tab in classes
|
||||
(?-...) unset option(s)
|
||||
|
||||
The following are recognized only at the very start of a pattern or
|
||||
after one of the newline or \R options with similar syntax. More than
|
||||
The following are recognized only at the very start of a pattern or
|
||||
after one of the newline or \R options with similar syntax. More than
|
||||
one of them may appear. For the first three, d is a decimal number.
|
||||
|
||||
(*LIMIT_DEPTH=d) set the backtracking limit to d
|
||||
|
@ -10063,17 +10068,17 @@ OPTION SETTING
|
|||
(*UTF) set appropriate UTF mode for the library in use
|
||||
(*UCP) set PCRE2_UCP (use Unicode properties for \d etc)
|
||||
|
||||
Note that LIMIT_DEPTH, LIMIT_HEAP, and LIMIT_MATCH can only reduce the
|
||||
value of the limits set by the caller of pcre2_match() or
|
||||
pcre2_dfa_match(), not increase them. LIMIT_RECURSION is an obsolete
|
||||
Note that LIMIT_DEPTH, LIMIT_HEAP, and LIMIT_MATCH can only reduce the
|
||||
value of the limits set by the caller of pcre2_match() or
|
||||
pcre2_dfa_match(), not increase them. LIMIT_RECURSION is an obsolete
|
||||
synonym for LIMIT_DEPTH. The application can lock out the use of (*UTF)
|
||||
and (*UCP) by setting the PCRE2_NEVER_UTF or PCRE2_NEVER_UCP options,
|
||||
and (*UCP) by setting the PCRE2_NEVER_UTF or PCRE2_NEVER_UCP options,
|
||||
respectively, at compile time.
|
||||
|
||||
|
||||
NEWLINE CONVENTION
|
||||
|
||||
These are recognized only at the very start of the pattern or after
|
||||
These are recognized only at the very start of the pattern or after
|
||||
option settings with a similar syntax.
|
||||
|
||||
(*CR) carriage return only
|
||||
|
@ -10086,7 +10091,7 @@ NEWLINE CONVENTION
|
|||
|
||||
WHAT \R MATCHES
|
||||
|
||||
These are recognized only at the very start of the pattern or after
|
||||
These are recognized only at the very start of the pattern or after
|
||||
option setting with a similar syntax.
|
||||
|
||||
(*BSR_ANYCRLF) CR, LF, or CRLF
|
||||
|
@ -10155,8 +10160,8 @@ CONDITIONAL PATTERNS
|
|||
(?(VERSION[>]=n.m) test PCRE2 version
|
||||
(?(assert) assertion condition
|
||||
|
||||
Note the ambiguity of (?(R) and (?(Rn) which might be named reference
|
||||
conditions or recursion tests. Such a condition is interpreted as a
|
||||
Note the ambiguity of (?(R) and (?(Rn) which might be named reference
|
||||
conditions or recursion tests. Such a condition is interpreted as a
|
||||
reference condition if the relevant named group exists.
|
||||
|
||||
|
||||
|
@ -10168,7 +10173,7 @@ BACKTRACKING CONTROL
|
|||
(*FAIL) force backtrack; synonym (*F)
|
||||
(*MARK:NAME) set name to be passed back; synonym (*:NAME)
|
||||
|
||||
The following act only when a subsequent match failure causes a back-
|
||||
The following act only when a subsequent match failure causes a back-
|
||||
track to reach them. They all force a match failure, but they differ in
|
||||
what happens afterwards. Those that advance the start-of-match point do
|
||||
so only if the pattern is not anchored.
|
||||
|
@ -10190,14 +10195,14 @@ CALLOUTS
|
|||
(?C"text") callout with string data
|
||||
|
||||
The allowed string delimiters are ` ' " ^ % # $ (which are the same for
|
||||
the start and the end), and the starting delimiter { matched with the
|
||||
ending delimiter }. To encode the ending delimiter within the string,
|
||||
the start and the end), and the starting delimiter { matched with the
|
||||
ending delimiter }. To encode the ending delimiter within the string,
|
||||
double it.
|
||||
|
||||
|
||||
SEE ALSO
|
||||
|
||||
pcre2pattern(3), pcre2api(3), pcre2callout(3), pcre2matching(3),
|
||||
pcre2pattern(3), pcre2api(3), pcre2callout(3), pcre2matching(3),
|
||||
pcre2(3).
|
||||
|
||||
|
||||
|
@ -10210,7 +10215,7 @@ AUTHOR
|
|||
|
||||
REVISION
|
||||
|
||||
Last updated: 28 June 2018
|
||||
Last updated: 07 July 2018
|
||||
Copyright (c) 1997-2018 University of Cambridge.
|
||||
------------------------------------------------------------------------------
|
||||
|
||||
|
|
|
@@ -1,4 +1,4 @@
-.TH PCRE2PATTERN 3 "30 June 2018" "PCRE2 10.32"
+.TH PCRE2PATTERN 3 "07 July 2018" "PCRE2 10.32"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH "PCRE2 REGULAR EXPRESSION DETAILS"
@@ -788,6 +788,7 @@ Cypriot,
 Cyrillic,
 Deseret,
 Devanagari,
+Dogra,
 Duployan,
 Egyptian_Hieroglyphs,
 Elbasan,
@@ -798,9 +799,11 @@ Gothic,
 Grantha,
 Greek,
 Gujarati,
+Gunjala_Gondi,
 Gurmukhi,
 Han,
 Hangul,
+Hanifi_Rohingya,
 Hanunoo,
 Hatran,
 Hebrew,
@@ -828,11 +831,13 @@ Lisu,
 Lycian,
 Lydian,
 Mahajani,
+Makasar,
 Malayalam,
 Mandaic,
 Manichaean,
 Marchen,
 Masaram_Gondi,
+Medefaidrin,
 Meetei_Mayek,
 Mende_Kikakui,
 Meroitic_Cursive,
@@ -855,6 +860,7 @@ Old_Italic,
 Old_North_Arabian,
 Old_Permic,
 Old_Persian,
+Old_Sogdian,
 Old_South_Arabian,
 Old_Turkic,
 Oriya,
@@ -875,6 +881,7 @@ Shavian,
 Siddham,
 SignWriting,
 Sinhala,
+Sogdian,
 Sora_Sompeng,
 Soyombo,
 Sundanese,
@@ -1003,7 +1010,10 @@ grapheme cluster", and treats the sequence as an atomic group
 Unicode supports various kinds of composite character by giving each character
 a grapheme breaking property, and having rules that use these properties to
 define the boundaries of extended grapheme clusters. The rules are defined in
-Unicode Standard Annex 29, "Unicode Text Segmentation".
+Unicode Standard Annex 29, "Unicode Text Segmentation". Unicode 11.0.0
+abandoned the use of some previous properties that had been used for emojis.
+Instead it introduced various emoji-specific properties. PCRE2 uses only the
+Extended Pictographic property.
 .P
 \eX always matches at least one character. Then it decides whether to add
 additional characters according to the following rules for ending a cluster:
@@ -1018,22 +1028,20 @@ L, V, LV, or LVT character; an LV or V character may be followed by a V or T
 character; an LVT or T character may be follwed only by a T character.
 .P
 4. Do not end before extending characters or spacing marks or the "zero-width
-joiner" characters. Characters with the "mark" property always have the
+joiner" character. Characters with the "mark" property always have the
 "extend" grapheme breaking property.
 .P
 5. Do not end after prepend characters.
 .P
-6. Do not break within emoji modifier sequences (a base character followed by a
-modifier). Extending characters are allowed before the modifier.
+6. Do not break within emoji modifier sequences or emoji zwj sequences. That
+is, do not break between characters with the Extended_Pictographic property.
+Extend and ZWJ characters are allowed between the characters.
 .P
-7. Do not break within emoji zwj sequences (zero-width joiner followed by
-"glue after ZWJ" or "base glue after ZWJ").
-.P
-8. Do not break within emoji flag sequences. That is, do not break between
+7. Do not break within emoji flag sequences. That is, do not break between
 regional indicator (RI) characters if there are an odd number of RI characters
 before the break point.
 .P
-6. Otherwise, end the cluster.
+8. Otherwise, end the cluster.
 .
 .
 .\" HTML <a name="extraprops"></a>
@@ -3517,6 +3525,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 30 June 2018
+Last updated: 07 July 2018
 Copyright (c) 1997-2018 University of Cambridge.
 .fi
@ -1,4 +1,4 @@
|
|||
.TH PCRE2SYNTAX 3 "28 June 2018" "PCRE2 10.32"
|
||||
.TH PCRE2SYNTAX 3 "07 July 2018" "PCRE2 10.32"
|
||||
.SH NAME
|
||||
PCRE2 - Perl-compatible regular expressions (revised API)
|
||||
.SH "PCRE2 REGULAR EXPRESSION SYNTAX SUMMARY"
|
||||
|
@ -160,6 +160,7 @@ at release 5.18.
|
|||
.SH "SCRIPT NAMES FOR \ep AND \eP"
|
||||
.rs
|
||||
.sp
|
||||
Adlam,
|
||||
Ahom,
|
||||
Anatolian_Hieroglyphs,
|
||||
Arabic,
|
||||
|
@ -170,6 +171,7 @@ Bamum,
|
|||
Bassa_Vah,
|
||||
Batak,
|
||||
Bengali,
|
||||
Bhaiksuki,
|
||||
Bopomofo,
|
||||
Brahmi,
|
||||
Braille,
|
||||
|
@ -188,6 +190,7 @@ Cypriot,
|
|||
Cyrillic,
|
||||
Deseret,
|
||||
Devanagari,
|
||||
Dogra,
|
||||
Duployan,
|
||||
Egyptian_Hieroglyphs,
|
||||
Elbasan,
|
||||
|
@ -198,9 +201,11 @@ Gothic,
|
|||
Grantha,
|
||||
Greek,
|
||||
Gujarati,
|
||||
Gunjala_Gondi,
|
||||
Gurmukhi,
|
||||
Han,
|
||||
Hangul,
|
||||
Hanifi_Rohingya,
|
||||
Hanunoo,
|
||||
Hatran,
|
||||
Hebrew,
|
||||
|
@ -228,9 +233,13 @@ Lisu,
|
|||
Lycian,
|
||||
Lydian,
|
||||
Mahajani,
|
||||
Makasar,
|
||||
Malayalam,
|
||||
Mandaic,
|
||||
Manichaean,
|
||||
Marchen,
|
||||
Masaram_Gondi,
|
||||
Medefaidrin,
|
||||
Meetei_Mayek,
|
||||
Mende_Kikakui,
|
||||
Meroitic_Cursive,
|
||||
|
@ -243,7 +252,9 @@ Multani,
|
|||
Myanmar,
|
||||
Nabataean,
|
||||
New_Tai_Lue,
|
||||
Newa,
|
||||
Nko,
|
||||
Nushu,
|
||||
Ogham,
|
||||
Ol_Chiki,
|
||||
Old_Hungarian,
|
||||
|
@ -251,9 +262,11 @@ Old_Italic,
|
|||
Old_North_Arabian,
|
||||
Old_Permic,
|
||||
Old_Persian,
|
||||
Old_Sogdian,
|
||||
Old_South_Arabian,
|
||||
Old_Turkic,
|
||||
Oriya,
|
||||
Osage,
|
||||
Osmanya,
|
||||
Pahawh_Hmong,
|
||||
Palmyrene,
|
||||
|
@ -270,7 +283,9 @@ Shavian,
|
|||
Siddham,
|
||||
SignWriting,
|
||||
Sinhala,
|
||||
Sogdian,
|
||||
Sora_Sompeng,
|
||||
Soyombo,
|
||||
Sundanese,
|
||||
Syloti_Nagri,
|
||||
Syriac,
|
||||
|
@ -281,6 +296,7 @@ Tai_Tham,
|
|||
Tai_Viet,
|
||||
Takri,
|
||||
Tamil,
|
||||
Tangut,
|
||||
Telugu,
|
||||
Thaana,
|
||||
Thai,
|
||||
|
@ -290,7 +306,8 @@ Tirhuta,
|
|||
Ugaritic,
|
||||
Vai,
|
||||
Warang_Citi,
|
||||
Yi.
|
||||
Yi,
|
||||
Zanabazar_Square.
|
||||
.
|
||||
.
|
||||
.SH "CHARACTER CLASSES"
|
||||
|
@ -589,6 +606,6 @@ Cambridge, England.
|
|||
.rs
|
||||
.sp
|
||||
.nf
|
||||
Last updated: 28 June 2018
|
||||
Last updated: 07 July 2018
|
||||
Copyright (c) 1997-2018 University of Cambridge.
|
||||
.fi
|
||||
|
|
|
@@ -24,6 +24,7 @@
 # Added script names for Unicode 7.0.0, 20-June-2014.
 # Added script names for Unicode 8.0.0, 19-June-2015.
 # Added script names for Unicode 10.0.0, 02-July-2017.
+# Added script names for Unicode 11.0.0, 03-July-2018.

 script_names = ['Arabic', 'Armenian', 'Bengali', 'Bopomofo', 'Braille', 'Buginese', 'Buhid', 'Canadian_Aboriginal', \
 'Cherokee', 'Common', 'Coptic', 'Cypriot', 'Cyrillic', 'Deseret', 'Devanagari', 'Ethiopic', 'Georgian', \
@@ -55,7 +56,10 @@ script_names = ['Arabic', 'Armenian', 'Bengali', 'Bopomofo', 'Braille', 'Bugines
 'SignWriting',
 # New for Unicode 10.0.0
 'Adlam', 'Bhaiksuki', 'Marchen', 'Newa', 'Osage', 'Tangut', 'Masaram_Gondi',
-'Nushu', 'Soyombo', 'Zanabazar_Square'
+'Nushu', 'Soyombo', 'Zanabazar_Square',
+# New for Unicode 11.0.0
+'Dogra', 'Gunjala_Gondi', 'Hanifi_Rohingya', 'Makasar', 'Medefaidrin',
+'Old_Sogdian', 'Sogdian'
 ]

 category_names = ['Cc', 'Cf', 'Cn', 'Co', 'Cs', 'Ll', 'Lm', 'Lo', 'Lt', 'Lu',
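The script_names hunk above adds a comma after 'Zanabazar_Square' before appending the Unicode 11.0.0 names. The short Python sketch below shows why that comma matters: adjacent string literals are concatenated, even across a comment line. The two lists are shortened, hypothetical stand-ins for the real one.

```python
# Shortened, hypothetical stand-ins for the real script_names list.
without_comma = [
    'Nushu', 'Soyombo', 'Zanabazar_Square'
    # New for Unicode 11.0.0
    'Dogra', 'Gunjala_Gondi'
]

with_comma = [
    'Nushu', 'Soyombo', 'Zanabazar_Square',
    # New for Unicode 11.0.0
    'Dogra', 'Gunjala_Gondi'
]

# Without the comma, two names silently merge into one list entry.
print(len(without_comma), without_comma[2])
print(len(with_comma), with_comma[2:4])
```

Python raises no error in the first case, so a missing separator of this kind would surface only as a wrong or missing script name in whatever the list is used to generate.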
@@ -7,7 +7,7 @@
 # This script was submitted to the PCRE project by Peter Kankowski as part of
 # the upgrading of Unicode property support. The new code speeds up property
 # matching many times. The script is for the use of PCRE maintainers, to
-# generate the pcre_ucd.c file that contains a digested form of the Unicode
+# generate the pcre2_ucd.c file that contains a digested form of the Unicode
 # data tables.
 #
 # The script has now been upgraded to Python 3 for PCRE2, and should be run in
@ -15,12 +15,18 @@
|
|||
#
|
||||
# [python3] ./MultiStage2.py >../src/pcre2_ucd.c
|
||||
#
|
||||
# It requires four Unicode data tables, DerivedGeneralCategory.txt,
|
||||
# GraphemeBreakProperty.txt, Scripts.txt, and CaseFolding.txt, to be in the
|
||||
# Unicode.tables subdirectory. The first of these is found in the "extracted"
|
||||
# subdirectory of the Unicode database (UCD) on the Unicode web site; the
|
||||
# second is in the "auxiliary" subdirectory; the other two are directly in the
|
||||
# UCD directory.
|
||||
# It requires five Unicode data tables: DerivedGeneralCategory.txt,
|
||||
# GraphemeBreakProperty.txt, Scripts.txt, CaseFolding.txt, and emoji-data.txt.
|
||||
# These must be in the maint/Unicode.tables subdirectory.
|
||||
#
|
||||
# DerivedGeneralCategory.txt is found in the "extracted" subdirectory of the
|
||||
# Unicode database (UCD) on the Unicode web site; GraphemeBreakProperty.txt is
|
||||
# in the "auxiliary" subdirectory. Scripts.txt and CaseFolding.txt are directly
|
||||
# in the UCD directory. The emoji-data.txt file is in files associated with
|
||||
# Unicode Technical Standard #51 ("Unicode Emoji"), for example:
|
||||
#
|
||||
# http://unicode.org/Public/emoji/11.0/emoji-data.txt
|
||||
#
|
||||
#
|
||||
# Minor modifications made to this script:
|
||||
# Added #! line at start
|
||||
|
@ -41,7 +47,8 @@
|
|||
# Added code to search for sets of more than two characters that must match
|
||||
# each other caselessly. A new table is output containing these sets, and
|
||||
# offsets into the table are added to the main output records. This new
|
||||
# code scans CaseFolding.txt instead of UnicodeData.txt.
|
||||
# code scans CaseFolding.txt instead of UnicodeData.txt, which is no longer
|
||||
# used.
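The idea of a caseless set with more than two members can be sketched as follows. This is a simplified illustration and not the script's own code: the foldings are hard-coded here (the script reads them from CaseFolding.txt), and the helper name caseless_sets is hypothetical.

```python
from collections import defaultdict

# A few simple foldings, hard-coded for illustration only.
SIMPLE_FOLDINGS = {
    0x0041: 0x0061,  # A -> a
    0x004B: 0x006B,  # K -> k
    0x212A: 0x006B,  # KELVIN SIGN -> k
}

def caseless_sets(foldings):
    """Group code points that share a fold target; keep groups larger than two."""
    groups = defaultdict(set)
    for source, target in foldings.items():
        groups[target].update((source, target))
    return sorted(sorted(group) for group in groups.values() if len(group) > 2)

sets = caseless_sets(SIMPLE_FOLDINGS)
print([[hex(c) for c in group] for group in sets])
```

Only k, K and the Kelvin sign form a set here; A and a are an ordinary pair and are handled by the "other case" offset instead.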
|
||||
#
|
||||
# Update for Python3:
|
||||
# . Processed with 2to3, but that didn't fix everything
|
||||
|
@ -50,6 +57,11 @@
|
|||
# . Inserted 'int' before blocksize/ELEMS_PER_LINE because an int is
|
||||
# required and the result of the division is a float
|
||||
#
|
||||
# Added code to scan the emoji-data.txt file to find the Extended Pictographic
|
||||
# property, which is used by PCRE2 as a grapheme breaking property. This was
|
||||
# done when updating to Unicode 11.0.0 (July 2018).
|
||||
#
|
||||
#
|
||||
# The main tables generated by this script are used by macros defined in
|
||||
# pcre2_internal.h. They look up Unicode character properties using short
|
||||
# sequences of code that contains no branches, which makes for greater speed.
|
||||
|
@ -75,13 +87,16 @@
|
|||
# table of "virtual" blocks; each block is indexed by the offset of a character
|
||||
# within its own block, and the result is the offset of the required record.
|
||||
#
|
||||
# The following examples are correct for the Unicode 11.0.0 database. Future
|
||||
# updates may change the actual lookup values.
|
||||
#
|
||||
# Example: lowercase "a" (U+0061) is in block 0
|
||||
# lookup 0 in stage1 table yields 0
|
||||
# lookup 97 in the first table in stage2 yields 16
|
||||
# record 17 is { 33, 5, 11, 0, -32 }
|
||||
# 33 = ucp_Latin => Latin script
|
||||
# 5 = ucp_Ll => Lower case letter
|
||||
# 11 = ucp_gbOther => Grapheme break property "Other"
|
||||
# 12 = ucp_gbOther => Grapheme break property "Other"
|
||||
# 0 => not part of a caseless set
|
||||
# -32 => Other case is U+0041
|
||||
#
|
||||
|
@ -90,12 +105,12 @@
|
|||
# example, k, K and the Kelvin symbol are such a set).
|
||||
#
|
||||
# Example: hiragana letter A (U+3042) is in block 96 (0x60)
|
||||
# lookup 96 in stage1 table yields 88
|
||||
# lookup 66 in the 88th table in stage2 yields 467
|
||||
# record 470 is { 26, 7, 11, 0, 0 }
|
||||
# lookup 96 in stage1 table yields 90
|
||||
# lookup 66 in the 90th table in stage2 yields 515
|
||||
# record 515 is { 26, 7, 11, 0, 0 }
|
||||
# 26 = ucp_Hiragana => Hiragana script
|
||||
# 7 = ucp_Lo => Other letter
|
||||
# 11 = ucp_gbOther => Grapheme break property "Other"
|
||||
# 12 = ucp_gbOther => Grapheme break property "Other"
|
||||
# 0 => not part of a caseless set
|
||||
# 0 => No other case
|
||||
#
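The arithmetic behind the two examples can be checked with a small sketch. The block size of 128 is an assumption inferred from the examples themselves (U+3042 lands in block 96 at offset 66), and the tables built here are toy data, not the real pcre2_ucd.c contents. For brevity the toy stage2 holds the final value directly, whereas in the real tables it holds the index of a record.

```python
BLOCK_SIZE = 128  # assumption inferred from the examples, not read from the script

def build_two_stage(values, block_size=BLOCK_SIZE):
    """Compress a flat per-character list into (stage1, stage2) tables."""
    stage1, stage2, seen = [], [], {}
    for start in range(0, len(values), block_size):
        block = tuple(values[start:start + block_size])
        if block not in seen:
            # First time this block is seen: store it and remember its number.
            seen[block] = len(stage2) // block_size
            stage2.extend(block)
        stage1.append(seen[block])
    return stage1, stage2

def lookup(stage1, stage2, c, block_size=BLOCK_SIZE):
    """Two table reads, no branches: block number first, then offset in block."""
    block, offset = divmod(c, block_size)
    return stage2[stage1[block] * block_size + offset]

# Toy data: value 0 everywhere except two made-up entries.
flat = [0] * 0x4000
flat[0x0061] = 1
flat[0x3042] = 2
stage1, stage2 = build_two_stage(flat)

print(divmod(0x0061, BLOCK_SIZE))      # (0, 97)
print(divmod(0x3042, BLOCK_SIZE))      # (96, 66)
print(lookup(stage1, stage2, 0x3042))  # 2
print(hex(0x0061 + (-32)))             # 0x41, the "other case" of "a"
```

Because identical blocks are stored only once, the 128 blocks of toy input collapse to three stored blocks in stage2.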
|
||||
|
@ -106,6 +121,8 @@
|
|||
# individual character types such as ucp_Cc to the general types like ucp_C.
|
||||
#
|
||||
# Philip Hazel, 03 July 2008
|
||||
# Last Updated: 07 July 2018
|
||||
#
|
||||
#
|
||||
# 01-March-2010: Updated list of scripts for Unicode 5.2.0
|
||||
# 30-April-2011: Updated list of scripts for Unicode 6.0.0
|
||||
|
@ -123,6 +140,9 @@
|
|||
# 12-August-2014: Updated to put Unicode version into the file
|
||||
# 19-June-2015: Updated for Unicode 8.0.0
|
||||
# 02-July-2017: Updated for Unicode 10.0.0
|
||||
# 03-July-2018: Updated for Unicode 11.0.0
|
||||
# 07-July-2018: Added code to scan emoji-data.txt for the Extended
|
||||
# Pictographic property.
|
||||
##############################################################################
|
||||
|
||||
|
||||
|
@ -339,16 +359,23 @@ script_names = ['Arabic', 'Armenian', 'Bengali', 'Bopomofo', 'Braille', 'Bugines
|
|||
'SignWriting',
|
||||
# New for Unicode 10.0.0
|
||||
'Adlam', 'Bhaiksuki', 'Marchen', 'Newa', 'Osage', 'Tangut', 'Masaram_Gondi',
|
||||
'Nushu', 'Soyombo', 'Zanabazar_Square'
|
||||
'Nushu', 'Soyombo', 'Zanabazar_Square',
|
||||
# New for Unicode 11.0.0
|
||||
'Dogra', 'Gunjala_Gondi', 'Hanifi_Rohingya', 'Makasar', 'Medefaidrin',
|
||||
'Old_Sogdian', 'Sogdian'
|
||||
]
|
||||
|
||||
category_names = ['Cc', 'Cf', 'Cn', 'Co', 'Cs', 'Ll', 'Lm', 'Lo', 'Lt', 'Lu',
|
||||
'Mc', 'Me', 'Mn', 'Nd', 'Nl', 'No', 'Pc', 'Pd', 'Pe', 'Pf', 'Pi', 'Po', 'Ps',
|
||||
'Sc', 'Sk', 'Sm', 'So', 'Zl', 'Zp', 'Zs' ]
|
||||
|
||||
# The Extended_Pictographic property is not found in the file where all the
|
||||
# others are (GraphemeBreakProperty.txt). It comes from the emoji-data.txt
|
||||
# file, but we list it here so that the name has the correct index value.
|
||||
|
||||
break_property_names = ['CR', 'LF', 'Control', 'Extend', 'Prepend',
|
||||
'SpacingMark', 'L', 'V', 'T', 'LV', 'LVT', 'Regional_Indicator', 'Other',
|
||||
'E_Base', 'E_Modifier', 'E_Base_GAZ', 'ZWJ', 'Glue_After_Zwj' ]
|
||||
'ZWJ', 'Extended_Pictographic' ]
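Since the script refers to properties by their position in this list, the effect of the change on the index values can be checked directly. The list below is copied from the added lines of this hunk; the assumption that these positions match the ucp_gb* values used by the C code rests on the "12 = ucp_gbOther" line in the script's header comment.

```python
break_property_names = ['CR', 'LF', 'Control', 'Extend', 'Prepend',
  'SpacingMark', 'L', 'V', 'T', 'LV', 'LVT', 'Regional_Indicator', 'Other',
  'ZWJ', 'Extended_Pictographic' ]

# Print each index the script would store for a given property name.
for name in ('Other', 'ZWJ', 'Extended_Pictographic'):
    print(name, break_property_names.index(name))
```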
|
||||
|
||||
test_record_size()
|
||||
unicode_version = ""
|
||||
|
@ -358,6 +385,35 @@ category = read_table('Unicode.tables/DerivedGeneralCategory.txt', make_get_name
|
|||
break_props = read_table('Unicode.tables/GraphemeBreakProperty.txt', make_get_names(break_property_names), break_property_names.index('Other'))
|
||||
other_case = read_table('Unicode.tables/CaseFolding.txt', get_other_case, 0)
|
||||
|
||||
# The grapheme breaking rules were changed for Unicode 11.0.0 (June 2018). Now
|
||||
# we need to find the Extended_Pictographic property for emoji characters. This
|
||||
# can be set as an additional grapheme break property, because the default for
|
||||
# all the emojis is "other". We scan the emoji-data.txt file and modify the
|
||||
# break-props table.
|
||||
|
||||
file = open('Unicode.tables/emoji-data.txt', 'r', encoding='utf-8')
|
||||
for line in file:
|
||||
line = re.sub(r'#.*', '', line)
|
||||
chardata = list(map(str.strip, line.split(';')))
|
||||
if len(chardata) <= 1:
|
||||
continue
|
||||
|
||||
if chardata[1] != "Extended_Pictographic":
|
||||
continue
|
||||
|
||||
m = re.match(r'([0-9a-fA-F]+)(\.\.([0-9a-fA-F]+))?$', chardata[0])
|
||||
char = int(m.group(1), 16)
|
||||
if m.group(3) is None:
|
||||
last = char
|
||||
else:
|
||||
last = int(m.group(3), 16)
|
||||
for i in range(char, last + 1):
|
||||
if break_props[i] != break_property_names.index('Other'):
|
||||
print("WARNING: Emoji 0x%x has break property %s, not 'Other'" %
|
||||
(i, break_property_names[break_props[i]]), file=sys.stderr)
|
||||
break_props[i] = break_property_names.index('Extended_Pictographic')
|
||||
file.close()
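The scan above can be restated as a small self-contained function, which makes it easier to test without the real data file. This is a sketch under stated assumptions: the function name, the numeric constants (taken from the positions in break_property_names) and the sample lines are all illustrative, and the sample lines imitate the emoji-data.txt layout without being quoted from it. The warning uses the % operator so that the placeholders are actually filled in.

```python
import re
import sys

OTHER, EXT_PICT = 12, 14  # assumed positions in break_property_names

RANGE_RE = re.compile(r'([0-9a-fA-F]+)(?:\.\.([0-9a-fA-F]+))?$')

def apply_extended_pictographic(lines, break_props):
    """Mark every code point listed as Extended_Pictographic in `lines`."""
    for line in lines:
        # Drop the trailing comment, then split "range ; property".
        fields = [f.strip() for f in re.sub(r'#.*', '', line).split(';')]
        if len(fields) <= 1 or fields[1] != 'Extended_Pictographic':
            continue
        m = RANGE_RE.match(fields[0])
        if m is None:
            continue  # malformed range: skipped in this sketch
        first = int(m.group(1), 16)
        last = int(m.group(2), 16) if m.group(2) else first
        for cp in range(first, last + 1):
            if break_props[cp] != OTHER:
                print("WARNING: Emoji 0x%x has break property %d, not 'Other'"
                      % (cp, break_props[cp]), file=sys.stderr)
            break_props[cp] = EXT_PICT
    return break_props

# Made-up sample lines in the same layout as emoji-data.txt.
sample = [
    "# a comment line\n",
    "\n",
    "0023 ; Emoji # not the property we want\n",
    "00A9 ; Extended_Pictographic # single code point\n",
    "00AE..00AF ; Extended_Pictographic # a range\n",
]
props = apply_extended_pictographic(sample, [OTHER] * 0x100)
print(props[0xA9], props[0xAE], props[0xAF], props[0x23])
```

With the real file, the lines would come from a file object opened in a with statement, so that it is closed even if parsing fails.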
|
||||
|
||||
|
||||
# This block of code was added by PH in September 2012. I am not a Python
|
||||
# programmer, so the style is probably dreadful, but it does the job. It scans
|
||||
|
@ -484,7 +540,7 @@ print("Instead, just supply small dummy tables. */")
|
|||
print()
|
||||
print("#ifndef SUPPORT_UNICODE")
|
||||
print("const ucd_record PRIV(ucd_records)[] = {{0,0,0,0,0 }};")
|
||||
print("const uint8_t PRIV(ucd_stage1)[] = {0};")
|
||||
print("const uint16_t PRIV(ucd_stage1)[] = {0};")
|
||||
print("const uint16_t PRIV(ucd_stage2)[] = {0};")
|
||||
print("const uint32_t PRIV(ucd_caseless_sets)[] = {0};")
|
||||
print("#else")
|
||||
|
|
20
maint/README
|
@ -23,7 +23,7 @@ GenerateUtt.py A Python script to generate part of the pcre2_tables.c file
|
|||
ManyConfigTests A shell script that runs "configure, make, test" a number of
|
||||
times with different configuration settings.
|
||||
|
||||
MultiStage2.py A Python script that generates the file pcre2_ucd.c from three
|
||||
MultiStage2.py A Python script that generates the file pcre2_ucd.c from five
|
||||
Unicode data tables, which are themselves downloaded from the
|
||||
Unicode web site. Run this script in the "maint" directory.
|
||||
The generated file contains the tables for a 2-stage lookup
|
||||
|
@ -37,11 +37,17 @@ pcre2_chartables.c.non-standard
|
|||
|
||||
README This file.
|
||||
|
||||
Unicode.tables The files in this directory (CaseFolding.txt,
|
||||
DerivedGeneralCategory.txt, GraphemeBreakProperty.txt,
|
||||
Scripts.txt and UnicodeData.txt) were downloaded from the
|
||||
Unicode web site. They contain information about Unicode
|
||||
characters and scripts.
|
||||
Unicode.tables The files in this directory were downloaded from the Unicode
|
||||
web site. They contain information about Unicode characters
|
||||
and scripts. The ones used by the MultiStage2.py script are
|
||||
CaseFolding.txt, DerivedGeneralCategory.txt, Scripts.txt,
|
||||
GraphemeBreakProperty.txt, and emoji-data.txt. I've kept
|
||||
UnicodeData.txt (which is no longer used by the script)
|
||||
because it is useful occasionally for manually looking up the
|
||||
details of certain characters. However, note that character
|
||||
names in this file such as "Arabic sign sanah" do NOT mean
|
||||
that the character is in a particular script (in this case,
|
||||
Arabic). Scripts.txt is where to look for script information.
|
||||
|
||||
ucptest.c A short C program for testing the Unicode property macros
|
||||
that do lookups in the pcre2_ucd.c data, mainly useful after
|
||||
|
@ -359,4 +365,4 @@ very sensible; some are rather wacky. Some have been on this list for years.
|
|||
Philip Hazel
|
||||
Email local part: ph10
|
||||
Email domain: cam.ac.uk
|
||||
Last updated: 20 May 2017
|
||||
Last updated: 07 July 2018
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
# CaseFolding-10.0.0.txt
|
||||
# Date: 2017-04-14, 05:40:18 GMT
|
||||
# © 2017 Unicode®, Inc.
|
||||
# CaseFolding-11.0.0.txt
|
||||
# Date: 2018-01-31, 08:20:09 GMT
|
||||
# © 2018 Unicode®, Inc.
|
||||
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
|
||||
# For terms of use, see http://www.unicode.org/terms_of_use.html
|
||||
#
|
||||
|
@ -603,6 +603,52 @@
|
|||
1C86; C; 044A; # CYRILLIC SMALL LETTER TALL HARD SIGN
|
||||
1C87; C; 0463; # CYRILLIC SMALL LETTER TALL YAT
|
||||
1C88; C; A64B; # CYRILLIC SMALL LETTER UNBLENDED UK
|
||||
1C90; C; 10D0; # GEORGIAN MTAVRULI CAPITAL LETTER AN
|
||||
1C91; C; 10D1; # GEORGIAN MTAVRULI CAPITAL LETTER BAN
|
||||
1C92; C; 10D2; # GEORGIAN MTAVRULI CAPITAL LETTER GAN
|
||||
1C93; C; 10D3; # GEORGIAN MTAVRULI CAPITAL LETTER DON
|
||||
1C94; C; 10D4; # GEORGIAN MTAVRULI CAPITAL LETTER EN
|
||||
1C95; C; 10D5; # GEORGIAN MTAVRULI CAPITAL LETTER VIN
|
||||
1C96; C; 10D6; # GEORGIAN MTAVRULI CAPITAL LETTER ZEN
|
||||
1C97; C; 10D7; # GEORGIAN MTAVRULI CAPITAL LETTER TAN
|
||||
1C98; C; 10D8; # GEORGIAN MTAVRULI CAPITAL LETTER IN
|
||||
1C99; C; 10D9; # GEORGIAN MTAVRULI CAPITAL LETTER KAN
|
||||
1C9A; C; 10DA; # GEORGIAN MTAVRULI CAPITAL LETTER LAS
|
||||
1C9B; C; 10DB; # GEORGIAN MTAVRULI CAPITAL LETTER MAN
|
||||
1C9C; C; 10DC; # GEORGIAN MTAVRULI CAPITAL LETTER NAR
|
||||
1C9D; C; 10DD; # GEORGIAN MTAVRULI CAPITAL LETTER ON
|
||||
1C9E; C; 10DE; # GEORGIAN MTAVRULI CAPITAL LETTER PAR
|
||||
1C9F; C; 10DF; # GEORGIAN MTAVRULI CAPITAL LETTER ZHAR
|
||||
1CA0; C; 10E0; # GEORGIAN MTAVRULI CAPITAL LETTER RAE
|
||||
1CA1; C; 10E1; # GEORGIAN MTAVRULI CAPITAL LETTER SAN
|
||||
1CA2; C; 10E2; # GEORGIAN MTAVRULI CAPITAL LETTER TAR
|
||||
1CA3; C; 10E3; # GEORGIAN MTAVRULI CAPITAL LETTER UN
|
||||
1CA4; C; 10E4; # GEORGIAN MTAVRULI CAPITAL LETTER PHAR
|
||||
1CA5; C; 10E5; # GEORGIAN MTAVRULI CAPITAL LETTER KHAR
|
||||
1CA6; C; 10E6; # GEORGIAN MTAVRULI CAPITAL LETTER GHAN
|
||||
1CA7; C; 10E7; # GEORGIAN MTAVRULI CAPITAL LETTER QAR
|
||||
1CA8; C; 10E8; # GEORGIAN MTAVRULI CAPITAL LETTER SHIN
|
||||
1CA9; C; 10E9; # GEORGIAN MTAVRULI CAPITAL LETTER CHIN
|
||||
1CAA; C; 10EA; # GEORGIAN MTAVRULI CAPITAL LETTER CAN
|
||||
1CAB; C; 10EB; # GEORGIAN MTAVRULI CAPITAL LETTER JIL
|
||||
1CAC; C; 10EC; # GEORGIAN MTAVRULI CAPITAL LETTER CIL
|
||||
1CAD; C; 10ED; # GEORGIAN MTAVRULI CAPITAL LETTER CHAR
|
||||
1CAE; C; 10EE; # GEORGIAN MTAVRULI CAPITAL LETTER XAN
|
||||
1CAF; C; 10EF; # GEORGIAN MTAVRULI CAPITAL LETTER JHAN
|
||||
1CB0; C; 10F0; # GEORGIAN MTAVRULI CAPITAL LETTER HAE
|
||||
1CB1; C; 10F1; # GEORGIAN MTAVRULI CAPITAL LETTER HE
|
||||
1CB2; C; 10F2; # GEORGIAN MTAVRULI CAPITAL LETTER HIE
|
||||
1CB3; C; 10F3; # GEORGIAN MTAVRULI CAPITAL LETTER WE
|
||||
1CB4; C; 10F4; # GEORGIAN MTAVRULI CAPITAL LETTER HAR
|
||||
1CB5; C; 10F5; # GEORGIAN MTAVRULI CAPITAL LETTER HOE
|
||||
1CB6; C; 10F6; # GEORGIAN MTAVRULI CAPITAL LETTER FI
|
||||
1CB7; C; 10F7; # GEORGIAN MTAVRULI CAPITAL LETTER YN
|
||||
1CB8; C; 10F8; # GEORGIAN MTAVRULI CAPITAL LETTER ELIFI
|
||||
1CB9; C; 10F9; # GEORGIAN MTAVRULI CAPITAL LETTER TURNED GAN
|
||||
1CBA; C; 10FA; # GEORGIAN MTAVRULI CAPITAL LETTER AIN
|
||||
1CBD; C; 10FD; # GEORGIAN MTAVRULI CAPITAL LETTER AEN
|
||||
1CBE; C; 10FE; # GEORGIAN MTAVRULI CAPITAL LETTER HARD SIGN
|
||||
1CBF; C; 10FF; # GEORGIAN MTAVRULI CAPITAL LETTER LABIAL SIGN
|
||||
1E00; C; 1E01; # LATIN CAPITAL LETTER A WITH RING BELOW
|
||||
1E02; C; 1E03; # LATIN CAPITAL LETTER B WITH DOT ABOVE
|
||||
1E04; C; 1E05; # LATIN CAPITAL LETTER B WITH DOT BELOW
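Each CaseFolding.txt line has the form "code; status; mapping; # name". The following sketch parses such a line and derives a signed offset of the kind the script's header comment shows for "a" (-32). The helper name parse_case_folding is hypothetical, and how the script's own get_other_case handles these lines is not shown in this diff.

```python
def parse_case_folding(line):
    """Return (code, status, mapping) or None for comment and blank lines."""
    data = line.split('#', 1)[0].strip()
    if not data:
        return None
    code, status, mapping = [f.strip() for f in data.split(';')][:3]
    # Full foldings (status F) may map to several code points, hence a list.
    return int(code, 16), status, [int(x, 16) for x in mapping.split()]

entry = parse_case_folding(
    "1C90; C; 10D0; # GEORGIAN MTAVRULI CAPITAL LETTER AN")
code, status, mapping = entry
offset = mapping[0] - code
print(hex(code), status, hex(mapping[0]), offset)
```

All of the Mtavruli lines added in this hunk share the same offset, since 0x1CBF maps to 0x10FF just as 0x1C90 maps to 0x10D0.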
|
||||
|
@ -1180,6 +1226,7 @@ A7B2; C; 029D; # LATIN CAPITAL LETTER J WITH CROSSED-TAIL
|
|||
A7B3; C; AB53; # LATIN CAPITAL LETTER CHI
|
||||
A7B4; C; A7B5; # LATIN CAPITAL LETTER BETA
|
||||
A7B6; C; A7B7; # LATIN CAPITAL LETTER OMEGA
|
||||
A7B8; C; A7B9; # LATIN CAPITAL LETTER U WITH STROKE
|
||||
AB70; C; 13A0; # CHEROKEE SMALL LETTER A
|
||||
AB71; C; 13A1; # CHEROKEE SMALL LETTER E
|
||||
AB72; C; 13A2; # CHEROKEE SMALL LETTER I
|
||||
|
@ -1457,6 +1504,38 @@ FF3A; C; FF5A; # FULLWIDTH LATIN CAPITAL LETTER Z
|
|||
118BD; C; 118DD; # WARANG CITI CAPITAL LETTER SSUU
|
||||
118BE; C; 118DE; # WARANG CITI CAPITAL LETTER SII
|
||||
118BF; C; 118DF; # WARANG CITI CAPITAL LETTER VIYO
|
||||
16E40; C; 16E60; # MEDEFAIDRIN CAPITAL LETTER M
|
||||
16E41; C; 16E61; # MEDEFAIDRIN CAPITAL LETTER S
|
||||
16E42; C; 16E62; # MEDEFAIDRIN CAPITAL LETTER V
|
||||
16E43; C; 16E63; # MEDEFAIDRIN CAPITAL LETTER W
|
||||
16E44; C; 16E64; # MEDEFAIDRIN CAPITAL LETTER ATIU
|
||||
16E45; C; 16E65; # MEDEFAIDRIN CAPITAL LETTER Z
|
||||
16E46; C; 16E66; # MEDEFAIDRIN CAPITAL LETTER KP
|
||||
16E47; C; 16E67; # MEDEFAIDRIN CAPITAL LETTER P
|
||||
16E48; C; 16E68; # MEDEFAIDRIN CAPITAL LETTER T
|
||||
16E49; C; 16E69; # MEDEFAIDRIN CAPITAL LETTER G
|
||||
16E4A; C; 16E6A; # MEDEFAIDRIN CAPITAL LETTER F
|
||||
16E4B; C; 16E6B; # MEDEFAIDRIN CAPITAL LETTER I
|
||||
16E4C; C; 16E6C; # MEDEFAIDRIN CAPITAL LETTER K
|
||||
16E4D; C; 16E6D; # MEDEFAIDRIN CAPITAL LETTER A
|
||||
16E4E; C; 16E6E; # MEDEFAIDRIN CAPITAL LETTER J
|
||||
16E4F; C; 16E6F; # MEDEFAIDRIN CAPITAL LETTER E
|
||||
16E50; C; 16E70; # MEDEFAIDRIN CAPITAL LETTER B
|
||||
16E51; C; 16E71; # MEDEFAIDRIN CAPITAL LETTER C
|
||||
16E52; C; 16E72; # MEDEFAIDRIN CAPITAL LETTER U
|
||||
16E53; C; 16E73; # MEDEFAIDRIN CAPITAL LETTER YU
|
||||
16E54; C; 16E74; # MEDEFAIDRIN CAPITAL LETTER L
|
||||
16E55; C; 16E75; # MEDEFAIDRIN CAPITAL LETTER Q
|
||||
16E56; C; 16E76; # MEDEFAIDRIN CAPITAL LETTER HP
|
||||
16E57; C; 16E77; # MEDEFAIDRIN CAPITAL LETTER NY
|
||||
16E58; C; 16E78; # MEDEFAIDRIN CAPITAL LETTER X
|
||||
16E59; C; 16E79; # MEDEFAIDRIN CAPITAL LETTER D
|
||||
16E5A; C; 16E7A; # MEDEFAIDRIN CAPITAL LETTER OE
|
||||
16E5B; C; 16E7B; # MEDEFAIDRIN CAPITAL LETTER N
|
||||
16E5C; C; 16E7C; # MEDEFAIDRIN CAPITAL LETTER R
|
||||
16E5D; C; 16E7D; # MEDEFAIDRIN CAPITAL LETTER O
|
||||
16E5E; C; 16E7E; # MEDEFAIDRIN CAPITAL LETTER AI
|
||||
16E5F; C; 16E7F; # MEDEFAIDRIN CAPITAL LETTER Y
|
||||
1E900; C; 1E922; # ADLAM CAPITAL LETTER ALIF
|
||||
1E901; C; 1E923; # ADLAM CAPITAL LETTER DAALI
|
||||
1E902; C; 1E924; # ADLAM CAPITAL LETTER LAAM
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
# DerivedGeneralCategory-10.0.0.txt
|
||||
# Date: 2017-03-08, 08:41:49 GMT
|
||||
# © 2017 Unicode®, Inc.
|
||||
# DerivedGeneralCategory-11.0.0.txt
|
||||
# Date: 2018-02-21, 05:34:04 GMT
|
||||
# © 2018 Unicode®, Inc.
|
||||
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
|
||||
# For terms of use, see http://www.unicode.org/terms_of_use.html
|
||||
#
|
||||
|
@ -22,25 +22,23 @@
|
|||
03A2 ; Cn # <reserved-03A2>
|
||||
0530 ; Cn # <reserved-0530>
|
||||
0557..0558 ; Cn # [2] <reserved-0557>..<reserved-0558>
|
||||
0560 ; Cn # <reserved-0560>
|
||||
0588 ; Cn # <reserved-0588>
|
||||
058B..058C ; Cn # [2] <reserved-058B>..<reserved-058C>
|
||||
0590 ; Cn # <reserved-0590>
|
||||
05C8..05CF ; Cn # [8] <reserved-05C8>..<reserved-05CF>
|
||||
05EB..05EF ; Cn # [5] <reserved-05EB>..<reserved-05EF>
|
||||
05EB..05EE ; Cn # [4] <reserved-05EB>..<reserved-05EE>
|
||||
05F5..05FF ; Cn # [11] <reserved-05F5>..<reserved-05FF>
|
||||
061D ; Cn # <reserved-061D>
|
||||
070E ; Cn # <reserved-070E>
|
||||
074B..074C ; Cn # [2] <reserved-074B>..<reserved-074C>
|
||||
07B2..07BF ; Cn # [14] <reserved-07B2>..<reserved-07BF>
|
||||
07FB..07FF ; Cn # [5] <reserved-07FB>..<reserved-07FF>
|
||||
07FB..07FC ; Cn # [2] <reserved-07FB>..<reserved-07FC>
|
||||
082E..082F ; Cn # [2] <reserved-082E>..<reserved-082F>
|
||||
083F ; Cn # <reserved-083F>
|
||||
085C..085D ; Cn # [2] <reserved-085C>..<reserved-085D>
|
||||
085F ; Cn # <reserved-085F>
|
||||
086B..089F ; Cn # [53] <reserved-086B>..<reserved-089F>
|
||||
08B5 ; Cn # <reserved-08B5>
|
||||
08BE..08D3 ; Cn # [22] <reserved-08BE>..<reserved-08D3>
|
||||
08BE..08D2 ; Cn # [21] <reserved-08BE>..<reserved-08D2>
|
||||
0984 ; Cn # <reserved-0984>
|
||||
098D..098E ; Cn # [2] <reserved-098D>..<reserved-098E>
|
||||
0991..0992 ; Cn # [2] <reserved-0991>..<reserved-0992>
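The bracketed number in each range line is the size of the range, which gives a cheap consistency check when reading these hunks. The sketch below is not part of MultiStage2.py; the regular expression and function name are assumptions made for illustration.

```python
import re

LINE_RE = re.compile(
    r'([0-9A-F]+)(?:\.\.([0-9A-F]+))?\s*;\s*(\w+)\s*#\s*(?:\[(\d+)\])?')

def parse_category_line(line):
    """Return (first, last, category, declared_count) or None if no match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    first = int(m.group(1), 16)
    last = int(m.group(2), 16) if m.group(2) else first
    declared = int(m.group(4)) if m.group(4) else 1
    return first, last, m.group(3), declared

old = parse_category_line("05EB..05EF ; Cn # [5] <reserved-05EB>..<reserved-05EF>")
new = parse_category_line("05EB..05EE ; Cn # [4] <reserved-05EB>..<reserved-05EE>")
for first, last, category, declared in (old, new):
    print(category, hex(first), hex(last), last - first + 1 == declared)
```

Comparing the two results shows the reserved range shrinking by one code point, so U+05EF is no longer listed as unassigned (Cn) in the 11.0.0 file.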
|
||||
|
@ -54,7 +52,7 @@
|
|||
09D8..09DB ; Cn # [4] <reserved-09D8>..<reserved-09DB>
|
||||
09DE ; Cn # <reserved-09DE>
|
||||
09E4..09E5 ; Cn # [2] <reserved-09E4>..<reserved-09E5>
|
||||
09FE..0A00 ; Cn # [3] <reserved-09FE>..<reserved-0A00>
|
||||
09FF..0A00 ; Cn # [2] <reserved-09FF>..<reserved-0A00>
|
||||
0A04 ; Cn # <reserved-0A04>
|
||||
0A0B..0A0E ; Cn # [4] <reserved-0A0B>..<reserved-0A0E>
|
||||
0A11..0A12 ; Cn # [2] <reserved-0A11>..<reserved-0A12>
|
||||
|
@ -70,7 +68,7 @@
|
|||
0A52..0A58 ; Cn # [7] <reserved-0A52>..<reserved-0A58>
|
||||
0A5D ; Cn # <reserved-0A5D>
|
||||
0A5F..0A65 ; Cn # [7] <reserved-0A5F>..<reserved-0A65>
|
||||
0A76..0A80 ; Cn # [11] <reserved-0A76>..<reserved-0A80>
|
||||
0A77..0A80 ; Cn # [10] <reserved-0A77>..<reserved-0A80>
|
||||
0A84 ; Cn # <reserved-0A84>
|
||||
0A8E ; Cn # <reserved-0A8E>
|
||||
0A92 ; Cn # <reserved-0A92>
|
||||
|
@ -115,7 +113,6 @@
|
|||
0BD1..0BD6 ; Cn # [6] <reserved-0BD1>..<reserved-0BD6>
|
||||
0BD8..0BE5 ; Cn # [14] <reserved-0BD8>..<reserved-0BE5>
|
||||
0BFB..0BFF ; Cn # [5] <reserved-0BFB>..<reserved-0BFF>
|
||||
0C04 ; Cn # <reserved-0C04>
|
||||
0C0D ; Cn # <reserved-0C0D>
|
||||
0C11 ; Cn # <reserved-0C11>
|
||||
0C29 ; Cn # <reserved-0C29>
|
||||
|
@ -127,7 +124,6 @@
|
|||
0C5B..0C5F ; Cn # [5] <reserved-0C5B>..<reserved-0C5F>
|
||||
0C64..0C65 ; Cn # [2] <reserved-0C64>..<reserved-0C65>
|
||||
0C70..0C77 ; Cn # [8] <reserved-0C70>..<reserved-0C77>
|
||||
0C84 ; Cn # <reserved-0C84>
|
||||
0C8D ; Cn # <reserved-0C8D>
|
||||
0C91 ; Cn # <reserved-0C91>
|
||||
0CA9 ; Cn # <reserved-0CA9>
|
||||
|
@ -224,7 +220,7 @@
|
|||
17FA..17FF ; Cn # [6] <reserved-17FA>..<reserved-17FF>
|
||||
180F ; Cn # <reserved-180F>
|
||||
181A..181F ; Cn # [6] <reserved-181A>..<reserved-181F>
|
||||
1878..187F ; Cn # [8] <reserved-1878>..<reserved-187F>
|
||||
1879..187F ; Cn # [7] <reserved-1879>..<reserved-187F>
|
||||
18AB..18AF ; Cn # [5] <reserved-18AB>..<reserved-18AF>
|
||||
18F6..18FF ; Cn # [10] <reserved-18F6>..<reserved-18FF>
|
||||
191F ; Cn # <reserved-191F>
|
||||
|
@ -248,7 +244,8 @@
|
|||
1BF4..1BFB ; Cn # [8] <reserved-1BF4>..<reserved-1BFB>
|
||||
1C38..1C3A ; Cn # [3] <reserved-1C38>..<reserved-1C3A>
|
||||
1C4A..1C4C ; Cn # [3] <reserved-1C4A>..<reserved-1C4C>
|
||||
1C89..1CBF ; Cn # [55] <reserved-1C89>..<reserved-1CBF>
|
||||
1C89..1C8F ; Cn # [7] <reserved-1C89>..<reserved-1C8F>
|
||||
1CBB..1CBC ; Cn # [2] <reserved-1CBB>..<reserved-1CBC>
|
||||
1CC8..1CCF ; Cn # [8] <reserved-1CC8>..<reserved-1CCF>
|
||||
1CFA..1CFF ; Cn # [6] <reserved-1CFA>..<reserved-1CFF>
|
||||
1DFA ; Cn # <reserved-1DFA>
|
||||
|
@ -279,10 +276,8 @@
|
|||
244B..245F ; Cn # [21] <reserved-244B>..<reserved-245F>
|
||||
2B74..2B75 ; Cn # [2] <reserved-2B74>..<reserved-2B75>
|
||||
2B96..2B97 ; Cn # [2] <reserved-2B96>..<reserved-2B97>
|
||||
2BBA..2BBC ; Cn # [3] <reserved-2BBA>..<reserved-2BBC>
|
||||
2BC9 ; Cn # <reserved-2BC9>
|
||||
2BD3..2BEB ; Cn # [25] <reserved-2BD3>..<reserved-2BEB>
|
||||
2BF0..2BFF ; Cn # [16] <reserved-2BF0>..<reserved-2BFF>
|
||||
2BFF ; Cn # <reserved-2BFF>
|
||||
2C2F ; Cn # <reserved-2C2F>
|
||||
2C5F ; Cn # <reserved-2C5F>
|
||||
2CF4..2CF8 ; Cn # [5] <reserved-2CF4>..<reserved-2CF8>
|
||||
|
@ -300,7 +295,7 @@
|
|||
2DCF ; Cn # <reserved-2DCF>
|
||||
2DD7 ; Cn # <reserved-2DD7>
|
||||
2DDF ; Cn # <reserved-2DDF>
|
||||
2E4A..2E7F ; Cn # [54] <reserved-2E4A>..<reserved-2E7F>
|
||||
2E4F..2E7F ; Cn # [49] <reserved-2E4F>..<reserved-2E7F>
|
||||
2E9A ; Cn # <reserved-2E9A>
|
||||
2EF4..2EFF ; Cn # [12] <reserved-2EF4>..<reserved-2EFF>
|
||||
2FD6..2FEF ; Cn # [26] <reserved-2FD6>..<reserved-2FEF>
|
||||
|
@ -308,26 +303,24 @@
|
|||
3040 ; Cn # <reserved-3040>
|
||||
3097..3098 ; Cn # [2] <reserved-3097>..<reserved-3098>
|
||||
3100..3104 ; Cn # [5] <reserved-3100>..<reserved-3104>
|
||||
312F..3130 ; Cn # [2] <reserved-312F>..<reserved-3130>
|
||||
3130 ; Cn # <reserved-3130>
|
||||
318F ; Cn # <reserved-318F>
|
||||
31BB..31BF ; Cn # [5] <reserved-31BB>..<reserved-31BF>
|
||||
31E4..31EF ; Cn # [12] <reserved-31E4>..<reserved-31EF>
|
||||
321F ; Cn # <reserved-321F>
|
||||
32FF ; Cn # <reserved-32FF>
|
||||
4DB6..4DBF ; Cn # [10] <reserved-4DB6>..<reserved-4DBF>
|
||||
9FEB..9FFF ; Cn # [21] <reserved-9FEB>..<reserved-9FFF>
|
||||
9FF0..9FFF ; Cn # [16] <reserved-9FF0>..<reserved-9FFF>
|
||||
A48D..A48F ; Cn # [3] <reserved-A48D>..<reserved-A48F>
|
||||
A4C7..A4CF ; Cn # [9] <reserved-A4C7>..<reserved-A4CF>
|
||||
A62C..A63F ; Cn # [20] <reserved-A62C>..<reserved-A63F>
|
||||
A6F8..A6FF ; Cn # [8] <reserved-A6F8>..<reserved-A6FF>
|
||||
A7AF ; Cn # <reserved-A7AF>
|
||||
A7B8..A7F6 ; Cn # [63] <reserved-A7B8>..<reserved-A7F6>
|
||||
A7BA..A7F6 ; Cn # [61] <reserved-A7BA>..<reserved-A7F6>
|
||||
A82C..A82F ; Cn # [4] <reserved-A82C>..<reserved-A82F>
|
||||
A83A..A83F ; Cn # [6] <reserved-A83A>..<reserved-A83F>
|
||||
A878..A87F ; Cn # [8] <reserved-A878>..<reserved-A87F>
|
||||
A8C6..A8CD ; Cn # [8] <reserved-A8C6>..<reserved-A8CD>
|
||||
A8DA..A8DF ; Cn # [6] <reserved-A8DA>..<reserved-A8DF>
|
||||
A8FE..A8FF ; Cn # [2] <reserved-A8FE>..<reserved-A8FF>
|
||||
A954..A95E ; Cn # [11] <reserved-A954>..<reserved-A95E>
|
||||
A97D..A97F ; Cn # [3] <reserved-A97D>..<reserved-A97F>
|
||||
A9CE ; Cn # <reserved-A9CE>
|
||||
|
@ -429,9 +422,9 @@ FFFE..FFFF ; Cn # [2] <noncharacter-FFFE>..<noncharacter-FFFF>
|
|||
10A07..10A0B ; Cn # [5] <reserved-10A07>..<reserved-10A0B>
|
||||
10A14 ; Cn # <reserved-10A14>
|
||||
10A18 ; Cn # <reserved-10A18>
|
||||
10A34..10A37 ; Cn # [4] <reserved-10A34>..<reserved-10A37>
|
||||
10A36..10A37 ; Cn # [2] <reserved-10A36>..<reserved-10A37>
|
||||
10A3B..10A3E ; Cn # [4] <reserved-10A3B>..<reserved-10A3E>
|
||||
10A48..10A4F ; Cn # [8] <reserved-10A48>..<reserved-10A4F>
|
||||
10A49..10A4F ; Cn # [7] <reserved-10A49>..<reserved-10A4F>
|
||||
10A59..10A5F ; Cn # [7] <reserved-10A59>..<reserved-10A5F>
|
||||
10AA0..10ABF ; Cn # [32] <reserved-10AA0>..<reserved-10ABF>
|
||||
10AE7..10AEA ; Cn # [4] <reserved-10AE7>..<reserved-10AEA>
|
||||
|
@ -445,15 +438,19 @@ FFFE..FFFF ; Cn # [2] <noncharacter-FFFE>..<noncharacter-FFFF>
|
|||
10C49..10C7F ; Cn # [55] <reserved-10C49>..<reserved-10C7F>
|
||||
10CB3..10CBF ; Cn # [13] <reserved-10CB3>..<reserved-10CBF>
|
||||
10CF3..10CF9 ; Cn # [7] <reserved-10CF3>..<reserved-10CF9>
|
||||
10D00..10E5F ; Cn # [352] <reserved-10D00>..<reserved-10E5F>
|
||||
10E7F..10FFF ; Cn # [385] <reserved-10E7F>..<reserved-10FFF>
|
||||
10D28..10D2F ; Cn # [8] <reserved-10D28>..<reserved-10D2F>
|
||||
10D3A..10E5F ; Cn # [294] <reserved-10D3A>..<reserved-10E5F>
|
||||
10E7F..10EFF ; Cn # [129] <reserved-10E7F>..<reserved-10EFF>
|
||||
10F28..10F2F ; Cn # [8] <reserved-10F28>..<reserved-10F2F>
|
||||
10F5A..10FFF ; Cn # [166] <reserved-10F5A>..<reserved-10FFF>
|
||||
1104E..11051 ; Cn # [4] <reserved-1104E>..<reserved-11051>
|
||||
11070..1107E ; Cn # [15] <reserved-11070>..<reserved-1107E>
|
||||
110C2..110CF ; Cn # [14] <reserved-110C2>..<reserved-110CF>
|
||||
110C2..110CC ; Cn # [11] <reserved-110C2>..<reserved-110CC>
|
||||
110CE..110CF ; Cn # [2] <reserved-110CE>..<reserved-110CF>
|
||||
110E9..110EF ; Cn # [7] <reserved-110E9>..<reserved-110EF>
|
||||
110FA..110FF ; Cn # [6] <reserved-110FA>..<reserved-110FF>
|
||||
11135 ; Cn # <reserved-11135>
|
||||
11144..1114F ; Cn # [12] <reserved-11144>..<reserved-1114F>
|
||||
11147..1114F ; Cn # [9] <reserved-11147>..<reserved-1114F>
|
||||
11177..1117F ; Cn # [9] <reserved-11177>..<reserved-1117F>
|
||||
111CE..111CF ; Cn # [2] <reserved-111CE>..<reserved-111CF>
|
||||
111E0 ; Cn # <reserved-111E0>
|
||||
|
@ -473,7 +470,7 @@ FFFE..FFFF ; Cn # [2] <noncharacter-FFFE>..<noncharacter-FFFF>
|
|||
11329 ; Cn # <reserved-11329>
|
||||
11331 ; Cn # <reserved-11331>
|
||||
11334 ; Cn # <reserved-11334>
|
||||
1133A..1133B ; Cn # [2] <reserved-1133A>..<reserved-1133B>
|
||||
1133A ; Cn # <reserved-1133A>
|
||||
11345..11346 ; Cn # [2] <reserved-11345>..<reserved-11346>
|
||||
11349..1134A ; Cn # [2] <reserved-11349>..<reserved-1134A>
|
||||
1134E..1134F ; Cn # [2] <reserved-1134E>..<reserved-1134F>
|
||||
|
@ -484,7 +481,7 @@ FFFE..FFFF ; Cn # [2] <noncharacter-FFFE>..<noncharacter-FFFF>
|
|||
11375..113FF ; Cn # [139] <reserved-11375>..<reserved-113FF>
|
||||
1145A ; Cn # <reserved-1145A>
|
||||
1145C ; Cn # <reserved-1145C>
|
||||
1145E..1147F ; Cn # [34] <reserved-1145E>..<reserved-1147F>
|
||||
1145F..1147F ; Cn # [33] <reserved-1145F>..<reserved-1147F>
|
||||
114C8..114CF ; Cn # [8] <reserved-114C8>..<reserved-114CF>
|
||||
114DA..1157F ; Cn # [166] <reserved-114DA>..<reserved-1157F>
|
||||
115B6..115B7 ; Cn # [2] <reserved-115B6>..<reserved-115B7>
|
||||
|
@ -494,14 +491,14 @@ FFFE..FFFF ; Cn # [2] <noncharacter-FFFE>..<noncharacter-FFFF>
|
|||
1166D..1167F ; Cn # [19] <reserved-1166D>..<reserved-1167F>
|
||||
116B8..116BF ; Cn # [8] <reserved-116B8>..<reserved-116BF>
|
||||
116CA..116FF ; Cn # [54] <reserved-116CA>..<reserved-116FF>
|
||||
1171A..1171C ; Cn # [3] <reserved-1171A>..<reserved-1171C>
|
||||
1171B..1171C ; Cn # [2] <reserved-1171B>..<reserved-1171C>
|
||||
1172C..1172F ; Cn # [4] <reserved-1172C>..<reserved-1172F>
|
||||
11740..1189F ; Cn # [352] <reserved-11740>..<reserved-1189F>
|
||||
11740..117FF ; Cn # [192] <reserved-11740>..<reserved-117FF>
|
||||
1183C..1189F ; Cn # [100] <reserved-1183C>..<reserved-1189F>
|
||||
118F3..118FE ; Cn # [12] <reserved-118F3>..<reserved-118FE>
|
||||
11900..119FF ; Cn # [256] <reserved-11900>..<reserved-119FF>
|
||||
11A48..11A4F ; Cn # [8] <reserved-11A48>..<reserved-11A4F>
|
||||
11A84..11A85 ; Cn # [2] <reserved-11A84>..<reserved-11A85>
|
||||
11A9D ; Cn # <reserved-11A9D>
|
||||
11AA3..11ABF ; Cn # [29] <reserved-11AA3>..<reserved-11ABF>
|
||||
11AF9..11BFF ; Cn # [263] <reserved-11AF9>..<reserved-11BFF>
|
||||
11C09 ; Cn # <reserved-11C09>
|
||||
|
@ -517,7 +514,14 @@ FFFE..FFFF ; Cn # [2] <noncharacter-FFFE>..<noncharacter-FFFF>
|
|||
11D3B ; Cn # <reserved-11D3B>
|
||||
11D3E ; Cn # <reserved-11D3E>
|
||||
11D48..11D4F ; Cn # [8] <reserved-11D48>..<reserved-11D4F>
|
||||
11D5A..11FFF ; Cn # [678] <reserved-11D5A>..<reserved-11FFF>
|
||||
11D5A..11D5F ; Cn # [6] <reserved-11D5A>..<reserved-11D5F>
|
||||
11D66 ; Cn # <reserved-11D66>
|
||||
11D69 ; Cn # <reserved-11D69>
|
||||
11D8F ; Cn # <reserved-11D8F>
|
||||
11D92 ; Cn # <reserved-11D92>
|
||||
11D99..11D9F ; Cn # [7] <reserved-11D99>..<reserved-11D9F>
|
||||
11DAA..11EDF ; Cn # [310] <reserved-11DAA>..<reserved-11EDF>
|
||||
11EF9..11FFF ; Cn # [263] <reserved-11EF9>..<reserved-11FFF>
|
||||
1239A..123FF ; Cn # [102] <reserved-1239A>..<reserved-123FF>
|
||||
1246F ; Cn # <reserved-1246F>
|
||||
12475..1247F ; Cn # [11] <reserved-12475>..<reserved-1247F>
|
||||
|
@ -534,12 +538,13 @@ FFFE..FFFF ; Cn # [2] <noncharacter-FFFE>..<noncharacter-FFFF>
|
|||
16B5A ; Cn # <reserved-16B5A>
|
||||
16B62 ; Cn # <reserved-16B62>
|
||||
16B78..16B7C ; Cn # [5] <reserved-16B78>..<reserved-16B7C>
|
||||
16B90..16EFF ; Cn # [880] <reserved-16B90>..<reserved-16EFF>
|
||||
16B90..16E3F ; Cn # [688] <reserved-16B90>..<reserved-16E3F>
|
||||
16E9B..16EFF ; Cn # [101] <reserved-16E9B>..<reserved-16EFF>
|
||||
16F45..16F4F ; Cn # [11] <reserved-16F45>..<reserved-16F4F>
|
||||
16F7F..16F8E ; Cn # [16] <reserved-16F7F>..<reserved-16F8E>
|
||||
16FA0..16FDF ; Cn # [64] <reserved-16FA0>..<reserved-16FDF>
|
||||
16FE2..16FFF ; Cn # [30] <reserved-16FE2>..<reserved-16FFF>
|
||||
187ED..187FF ; Cn # [19] <reserved-187ED>..<reserved-187FF>
|
||||
187F2..187FF ; Cn # [14] <reserved-187F2>..<reserved-187FF>
|
||||
18AF3..1AFFF ; Cn # [9485] <reserved-18AF3>..<reserved-1AFFF>
|
||||
1B11F..1B16F ; Cn # [81] <reserved-1B11F>..<reserved-1B16F>
|
||||
1B2FC..1BBFF ; Cn # [2308] <reserved-1B2FC>..<reserved-1BBFF>
|
||||
|
@ -551,9 +556,10 @@ FFFE..FFFF ; Cn # [2] <noncharacter-FFFE>..<noncharacter-FFFF>
|
|||
1D0F6..1D0FF ; Cn # [10] <reserved-1D0F6>..<reserved-1D0FF>
|
||||
1D127..1D128 ; Cn # [2] <reserved-1D127>..<reserved-1D128>
|
||||
1D1E9..1D1FF ; Cn # [23] <reserved-1D1E9>..<reserved-1D1FF>
|
||||
1D246..1D2FF ; Cn # [186] <reserved-1D246>..<reserved-1D2FF>
|
||||
1D246..1D2DF ; Cn # [154] <reserved-1D246>..<reserved-1D2DF>
|
||||
1D2F4..1D2FF ; Cn # [12] <reserved-1D2F4>..<reserved-1D2FF>
|
||||
1D357..1D35F ; Cn # [9] <reserved-1D357>..<reserved-1D35F>
|
||||
1D372..1D3FF ; Cn # [142] <reserved-1D372>..<reserved-1D3FF>
|
||||
1D379..1D3FF ; Cn # [135] <reserved-1D379>..<reserved-1D3FF>
|
||||
1D455 ; Cn # <reserved-1D455>
|
||||
1D49D ; Cn # <reserved-1D49D>
|
||||
1D4A0..1D4A1 ; Cn # [2] <reserved-1D4A0>..<reserved-1D4A1>
|
||||
|
@ -586,7 +592,8 @@ FFFE..FFFF ; Cn # [2] <noncharacter-FFFE>..<noncharacter-FFFF>
|
|||
1E8D7..1E8FF ; Cn # [41] <reserved-1E8D7>..<reserved-1E8FF>
|
||||
1E94B..1E94F ; Cn # [5] <reserved-1E94B>..<reserved-1E94F>
|
||||
1E95A..1E95D ; Cn # [4] <reserved-1E95A>..<reserved-1E95D>
|
||||
1E960..1EDFF ; Cn # [1184] <reserved-1E960>..<reserved-1EDFF>
|
||||
1E960..1EC70 ; Cn # [785] <reserved-1E960>..<reserved-1EC70>
|
||||
1ECB5..1EDFF ; Cn # [331] <reserved-1ECB5>..<reserved-1EDFF>
|
||||
1EE04 ; Cn # <reserved-1EE04>
|
||||
1EE20 ; Cn # <reserved-1EE20>
|
||||
1EE23 ; Cn # <reserved-1EE23>
|
||||
|
@ -628,7 +635,6 @@ FFFE..FFFF ; Cn # [2] <noncharacter-FFFE>..<noncharacter-FFFF>
|
|||
1F0D0 ; Cn # <reserved-1F0D0>
|
||||
1F0F6..1F0FF ; Cn # [10] <reserved-1F0F6>..<reserved-1F0FF>
|
||||
1F10D..1F10F ; Cn # [3] <reserved-1F10D>..<reserved-1F10F>
|
||||
1F12F ; Cn # <reserved-1F12F>
|
||||
1F16C..1F16F ; Cn # [4] <reserved-1F16C>..<reserved-1F16F>
|
||||
1F1AD..1F1E5 ; Cn # [57] <reserved-1F1AD>..<reserved-1F1E5>
|
||||
1F203..1F20F ; Cn # [13] <reserved-1F203>..<reserved-1F20F>
|
||||
|
@ -638,9 +644,9 @@ FFFE..FFFF ; Cn # [2] <noncharacter-FFFE>..<noncharacter-FFFF>
|
|||
1F266..1F2FF ; Cn # [154] <reserved-1F266>..<reserved-1F2FF>
|
||||
1F6D5..1F6DF ; Cn # [11] <reserved-1F6D5>..<reserved-1F6DF>
|
||||
1F6ED..1F6EF ; Cn # [3] <reserved-1F6ED>..<reserved-1F6EF>
|
||||
1F6F9..1F6FF ; Cn # [7] <reserved-1F6F9>..<reserved-1F6FF>
|
||||
1F6FA..1F6FF ; Cn # [6] <reserved-1F6FA>..<reserved-1F6FF>
|
||||
1F774..1F77F ; Cn # [12] <reserved-1F774>..<reserved-1F77F>
|
||||
1F7D5..1F7FF ; Cn # [43] <reserved-1F7D5>..<reserved-1F7FF>
|
||||
1F7D9..1F7FF ; Cn # [39] <reserved-1F7D9>..<reserved-1F7FF>
|
||||
1F80C..1F80F ; Cn # [4] <reserved-1F80C>..<reserved-1F80F>
|
||||
1F848..1F84F ; Cn # [8] <reserved-1F848>..<reserved-1F84F>
|
||||
1F85A..1F85F ; Cn # [6] <reserved-1F85A>..<reserved-1F85F>
|
||||
|
@ -648,11 +654,14 @@ FFFE..FFFF ; Cn # [2] <noncharacter-FFFE>..<noncharacter-FFFF>
|
|||
1F8AE..1F8FF ; Cn # [82] <reserved-1F8AE>..<reserved-1F8FF>
|
||||
1F90C..1F90F ; Cn # [4] <reserved-1F90C>..<reserved-1F90F>
|
||||
1F93F ; Cn # <reserved-1F93F>
|
||||
1F94D..1F94F ; Cn # [3] <reserved-1F94D>..<reserved-1F94F>
|
||||
1F96C..1F97F ; Cn # [20] <reserved-1F96C>..<reserved-1F97F>
|
||||
1F998..1F9BF ; Cn # [40] <reserved-1F998>..<reserved-1F9BF>
|
||||
1F9C1..1F9CF ; Cn # [15] <reserved-1F9C1>..<reserved-1F9CF>
|
||||
1F9E7..1FFFF ; Cn # [1561] <reserved-1F9E7>..<noncharacter-1FFFF>
|
||||
1F971..1F972 ; Cn # [2] <reserved-1F971>..<reserved-1F972>
|
||||
1F977..1F979 ; Cn # [3] <reserved-1F977>..<reserved-1F979>
|
||||
1F97B ; Cn # <reserved-1F97B>
|
||||
1F9A3..1F9AF ; Cn # [13] <reserved-1F9A3>..<reserved-1F9AF>
|
||||
1F9BA..1F9BF ; Cn # [6] <reserved-1F9BA>..<reserved-1F9BF>
|
||||
1F9C3..1F9CF ; Cn # [13] <reserved-1F9C3>..<reserved-1F9CF>
|
||||
1FA00..1FA5F ; Cn # [96] <reserved-1FA00>..<reserved-1FA5F>
|
||||
1FA6E..1FFFF ; Cn # [1426] <reserved-1FA6E>..<noncharacter-1FFFF>
|
||||
2A6D7..2A6FF ; Cn # [41] <reserved-2A6D7>..<reserved-2A6FF>
|
||||
2B735..2B73F ; Cn # [11] <reserved-2B735>..<reserved-2B73F>
|
||||
2B81E..2B81F ; Cn # [2] <reserved-2B81E>..<reserved-2B81F>
|
||||
|
@ -665,7 +674,7 @@ E01F0..EFFFF ; Cn # [65040] <reserved-E01F0>..<noncharacter-EFFFF>
|
|||
FFFFE..FFFFF ; Cn # [2] <noncharacter-FFFFE>..<noncharacter-FFFFF>
|
||||
10FFFE..10FFFF; Cn # [2] <noncharacter-10FFFE>..<noncharacter-10FFFF>
|
||||
|
||||
# Total code points: 837841
|
||||
# Total code points: 837157
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -947,6 +956,8 @@ FFFFE..FFFFF ; Cn # [2] <noncharacter-FFFFE>..<noncharacter-FFFFF>
|
|||
10C7 ; Lu # GEORGIAN CAPITAL LETTER YN
|
||||
10CD ; Lu # GEORGIAN CAPITAL LETTER AEN
|
||||
13A0..13F5 ; Lu # [86] CHEROKEE LETTER A..CHEROKEE LETTER MV
|
||||
1C90..1CBA ; Lu # [43] GEORGIAN MTAVRULI CAPITAL LETTER AN..GEORGIAN MTAVRULI CAPITAL LETTER AIN
|
||||
1CBD..1CBF ; Lu # [3] GEORGIAN MTAVRULI CAPITAL LETTER AEN..GEORGIAN MTAVRULI CAPITAL LETTER LABIAL SIGN
|
||||
1E00 ; Lu # LATIN CAPITAL LETTER A WITH RING BELOW
|
||||
1E02 ; Lu # LATIN CAPITAL LETTER B WITH DOT ABOVE
|
||||
1E04 ; Lu # LATIN CAPITAL LETTER B WITH DOT BELOW
|
||||
|
@ -1261,11 +1272,13 @@ A7A8 ; Lu # LATIN CAPITAL LETTER S WITH OBLIQUE STROKE
|
|||
A7AA..A7AE ; Lu # [5] LATIN CAPITAL LETTER H WITH HOOK..LATIN CAPITAL LETTER SMALL CAPITAL I
|
||||
A7B0..A7B4 ; Lu # [5] LATIN CAPITAL LETTER TURNED K..LATIN CAPITAL LETTER BETA
|
||||
A7B6 ; Lu # LATIN CAPITAL LETTER OMEGA
|
||||
A7B8 ; Lu # LATIN CAPITAL LETTER U WITH STROKE
|
||||
FF21..FF3A ; Lu # [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LATIN CAPITAL LETTER Z
|
||||
10400..10427 ; Lu # [40] DESERET CAPITAL LETTER LONG I..DESERET CAPITAL LETTER EW
|
||||
104B0..104D3 ; Lu # [36] OSAGE CAPITAL LETTER A..OSAGE CAPITAL LETTER ZHA
|
||||
10C80..10CB2 ; Lu # [51] OLD HUNGARIAN CAPITAL LETTER A..OLD HUNGARIAN CAPITAL LETTER US
|
||||
118A0..118BF ; Lu # [32] WARANG CITI CAPITAL LETTER NGAA..WARANG CITI CAPITAL LETTER VIYO
|
||||
16E40..16E5F ; Lu # [32] MEDEFAIDRIN CAPITAL LETTER M..MEDEFAIDRIN CAPITAL LETTER Y
|
||||
1D400..1D419 ; Lu # [26] MATHEMATICAL BOLD CAPITAL A..MATHEMATICAL BOLD CAPITAL Z
|
||||
1D434..1D44D ; Lu # [26] MATHEMATICAL ITALIC CAPITAL A..MATHEMATICAL ITALIC CAPITAL Z
|
||||
1D468..1D481 ; Lu # [26] MATHEMATICAL BOLD ITALIC CAPITAL A..MATHEMATICAL BOLD ITALIC CAPITAL Z
|
||||
|
@ -1299,7 +1312,7 @@ FF21..FF3A ; Lu # [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LATIN CAP
|
|||
1D7CA ; Lu # MATHEMATICAL BOLD CAPITAL DIGAMMA
|
||||
1E900..1E921 ; Lu # [34] ADLAM CAPITAL LETTER ALIF..ADLAM CAPITAL LETTER SHA
|
||||
|
||||
# Total code points: 1702
|
||||
# Total code points: 1781
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -1574,7 +1587,9 @@ FF21..FF3A ; Lu # [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LATIN CAP
|
|||
052B ; Ll # CYRILLIC SMALL LETTER DZZHE
|
||||
052D ; Ll # CYRILLIC SMALL LETTER DCHE
|
||||
052F ; Ll # CYRILLIC SMALL LETTER EL WITH DESCENDER
|
||||
0561..0587 ; Ll # [39] ARMENIAN SMALL LETTER AYB..ARMENIAN SMALL LIGATURE ECH YIWN
|
||||
0560..0588 ; Ll # [41] ARMENIAN SMALL LETTER TURNED AYB..ARMENIAN SMALL LETTER YI WITH STROKE
|
||||
10D0..10FA ; Ll # [43] GEORGIAN LETTER AN..GEORGIAN LETTER AIN
|
||||
10FD..10FF ; Ll # [3] GEORGIAN LETTER AEN..GEORGIAN LETTER LABIAL SIGN
|
||||
13F8..13FD ; Ll # [6] CHEROKEE SMALL LETTER YE..CHEROKEE SMALL LETTER MV
|
||||
1C80..1C88 ; Ll # [9] CYRILLIC SMALL LETTER ROUNDED VE..CYRILLIC SMALL LETTER UNBLENDED UK
|
||||
1D00..1D2B ; Ll # [44] LATIN LETTER SMALL CAPITAL A..CYRILLIC LETTER SMALL CAPITAL EL
|
||||
|
@ -1896,8 +1911,10 @@ A7A3 ; Ll # LATIN SMALL LETTER K WITH OBLIQUE STROKE
|
|||
A7A5 ; Ll # LATIN SMALL LETTER N WITH OBLIQUE STROKE
|
||||
A7A7 ; Ll # LATIN SMALL LETTER R WITH OBLIQUE STROKE
|
||||
A7A9 ; Ll # LATIN SMALL LETTER S WITH OBLIQUE STROKE
|
||||
A7AF ; Ll # LATIN LETTER SMALL CAPITAL Q
|
||||
A7B5 ; Ll # LATIN SMALL LETTER BETA
|
||||
A7B7 ; Ll # LATIN SMALL LETTER OMEGA
|
||||
A7B9 ; Ll # LATIN SMALL LETTER U WITH STROKE
|
||||
A7FA ; Ll # LATIN LETTER SMALL CAPITAL TURNED M
|
||||
AB30..AB5A ; Ll # [43] LATIN SMALL LETTER BARRED ALPHA..LATIN SMALL LETTER Y WITH SHORT RIGHT LEG
|
||||
AB60..AB65 ; Ll # [6] LATIN SMALL LETTER SAKHA YAT..GREEK LETTER SMALL CAPITAL OMEGA
|
||||
|
@ -1909,6 +1926,7 @@ FF41..FF5A ; Ll # [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN SMALL
|
|||
104D8..104FB ; Ll # [36] OSAGE SMALL LETTER A..OSAGE SMALL LETTER ZHA
|
||||
10CC0..10CF2 ; Ll # [51] OLD HUNGARIAN SMALL LETTER A..OLD HUNGARIAN SMALL LETTER US
|
||||
118C0..118DF ; Ll # [32] WARANG CITI SMALL LETTER NGAA..WARANG CITI SMALL LETTER VIYO
|
||||
16E60..16E7F ; Ll # [32] MEDEFAIDRIN SMALL LETTER M..MEDEFAIDRIN SMALL LETTER Y
|
||||
1D41A..1D433 ; Ll # [26] MATHEMATICAL BOLD SMALL A..MATHEMATICAL BOLD SMALL Z
|
||||
1D44E..1D454 ; Ll # [7] MATHEMATICAL ITALIC SMALL A..MATHEMATICAL ITALIC SMALL G
|
||||
1D456..1D467 ; Ll # [18] MATHEMATICAL ITALIC SMALL I..MATHEMATICAL ITALIC SMALL Z
|
||||
|
@ -1939,7 +1957,7 @@ FF41..FF5A ; Ll # [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN SMALL
|
|||
1D7CB ; Ll # MATHEMATICAL BOLD SMALL DIGAMMA
|
||||
1E922..1E943 ; Ll # [34] ADLAM SMALL LETTER ALIF..ADLAM SMALL LETTER SHA
|
||||
|
||||
# Total code points: 2063
|
||||
# Total code points: 2145
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -2032,7 +2050,7 @@ FF9E..FF9F ; Lm # [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDTH KATAK
|
|||
01C0..01C3 ; Lo # [4] LATIN LETTER DENTAL CLICK..LATIN LETTER RETROFLEX CLICK
|
||||
0294 ; Lo # LATIN LETTER GLOTTAL STOP
|
||||
05D0..05EA ; Lo # [27] HEBREW LETTER ALEF..HEBREW LETTER TAV
|
||||
05F0..05F2 ; Lo # [3] HEBREW LIGATURE YIDDISH DOUBLE VAV..HEBREW LIGATURE YIDDISH DOUBLE YOD
|
||||
05EF..05F2 ; Lo # [4] HEBREW YOD TRIANGLE..HEBREW LIGATURE YIDDISH DOUBLE YOD
|
||||
0620..063F ; Lo # [32] ARABIC LETTER KASHMIRI YEH..ARABIC LETTER FARSI YEH WITH THREE DOTS ABOVE
|
||||
0641..064A ; Lo # [10] ARABIC LETTER FEH..ARABIC LETTER YEH
|
||||
066E..066F ; Lo # [2] ARABIC LETTER DOTLESS BEH..ARABIC LETTER DOTLESS QAF
|
||||
|
@ -2171,8 +2189,7 @@ FF9E..FF9F ; Lm # [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDTH KATAK
|
|||
106E..1070 ; Lo # [3] MYANMAR LETTER EASTERN PWO KAREN NNA..MYANMAR LETTER EASTERN PWO KAREN GHWA
|
||||
1075..1081 ; Lo # [13] MYANMAR LETTER SHAN KA..MYANMAR LETTER SHAN HA
|
||||
108E ; Lo # MYANMAR LETTER RUMAI PALAUNG FA
|
||||
10D0..10FA ; Lo # [43] GEORGIAN LETTER AN..GEORGIAN LETTER AIN
|
||||
10FD..1248 ; Lo # [332] GEORGIAN LETTER AEN..ETHIOPIC SYLLABLE QWA
|
||||
1100..1248 ; Lo # [329] HANGUL CHOSEONG KIYEOK..ETHIOPIC SYLLABLE QWA
|
||||
124A..124D ; Lo # [4] ETHIOPIC SYLLABLE QWI..ETHIOPIC SYLLABLE QWE
|
||||
1250..1256 ; Lo # [7] ETHIOPIC SYLLABLE QHA..ETHIOPIC SYLLABLE QHO
|
||||
1258 ; Lo # ETHIOPIC SYLLABLE QHWA
|
||||
|
@ -2203,7 +2220,7 @@ FF9E..FF9F ; Lm # [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDTH KATAK
|
|||
1780..17B3 ; Lo # [52] KHMER LETTER KA..KHMER INDEPENDENT VOWEL QAU
|
||||
17DC ; Lo # KHMER SIGN AVAKRAHASANYA
|
||||
1820..1842 ; Lo # [35] MONGOLIAN LETTER A..MONGOLIAN LETTER CHI
|
||||
1844..1877 ; Lo # [52] MONGOLIAN LETTER TODO E..MONGOLIAN LETTER MANCHU ZHA
|
||||
1844..1878 ; Lo # [53] MONGOLIAN LETTER TODO E..MONGOLIAN LETTER CHA WITH TWO DOTS
|
||||
1880..1884 ; Lo # [5] MONGOLIAN LETTER ALI GALI ANUSVARA ONE..MONGOLIAN LETTER ALI GALI INVERTED UBADAMA
|
||||
1887..18A8 ; Lo # [34] MONGOLIAN LETTER ALI GALI A..MONGOLIAN LETTER MANCHU ALI GALI BHA
|
||||
18AA ; Lo # MONGOLIAN LETTER MANCHU ALI GALI LHA
|
||||
|
@ -2243,12 +2260,12 @@ FF9E..FF9F ; Lm # [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDTH KATAK
|
|||
309F ; Lo # HIRAGANA DIGRAPH YORI
|
||||
30A1..30FA ; Lo # [90] KATAKANA LETTER SMALL A..KATAKANA LETTER VO
|
||||
30FF ; Lo # KATAKANA DIGRAPH KOTO
|
||||
3105..312E ; Lo # [42] BOPOMOFO LETTER B..BOPOMOFO LETTER O WITH DOT ABOVE
|
||||
3105..312F ; Lo # [43] BOPOMOFO LETTER B..BOPOMOFO LETTER NN
|
||||
3131..318E ; Lo # [94] HANGUL LETTER KIYEOK..HANGUL LETTER ARAEAE
|
||||
31A0..31BA ; Lo # [27] BOPOMOFO LETTER BU..BOPOMOFO LETTER ZY
|
||||
31F0..31FF ; Lo # [16] KATAKANA LETTER SMALL KU..KATAKANA LETTER SMALL RO
|
||||
3400..4DB5 ; Lo # [6582] CJK UNIFIED IDEOGRAPH-3400..CJK UNIFIED IDEOGRAPH-4DB5
|
||||
4E00..9FEA ; Lo # [20971] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FEA
|
||||
4E00..9FEF ; Lo # [20976] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FEF
|
||||
A000..A014 ; Lo # [21] YI SYLLABLE IT..YI SYLLABLE E
|
||||
A016..A48C ; Lo # [1143] YI SYLLABLE BIT..YI SYLLABLE YYR
|
||||
A4D0..A4F7 ; Lo # [40] LISU LETTER BA..LISU LETTER OE
|
||||
|
@ -2267,7 +2284,7 @@ A840..A873 ; Lo # [52] PHAGS-PA LETTER KA..PHAGS-PA LETTER CANDRABINDU
|
|||
A882..A8B3 ; Lo # [50] SAURASHTRA LETTER A..SAURASHTRA LETTER LLA
|
||||
A8F2..A8F7 ; Lo # [6] DEVANAGARI SIGN SPACING CANDRABINDU..DEVANAGARI SIGN CANDRABINDU AVAGRAHA
|
||||
A8FB ; Lo # DEVANAGARI HEADSTROKE
|
||||
A8FD ; Lo # DEVANAGARI JAIN OM
|
||||
A8FD..A8FE ; Lo # [2] DEVANAGARI JAIN OM..DEVANAGARI LETTER AY
|
||||
A90A..A925 ; Lo # [28] KAYAH LI LETTER KA..KAYAH LI LETTER OO
|
||||
A930..A946 ; Lo # [23] REJANG LETTER KA..REJANG LETTER A
|
||||
A960..A97C ; Lo # [29] HANGUL CHOSEONG TIKEUT-MIEUM..HANGUL CHOSEONG SSANGYEORINHIEUH
|
||||
|
@ -2361,7 +2378,7 @@ FFDA..FFDC ; Lo # [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I
|
|||
10A00 ; Lo # KHAROSHTHI LETTER A
|
||||
10A10..10A13 ; Lo # [4] KHAROSHTHI LETTER KA..KHAROSHTHI LETTER GHA
|
||||
10A15..10A17 ; Lo # [3] KHAROSHTHI LETTER CA..KHAROSHTHI LETTER JA
|
||||
10A19..10A33 ; Lo # [27] KHAROSHTHI LETTER NYA..KHAROSHTHI LETTER TTTHA
|
||||
10A19..10A35 ; Lo # [29] KHAROSHTHI LETTER NYA..KHAROSHTHI LETTER VHA
|
||||
10A60..10A7C ; Lo # [29] OLD SOUTH ARABIAN LETTER HE..OLD SOUTH ARABIAN LETTER THETH
|
||||
10A80..10A9C ; Lo # [29] OLD NORTH ARABIAN LETTER HEH..OLD NORTH ARABIAN LETTER ZAH
|
||||
10AC0..10AC7 ; Lo # [8] MANICHAEAN LETTER ALEPH..MANICHAEAN LETTER WAW
|
||||
|
@ -2371,10 +2388,15 @@ FFDA..FFDC ; Lo # [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I
|
|||
10B60..10B72 ; Lo # [19] INSCRIPTIONAL PAHLAVI LETTER ALEPH..INSCRIPTIONAL PAHLAVI LETTER TAW
|
||||
10B80..10B91 ; Lo # [18] PSALTER PAHLAVI LETTER ALEPH..PSALTER PAHLAVI LETTER TAW
|
||||
10C00..10C48 ; Lo # [73] OLD TURKIC LETTER ORKHON A..OLD TURKIC LETTER ORKHON BASH
|
||||
10D00..10D23 ; Lo # [36] HANIFI ROHINGYA LETTER A..HANIFI ROHINGYA MARK NA KHONNA
|
||||
10F00..10F1C ; Lo # [29] OLD SOGDIAN LETTER ALEPH..OLD SOGDIAN LETTER FINAL TAW WITH VERTICAL TAIL
|
||||
10F27 ; Lo # OLD SOGDIAN LIGATURE AYIN-DALETH
|
||||
10F30..10F45 ; Lo # [22] SOGDIAN LETTER ALEPH..SOGDIAN INDEPENDENT SHIN
|
||||
11003..11037 ; Lo # [53] BRAHMI SIGN JIHVAMULIYA..BRAHMI LETTER OLD TAMIL NNNA
|
||||
11083..110AF ; Lo # [45] KAITHI LETTER A..KAITHI LETTER HA
|
||||
110D0..110E8 ; Lo # [25] SORA SOMPENG LETTER SAH..SORA SOMPENG LETTER MAE
|
||||
11103..11126 ; Lo # [36] CHAKMA LETTER AA..CHAKMA LETTER HAA
|
||||
11144 ; Lo # CHAKMA LETTER LHAA
|
||||
11150..11172 ; Lo # [35] MAHAJANI LETTER A..MAHAJANI LETTER RRA
|
||||
11176 ; Lo # MAHAJANI LIGATURE SHRI
|
||||
11183..111B2 ; Lo # [48] SHARADA LETTER A..SHARADA LETTER HA
|
||||
|
@ -2408,7 +2430,8 @@ FFDA..FFDC ; Lo # [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I
|
|||
11600..1162F ; Lo # [48] MODI LETTER A..MODI LETTER LLA
|
||||
11644 ; Lo # MODI SIGN HUVA
|
||||
11680..116AA ; Lo # [43] TAKRI LETTER A..TAKRI LETTER RRA
|
||||
11700..11719 ; Lo # [26] AHOM LETTER KA..AHOM LETTER JHA
|
||||
11700..1171A ; Lo # [27] AHOM LETTER KA..AHOM LETTER ALTERNATE BA
|
||||
11800..1182B ; Lo # [44] DOGRA LETTER A..DOGRA LETTER RRA
|
||||
118FF ; Lo # WARANG CITI OM
|
||||
11A00 ; Lo # ZANABAZAR SQUARE LETTER A
|
||||
11A0B..11A32 ; Lo # [40] ZANABAZAR SQUARE LETTER KA..ZANABAZAR SQUARE LETTER KSSA
|
||||
|
@ -2416,6 +2439,7 @@ FFDA..FFDC ; Lo # [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I
|
|||
11A50 ; Lo # SOYOMBO LETTER A
|
||||
11A5C..11A83 ; Lo # [40] SOYOMBO LETTER KA..SOYOMBO LETTER KSSA
|
||||
11A86..11A89 ; Lo # [4] SOYOMBO CLUSTER-INITIAL LETTER RA..SOYOMBO CLUSTER-INITIAL LETTER SA
|
||||
11A9D ; Lo # SOYOMBO MARK PLUTA
|
||||
11AC0..11AF8 ; Lo # [57] PAU CIN HAU LETTER PA..PAU CIN HAU GLOTTAL STOP FINAL
|
||||
11C00..11C08 ; Lo # [9] BHAIKSUKI LETTER A..BHAIKSUKI LETTER VOCALIC L
|
||||
11C0A..11C2E ; Lo # [37] BHAIKSUKI LETTER E..BHAIKSUKI LETTER HA
|
||||
|
@ -2425,6 +2449,11 @@ FFDA..FFDC ; Lo # [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I
|
|||
11D08..11D09 ; Lo # [2] MASARAM GONDI LETTER AI..MASARAM GONDI LETTER O
|
||||
11D0B..11D30 ; Lo # [38] MASARAM GONDI LETTER AU..MASARAM GONDI LETTER TRA
|
||||
11D46 ; Lo # MASARAM GONDI REPHA
|
||||
11D60..11D65 ; Lo # [6] GUNJALA GONDI LETTER A..GUNJALA GONDI LETTER UU
|
||||
11D67..11D68 ; Lo # [2] GUNJALA GONDI LETTER EE..GUNJALA GONDI LETTER AI
|
||||
11D6A..11D89 ; Lo # [32] GUNJALA GONDI LETTER OO..GUNJALA GONDI LETTER SA
|
||||
11D98 ; Lo # GUNJALA GONDI OM
|
||||
11EE0..11EF2 ; Lo # [19] MAKASAR LETTER KA..MAKASAR ANGKA
|
||||
12000..12399 ; Lo # [922] CUNEIFORM SIGN A..CUNEIFORM SIGN U U
|
||||
12480..12543 ; Lo # [196] CUNEIFORM SIGN AB TIMES NUN TENU..CUNEIFORM SIGN ZU5 TIMES THREE DISH TENU
|
||||
13000..1342E ; Lo # [1071] EGYPTIAN HIEROGLYPH A001..EGYPTIAN HIEROGLYPH AA032
|
||||
|
@ -2437,7 +2466,7 @@ FFDA..FFDC ; Lo # [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I
|
|||
16B7D..16B8F ; Lo # [19] PAHAWH HMONG CLAN SIGN TSHEEJ..PAHAWH HMONG CLAN SIGN VWJ
|
||||
16F00..16F44 ; Lo # [69] MIAO LETTER PA..MIAO LETTER HHA
|
||||
16F50 ; Lo # MIAO LETTER NASALIZATION
|
||||
17000..187EC ; Lo # [6125] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187EC
|
||||
17000..187F1 ; Lo # [6130] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187F1
|
||||
18800..18AF2 ; Lo # [755] TANGUT COMPONENT-001..TANGUT COMPONENT-755
|
||||
1B000..1B11E ; Lo # [287] KATAKANA LETTER ARCHAIC E..HENTAIGANA LETTER N-MU-MO-2
|
||||
1B170..1B2FB ; Lo # [396] NUSHU CHARACTER-1B170..NUSHU CHARACTER-1B2FB
|
||||
|
@ -2486,7 +2515,7 @@ FFDA..FFDC ; Lo # [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I
|
|||
2CEB0..2EBE0 ; Lo # [7473] CJK UNIFIED IDEOGRAPH-2CEB0..CJK UNIFIED IDEOGRAPH-2EBE0
|
||||
2F800..2FA1D ; Lo # [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
|
||||
|
||||
# Total code points: 121047
|
||||
# Total code points: 121212
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -2510,12 +2539,13 @@ FFDA..FFDC ; Lo # [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I
|
|||
0730..074A ; Mn # [27] SYRIAC PTHAHA ABOVE..SYRIAC BARREKH
|
||||
07A6..07B0 ; Mn # [11] THAANA ABAFILI..THAANA SUKUN
|
||||
07EB..07F3 ; Mn # [9] NKO COMBINING SHORT HIGH TONE..NKO COMBINING DOUBLE DOT ABOVE
|
||||
07FD ; Mn # NKO DANTAYALAN
|
||||
0816..0819 ; Mn # [4] SAMARITAN MARK IN..SAMARITAN MARK DAGESH
|
||||
081B..0823 ; Mn # [9] SAMARITAN MARK EPENTHETIC YUT..SAMARITAN VOWEL SIGN A
|
||||
0825..0827 ; Mn # [3] SAMARITAN VOWEL SIGN SHORT A..SAMARITAN VOWEL SIGN U
|
||||
0829..082D ; Mn # [5] SAMARITAN VOWEL SIGN LONG I..SAMARITAN MARK NEQUDAA
|
||||
0859..085B ; Mn # [3] MANDAIC AFFRICATION MARK..MANDAIC GEMINATION MARK
|
||||
08D4..08E1 ; Mn # [14] ARABIC SMALL HIGH WORD AR-RUB..ARABIC SMALL HIGH SIGN SAFHA
|
||||
08D3..08E1 ; Mn # [15] ARABIC SMALL LOW WAW..ARABIC SMALL HIGH SIGN SAFHA
|
||||
08E3..0902 ; Mn # [32] ARABIC TURNED DAMMA BELOW..DEVANAGARI SIGN ANUSVARA
|
||||
093A ; Mn # DEVANAGARI VOWEL SIGN OE
|
||||
093C ; Mn # DEVANAGARI SIGN NUKTA
|
||||
|
@ -2528,6 +2558,7 @@ FFDA..FFDC ; Lo # [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I
|
|||
09C1..09C4 ; Mn # [4] BENGALI VOWEL SIGN U..BENGALI VOWEL SIGN VOCALIC RR
|
||||
09CD ; Mn # BENGALI SIGN VIRAMA
|
||||
09E2..09E3 ; Mn # [2] BENGALI VOWEL SIGN VOCALIC L..BENGALI VOWEL SIGN VOCALIC LL
|
||||
09FE ; Mn # BENGALI SANDHI MARK
|
||||
0A01..0A02 ; Mn # [2] GURMUKHI SIGN ADAK BINDI..GURMUKHI SIGN BINDI
|
||||
0A3C ; Mn # GURMUKHI SIGN NUKTA
|
||||
0A41..0A42 ; Mn # [2] GURMUKHI VOWEL SIGN U..GURMUKHI VOWEL SIGN UU
|
||||
|
@ -2554,6 +2585,7 @@ FFDA..FFDC ; Lo # [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER I
|
|||
0BC0 ; Mn # TAMIL VOWEL SIGN II
|
||||
0BCD ; Mn # TAMIL SIGN VIRAMA
|
||||
0C00 ; Mn # TELUGU SIGN COMBINING CANDRABINDU ABOVE
|
||||
0C04 ; Mn # TELUGU SIGN COMBINING ANUSVARA ABOVE
|
||||
0C3E..0C40 ; Mn # [3] TELUGU VOWEL SIGN AA..TELUGU VOWEL SIGN II
|
||||
0C46..0C48 ; Mn # [3] TELUGU VOWEL SIGN E..TELUGU VOWEL SIGN AI
|
||||
0C4A..0C4D ; Mn # [4] TELUGU VOWEL SIGN O..TELUGU SIGN VIRAMA
|
||||
|
@ -2670,6 +2702,7 @@ A80B ; Mn # SYLOTI NAGRI SIGN ANUSVARA
|
|||
A825..A826 ; Mn # [2] SYLOTI NAGRI VOWEL SIGN U..SYLOTI NAGRI VOWEL SIGN E
|
||||
A8C4..A8C5 ; Mn # [2] SAURASHTRA SIGN VIRAMA..SAURASHTRA SIGN CANDRABINDU
|
||||
A8E0..A8F1 ; Mn # [18] COMBINING DEVANAGARI DIGIT ZERO..COMBINING DEVANAGARI SIGN AVAGRAHA
|
||||
A8FF ; Mn # DEVANAGARI VOWEL SIGN AY
|
||||
A926..A92D ; Mn # [8] KAYAH LI VOWEL UE..KAYAH LI TONE CALYA PLOPHU
|
||||
A947..A951 ; Mn # [11] REJANG VOWEL SIGN I..REJANG CONSONANT SIGN R
|
||||
A980..A982 ; Mn # [3] JAVANESE SIGN PANYANGGA..JAVANESE SIGN LAYAR
|
||||
|
@ -2705,6 +2738,8 @@ FE20..FE2F ; Mn # [16] COMBINING LIGATURE LEFT HALF..COMBINING CYRILLIC TITL
|
|||
10A38..10A3A ; Mn # [3] KHAROSHTHI SIGN BAR ABOVE..KHAROSHTHI SIGN DOT BELOW
|
||||
10A3F ; Mn # KHAROSHTHI VIRAMA
|
||||
10AE5..10AE6 ; Mn # [2] MANICHAEAN ABBREVIATION MARK ABOVE..MANICHAEAN ABBREVIATION MARK BELOW
|
||||
10D24..10D27 ; Mn # [4] HANIFI ROHINGYA SIGN HARBAHAY..HANIFI ROHINGYA SIGN TASSI
|
||||
10F46..10F50 ; Mn # [11] SOGDIAN COMBINING DOT BELOW..SOGDIAN COMBINING STROKE BELOW
|
||||
11001 ; Mn # BRAHMI SIGN ANUSVARA
|
||||
11038..11046 ; Mn # [15] BRAHMI VOWEL SIGN AA..BRAHMI VIRAMA
|
||||
1107F..11081 ; Mn # [3] BRAHMI NUMBER JOINER..KAITHI SIGN ANUSVARA
|
||||
|
@ -2716,7 +2751,7 @@ FE20..FE2F ; Mn # [16] COMBINING LIGATURE LEFT HALF..COMBINING CYRILLIC TITL
|
|||
11173 ; Mn # MAHAJANI SIGN NUKTA
|
||||
11180..11181 ; Mn # [2] SHARADA SIGN CANDRABINDU..SHARADA SIGN ANUSVARA
|
||||
111B6..111BE ; Mn # [9] SHARADA VOWEL SIGN U..SHARADA VOWEL SIGN O
|
||||
111CA..111CC ; Mn # [3] SHARADA SIGN NUKTA..SHARADA EXTRA SHORT VOWEL MARK
|
||||
111C9..111CC ; Mn # [4] SHARADA SANDHI MARK..SHARADA EXTRA SHORT VOWEL MARK
|
||||
1122F..11231 ; Mn # [3] KHOJKI VOWEL SIGN U..KHOJKI VOWEL SIGN AI
|
||||
11234 ; Mn # KHOJKI SIGN ANUSVARA
|
||||
11236..11237 ; Mn # [2] KHOJKI SIGN NUKTA..KHOJKI SIGN SHADDA
|
||||
|
@ -2724,13 +2759,14 @@ FE20..FE2F ; Mn # [16] COMBINING LIGATURE LEFT HALF..COMBINING CYRILLIC TITL
|
|||
112DF ; Mn # KHUDAWADI SIGN ANUSVARA
|
||||
112E3..112EA ; Mn # [8] KHUDAWADI VOWEL SIGN U..KHUDAWADI SIGN VIRAMA
|
||||
11300..11301 ; Mn # [2] GRANTHA SIGN COMBINING ANUSVARA ABOVE..GRANTHA SIGN CANDRABINDU
|
||||
1133C ; Mn # GRANTHA SIGN NUKTA
|
||||
1133B..1133C ; Mn # [2] COMBINING BINDU BELOW..GRANTHA SIGN NUKTA
|
||||
11340 ; Mn # GRANTHA VOWEL SIGN II
|
||||
11366..1136C ; Mn # [7] COMBINING GRANTHA DIGIT ZERO..COMBINING GRANTHA DIGIT SIX
|
||||
11370..11374 ; Mn # [5] COMBINING GRANTHA LETTER A..COMBINING GRANTHA LETTER PA
|
||||
11438..1143F ; Mn # [8] NEWA VOWEL SIGN U..NEWA VOWEL SIGN AI
|
||||
11442..11444 ; Mn # [3] NEWA SIGN VIRAMA..NEWA SIGN ANUSVARA
|
||||
11446 ; Mn # NEWA SIGN NUKTA
|
||||
1145E ; Mn # NEWA SANDHI MARK
|
||||
114B3..114B8 ; Mn # [6] TIRHUTA VOWEL SIGN U..TIRHUTA VOWEL SIGN VOCALIC LL
|
||||
114BA ; Mn # TIRHUTA VOWEL SIGN SHORT E
|
||||
114BF..114C0 ; Mn # [2] TIRHUTA SIGN CANDRABINDU..TIRHUTA SIGN ANUSVARA
|
||||
|
@ -2749,8 +2785,9 @@ FE20..FE2F ; Mn # [16] COMBINING LIGATURE LEFT HALF..COMBINING CYRILLIC TITL
|
|||
1171D..1171F ; Mn # [3] AHOM CONSONANT SIGN MEDIAL LA..AHOM CONSONANT SIGN MEDIAL LIGATING RA
|
||||
11722..11725 ; Mn # [4] AHOM VOWEL SIGN I..AHOM VOWEL SIGN UU
|
||||
11727..1172B ; Mn # [5] AHOM VOWEL SIGN AW..AHOM SIGN KILLER
|
||||
11A01..11A06 ; Mn # [6] ZANABAZAR SQUARE VOWEL SIGN I..ZANABAZAR SQUARE VOWEL SIGN O
|
||||
11A09..11A0A ; Mn # [2] ZANABAZAR SQUARE VOWEL SIGN REVERSED I..ZANABAZAR SQUARE VOWEL LENGTH MARK
|
||||
1182F..11837 ; Mn # [9] DOGRA VOWEL SIGN U..DOGRA SIGN ANUSVARA
|
||||
11839..1183A ; Mn # [2] DOGRA SIGN VIRAMA..DOGRA SIGN NUKTA
|
||||
11A01..11A0A ; Mn # [10] ZANABAZAR SQUARE VOWEL SIGN I..ZANABAZAR SQUARE VOWEL LENGTH MARK
|
||||
11A33..11A38 ; Mn # [6] ZANABAZAR SQUARE FINAL CONSONANT MARK..ZANABAZAR SQUARE SIGN ANUSVARA
|
||||
11A3B..11A3E ; Mn # [4] ZANABAZAR SQUARE CLUSTER-FINAL LETTER YA..ZANABAZAR SQUARE CLUSTER-FINAL LETTER VA
|
||||
11A47 ; Mn # ZANABAZAR SQUARE SUBJOINER
|
||||
|
@ -2770,6 +2807,10 @@ FE20..FE2F ; Mn # [16] COMBINING LIGATURE LEFT HALF..COMBINING CYRILLIC TITL
|
|||
11D3C..11D3D ; Mn # [2] MASARAM GONDI VOWEL SIGN AI..MASARAM GONDI VOWEL SIGN O
|
||||
11D3F..11D45 ; Mn # [7] MASARAM GONDI VOWEL SIGN AU..MASARAM GONDI VIRAMA
|
||||
11D47 ; Mn # MASARAM GONDI RA-KARA
|
||||
11D90..11D91 ; Mn # [2] GUNJALA GONDI VOWEL SIGN EE..GUNJALA GONDI VOWEL SIGN AI
|
||||
11D95 ; Mn # GUNJALA GONDI SIGN ANUSVARA
|
||||
11D97 ; Mn # GUNJALA GONDI VIRAMA
|
||||
11EF3..11EF4 ; Mn # [2] MAKASAR VOWEL SIGN I..MAKASAR VOWEL SIGN U
|
||||
16AF0..16AF4 ; Mn # [5] BASSA VAH COMBINING HIGH TONE..BASSA VAH COMBINING HIGH-LOW TONE
|
||||
16B30..16B36 ; Mn # [7] PAHAWH HMONG MARK CIM TUB..PAHAWH HMONG MARK CIM TAUM
|
||||
16F8F..16F92 ; Mn # [4] MIAO TONE RIGHT..MIAO TONE BELOW
|
||||
|
@ -2794,7 +2835,7 @@ FE20..FE2F ; Mn # [16] COMBINING LIGATURE LEFT HALF..COMBINING CYRILLIC TITL
|
|||
1E944..1E94A ; Mn # [7] ADLAM ALIF LENGTHENER..ADLAM NUKTA
|
||||
E0100..E01EF ; Mn # [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
|
||||
|
||||
# Total code points: 1763
|
||||
# Total code points: 1805
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -2928,6 +2969,7 @@ ABEC ; Mc # MEETEI MAYEK LUM IYEK
|
|||
110B0..110B2 ; Mc # [3] KAITHI VOWEL SIGN AA..KAITHI VOWEL SIGN II
|
||||
110B7..110B8 ; Mc # [2] KAITHI VOWEL SIGN O..KAITHI VOWEL SIGN AU
|
||||
1112C ; Mc # CHAKMA VOWEL SIGN E
|
||||
11145..11146 ; Mc # [2] CHAKMA VOWEL SIGN AA..CHAKMA VOWEL SIGN EI
|
||||
11182 ; Mc # SHARADA SIGN VISARGA
|
||||
111B3..111B5 ; Mc # [3] SHARADA VOWEL SIGN AA..SHARADA VOWEL SIGN II
|
||||
111BF..111C0 ; Mc # [2] SHARADA VOWEL SIGN AU..SHARADA SIGN VIRAMA
|
||||
|
@ -2960,7 +3002,8 @@ ABEC ; Mc # MEETEI MAYEK LUM IYEK
|
|||
116B6 ; Mc # TAKRI SIGN VIRAMA
|
||||
11720..11721 ; Mc # [2] AHOM VOWEL SIGN A..AHOM VOWEL SIGN AA
|
||||
11726 ; Mc # AHOM VOWEL SIGN E
|
||||
11A07..11A08 ; Mc # [2] ZANABAZAR SQUARE VOWEL SIGN AI..ZANABAZAR SQUARE VOWEL SIGN AU
|
||||
1182C..1182E ; Mc # [3] DOGRA VOWEL SIGN AA..DOGRA VOWEL SIGN II
|
||||
11838 ; Mc # DOGRA SIGN VISARGA
|
||||
11A39 ; Mc # ZANABAZAR SQUARE SIGN VISARGA
|
||||
11A57..11A58 ; Mc # [2] SOYOMBO VOWEL SIGN AI..SOYOMBO VOWEL SIGN AU
|
||||
11A97 ; Mc # SOYOMBO SIGN VISARGA
|
||||
|
@ -2969,11 +3012,15 @@ ABEC ; Mc # MEETEI MAYEK LUM IYEK
|
|||
11CA9 ; Mc # MARCHEN SUBJOINED LETTER YA
|
||||
11CB1 ; Mc # MARCHEN VOWEL SIGN I
|
||||
11CB4 ; Mc # MARCHEN VOWEL SIGN O
|
||||
11D8A..11D8E ; Mc # [5] GUNJALA GONDI VOWEL SIGN AA..GUNJALA GONDI VOWEL SIGN UU
|
||||
11D93..11D94 ; Mc # [2] GUNJALA GONDI VOWEL SIGN OO..GUNJALA GONDI VOWEL SIGN AU
|
||||
11D96 ; Mc # GUNJALA GONDI SIGN VISARGA
|
||||
11EF5..11EF6 ; Mc # [2] MAKASAR VOWEL SIGN E..MAKASAR VOWEL SIGN O
|
||||
16F51..16F7E ; Mc # [46] MIAO SIGN ASPIRATION..MIAO VOWEL SIGN NG
|
||||
1D165..1D166 ; Mc # [2] MUSICAL SYMBOL COMBINING STEM..MUSICAL SYMBOL COMBINING SPRECHGESANG STEM
|
||||
1D16D..1D172 ; Mc # [6] MUSICAL SYMBOL COMBINING AUGMENTATION DOT..MUSICAL SYMBOL COMBINING FLAG-5
|
||||
|
||||
# Total code points: 401
|
||||
# Total code points: 415
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -3017,6 +3064,7 @@ AA50..AA59 ; Nd # [10] CHAM DIGIT ZERO..CHAM DIGIT NINE
|
|||
ABF0..ABF9 ; Nd # [10] MEETEI MAYEK DIGIT ZERO..MEETEI MAYEK DIGIT NINE
|
||||
FF10..FF19 ; Nd # [10] FULLWIDTH DIGIT ZERO..FULLWIDTH DIGIT NINE
|
||||
104A0..104A9 ; Nd # [10] OSMANYA DIGIT ZERO..OSMANYA DIGIT NINE
|
||||
10D30..10D39 ; Nd # [10] HANIFI ROHINGYA DIGIT ZERO..HANIFI ROHINGYA DIGIT NINE
|
||||
11066..1106F ; Nd # [10] BRAHMI DIGIT ZERO..BRAHMI DIGIT NINE
|
||||
110F0..110F9 ; Nd # [10] SORA SOMPENG DIGIT ZERO..SORA SOMPENG DIGIT NINE
|
||||
11136..1113F ; Nd # [10] CHAKMA DIGIT ZERO..CHAKMA DIGIT NINE
|
||||
|
@ -3030,12 +3078,13 @@ FF10..FF19 ; Nd # [10] FULLWIDTH DIGIT ZERO..FULLWIDTH DIGIT NINE
|
|||
118E0..118E9 ; Nd # [10] WARANG CITI DIGIT ZERO..WARANG CITI DIGIT NINE
|
||||
11C50..11C59 ; Nd # [10] BHAIKSUKI DIGIT ZERO..BHAIKSUKI DIGIT NINE
|
||||
11D50..11D59 ; Nd # [10] MASARAM GONDI DIGIT ZERO..MASARAM GONDI DIGIT NINE
|
||||
11DA0..11DA9 ; Nd # [10] GUNJALA GONDI DIGIT ZERO..GUNJALA GONDI DIGIT NINE
|
||||
16A60..16A69 ; Nd # [10] MRO DIGIT ZERO..MRO DIGIT NINE
|
||||
16B50..16B59 ; Nd # [10] PAHAWH HMONG DIGIT ZERO..PAHAWH HMONG DIGIT NINE
|
||||
1D7CE..1D7FF ; Nd # [50] MATHEMATICAL BOLD DIGIT ZERO..MATHEMATICAL MONOSPACE DIGIT NINE
|
||||
1E950..1E959 ; Nd # [10] ADLAM DIGIT ZERO..ADLAM DIGIT NINE
|
||||
|
||||
# Total code points: 590
|
||||
# Total code points: 610
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -3102,7 +3151,7 @@ A830..A835 ; No # [6] NORTH INDIC FRACTION ONE QUARTER..NORTH INDIC FRACTIO
|
|||
109BC..109BD ; No # [2] MEROITIC CURSIVE FRACTION ELEVEN TWELFTHS..MEROITIC CURSIVE FRACTION ONE HALF
|
||||
109C0..109CF ; No # [16] MEROITIC CURSIVE NUMBER ONE..MEROITIC CURSIVE NUMBER SEVENTY
|
||||
109D2..109FF ; No # [46] MEROITIC CURSIVE NUMBER ONE HUNDRED..MEROITIC CURSIVE FRACTION TEN TWELFTHS
|
||||
10A40..10A47 ; No # [8] KHAROSHTHI DIGIT ONE..KHAROSHTHI NUMBER ONE THOUSAND
|
||||
10A40..10A48 ; No # [9] KHAROSHTHI DIGIT ONE..KHAROSHTHI FRACTION ONE HALF
|
||||
10A7D..10A7E ; No # [2] OLD SOUTH ARABIAN NUMBER ONE..OLD SOUTH ARABIAN NUMBER FIFTY
|
||||
10A9D..10A9F ; No # [3] OLD NORTH ARABIAN NUMBER ONE..OLD NORTH ARABIAN NUMBER TWENTY
|
||||
10AEB..10AEF ; No # [5] MANICHAEAN NUMBER ONE..MANICHAEAN NUMBER ONE HUNDRED
|
||||
|
@ -3111,17 +3160,24 @@ A830..A835 ; No # [6] NORTH INDIC FRACTION ONE QUARTER..NORTH INDIC FRACTIO
|
|||
10BA9..10BAF ; No # [7] PSALTER PAHLAVI NUMBER ONE..PSALTER PAHLAVI NUMBER ONE HUNDRED
|
||||
10CFA..10CFF ; No # [6] OLD HUNGARIAN NUMBER ONE..OLD HUNGARIAN NUMBER ONE THOUSAND
|
||||
10E60..10E7E ; No # [31] RUMI DIGIT ONE..RUMI FRACTION TWO THIRDS
|
||||
10F1D..10F26 ; No # [10] OLD SOGDIAN NUMBER ONE..OLD SOGDIAN FRACTION ONE HALF
|
||||
10F51..10F54 ; No # [4] SOGDIAN NUMBER ONE..SOGDIAN NUMBER ONE HUNDRED
|
||||
11052..11065 ; No # [20] BRAHMI NUMBER ONE..BRAHMI NUMBER ONE THOUSAND
|
||||
111E1..111F4 ; No # [20] SINHALA ARCHAIC DIGIT ONE..SINHALA ARCHAIC NUMBER ONE THOUSAND
|
||||
1173A..1173B ; No # [2] AHOM NUMBER TEN..AHOM NUMBER TWENTY
|
||||
118EA..118F2 ; No # [9] WARANG CITI NUMBER TEN..WARANG CITI NUMBER NINETY
|
||||
11C5A..11C6C ; No # [19] BHAIKSUKI NUMBER ONE..BHAIKSUKI HUNDREDS UNIT MARK
|
||||
16B5B..16B61 ; No # [7] PAHAWH HMONG NUMBER TENS..PAHAWH HMONG NUMBER TRILLIONS
|
||||
1D360..1D371 ; No # [18] COUNTING ROD UNIT DIGIT ONE..COUNTING ROD TENS DIGIT NINE
|
||||
16E80..16E96 ; No # [23] MEDEFAIDRIN DIGIT ZERO..MEDEFAIDRIN DIGIT THREE ALTERNATE FORM
|
||||
1D2E0..1D2F3 ; No # [20] MAYAN NUMERAL ZERO..MAYAN NUMERAL NINETEEN
|
||||
1D360..1D378 ; No # [25] COUNTING ROD UNIT DIGIT ONE..TALLY MARK FIVE
|
||||
1E8C7..1E8CF ; No # [9] MENDE KIKAKUI DIGIT ONE..MENDE KIKAKUI DIGIT NINE
|
||||
1EC71..1ECAB ; No # [59] INDIC SIYAQ NUMBER ONE..INDIC SIYAQ NUMBER PREFIXED NINE
|
||||
1ECAD..1ECAF ; No # [3] INDIC SIYAQ FRACTION ONE QUARTER..INDIC SIYAQ FRACTION THREE QUARTERS
|
||||
1ECB1..1ECB4 ; No # [4] INDIC SIYAQ NUMBER ALTERNATE ONE..INDIC SIYAQ ALTERNATE LAKH MARK
|
||||
1F100..1F10C ; No # [13] DIGIT ZERO FULL STOP..DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT ZERO
|
||||
|
||||
# Total code points: 676
|
||||
# Total code points: 807
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -3180,12 +3236,13 @@ A830..A835 ; No # [6] NORTH INDIC FRACTION ONE QUARTER..NORTH INDIC FRACTIO
|
|||
FEFF ; Cf # ZERO WIDTH NO-BREAK SPACE
|
||||
FFF9..FFFB ; Cf # [3] INTERLINEAR ANNOTATION ANCHOR..INTERLINEAR ANNOTATION TERMINATOR
|
||||
110BD ; Cf # KAITHI NUMBER SIGN
|
||||
110CD ; Cf # KAITHI NUMBER SIGN ABOVE
|
||||
1BCA0..1BCA3 ; Cf # [4] SHORTHAND FORMAT LETTER OVERLAP..SHORTHAND FORMAT UP STEP
|
||||
1D173..1D17A ; Cf # [8] MUSICAL SYMBOL BEGIN BEAM..MUSICAL SYMBOL END PHRASE
|
||||
E0001 ; Cf # LANGUAGE TAG
|
||||
E0020..E007F ; Cf # [96] TAG SPACE..CANCEL TAG
|
||||
|
||||
# Total code points: 151
|
||||
# Total code points: 152
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -3440,7 +3497,9 @@ FF3F ; Pc # FULLWIDTH LOW LINE
|
|||
0964..0965 ; Po # [2] DEVANAGARI DANDA..DEVANAGARI DOUBLE DANDA
|
||||
0970 ; Po # DEVANAGARI ABBREVIATION SIGN
|
||||
09FD ; Po # BENGALI ABBREVIATION SIGN
|
||||
0A76 ; Po # GURMUKHI ABBREVIATION SIGN
|
||||
0AF0 ; Po # GUJARATI ABBREVIATION SIGN
|
||||
0C84 ; Po # KANNADA SIGN SIDDHAM
|
||||
0DF4 ; Po # SINHALA PUNCTUATION KUNDDALIYA
|
||||
0E4F ; Po # THAI CHARACTER FONGMAN
|
||||
0E5A..0E5B ; Po # [2] THAI CHARACTER ANGKHANKHU..THAI CHARACTER KHOMUT
|
||||
|
@ -3491,7 +3550,7 @@ FF3F ; Pc # FULLWIDTH LOW LINE
|
|||
2E30..2E39 ; Po # [10] RING POINT..TOP HALF SECTION SIGN
|
||||
2E3C..2E3F ; Po # [4] STENOGRAPHIC FULL STOP..CAPITULUM
|
||||
2E41 ; Po # REVERSED COMMA
|
||||
2E43..2E49 ; Po # [7] DASH WITH LEFT UPTURN..DOUBLE STACKED COMMA
|
||||
2E43..2E4E ; Po # [12] DASH WITH LEFT UPTURN..PUNCTUS ELEVATUS MARK
|
||||
3001..3003 ; Po # [3] IDEOGRAPHIC COMMA..DITTO MARK
|
||||
303D ; Po # PART ALTERNATION MARK
|
||||
30FB ; Po # KATAKANA MIDDLE DOT
|
||||
|
@ -3544,12 +3603,13 @@ FF64..FF65 ; Po # [2] HALFWIDTH IDEOGRAPHIC COMMA..HALFWIDTH KATAKANA MIDDL
|
|||
10AF0..10AF6 ; Po # [7] MANICHAEAN PUNCTUATION STAR..MANICHAEAN PUNCTUATION LINE FILLER
|
||||
10B39..10B3F ; Po # [7] AVESTAN ABBREVIATION MARK..LARGE ONE RING OVER TWO RINGS PUNCTUATION
|
||||
10B99..10B9C ; Po # [4] PSALTER PAHLAVI SECTION MARK..PSALTER PAHLAVI FOUR DOTS WITH DOT
|
||||
10F55..10F59 ; Po # [5] SOGDIAN PUNCTUATION TWO VERTICAL BARS..SOGDIAN PUNCTUATION HALF CIRCLE WITH DOT
|
||||
11047..1104D ; Po # [7] BRAHMI DANDA..BRAHMI PUNCTUATION LOTUS
|
||||
110BB..110BC ; Po # [2] KAITHI ABBREVIATION SIGN..KAITHI ENUMERATION SIGN
|
||||
110BE..110C1 ; Po # [4] KAITHI SECTION MARK..KAITHI DOUBLE DANDA
|
||||
11140..11143 ; Po # [4] CHAKMA SECTION MARK..CHAKMA QUESTION MARK
|
||||
11174..11175 ; Po # [2] MAHAJANI ABBREVIATION SIGN..MAHAJANI SECTION MARK
|
||||
111C5..111C9 ; Po # [5] SHARADA DANDA..SHARADA SANDHI MARK
|
||||
111C5..111C8 ; Po # [4] SHARADA DANDA..SHARADA SEPARATOR
|
||||
111CD ; Po # SHARADA SUTRA MARK
|
||||
111DB ; Po # SHARADA SIGN SIDDHAM
|
||||
111DD..111DF ; Po # [3] SHARADA CONTINUATION SIGN..SHARADA SECTION MARK-2
|
||||
|
@ -3563,21 +3623,24 @@ FF64..FF65 ; Po # [2] HALFWIDTH IDEOGRAPHIC COMMA..HALFWIDTH KATAKANA MIDDL
|
|||
11641..11643 ; Po # [3] MODI DANDA..MODI ABBREVIATION SIGN
|
||||
11660..1166C ; Po # [13] MONGOLIAN BIRGA WITH ORNAMENT..MONGOLIAN TURNED SWIRL BIRGA WITH DOUBLE ORNAMENT
|
||||
1173C..1173E ; Po # [3] AHOM SIGN SMALL SECTION..AHOM SIGN RULAI
|
||||
1183B ; Po # DOGRA ABBREVIATION SIGN
|
||||
11A3F..11A46 ; Po # [8] ZANABAZAR SQUARE INITIAL HEAD MARK..ZANABAZAR SQUARE CLOSING DOUBLE-LINED HEAD MARK
|
||||
11A9A..11A9C ; Po # [3] SOYOMBO MARK TSHEG..SOYOMBO MARK DOUBLE SHAD
|
||||
11A9E..11AA2 ; Po # [5] SOYOMBO HEAD MARK WITH MOON AND SUN AND TRIPLE FLAME..SOYOMBO TERMINAL MARK-2
|
||||
11C41..11C45 ; Po # [5] BHAIKSUKI DANDA..BHAIKSUKI GAP FILLER-2
|
||||
11C70..11C71 ; Po # [2] MARCHEN HEAD MARK..MARCHEN MARK SHAD
|
||||
11EF7..11EF8 ; Po # [2] MAKASAR PASSIMBANG..MAKASAR END OF SECTION
|
||||
12470..12474 ; Po # [5] CUNEIFORM PUNCTUATION SIGN OLD ASSYRIAN WORD DIVIDER..CUNEIFORM PUNCTUATION SIGN DIAGONAL QUADCOLON
|
||||
16A6E..16A6F ; Po # [2] MRO DANDA..MRO DOUBLE DANDA
|
||||
16AF5 ; Po # BASSA VAH FULL STOP
|
||||
16B37..16B3B ; Po # [5] PAHAWH HMONG SIGN VOS THOM..PAHAWH HMONG SIGN VOS FEEM
|
||||
16B44 ; Po # PAHAWH HMONG SIGN XAUS
|
||||
16E97..16E9A ; Po # [4] MEDEFAIDRIN COMMA..MEDEFAIDRIN EXCLAMATION OH
|
||||
1BC9F ; Po # DUPLOYAN PUNCTUATION CHINOOK FULL STOP
|
||||
1DA87..1DA8B ; Po # [5] SIGNWRITING COMMA..SIGNWRITING PARENTHESIS
|
||||
1E95E..1E95F ; Po # [2] ADLAM INITIAL EXCLAMATION MARK..ADLAM INITIAL QUESTION MARK
|
||||
|
||||
# Total code points: 566
|
||||
# Total code points: 584
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -3658,6 +3721,7 @@ FFE9..FFEC ; Sm # [4] HALFWIDTH LEFTWARDS ARROW..HALFWIDTH DOWNWARDS ARROW
|
|||
00A2..00A5 ; Sc # [4] CENT SIGN..YEN SIGN
|
||||
058F ; Sc # ARMENIAN DRAM SIGN
|
||||
060B ; Sc # AFGHANI SIGN
|
||||
07FE..07FF ; Sc # [2] NKO DOROME SIGN..NKO TAMAN SIGN
|
||||
09F2..09F3 ; Sc # [2] BENGALI RUPEE MARK..BENGALI RUPEE SIGN
|
||||
09FB ; Sc # BENGALI GANDA MARK
|
||||
0AF1 ; Sc # GUJARATI RUPEE SIGN
|
||||
|
@ -3671,8 +3735,9 @@ FE69 ; Sc # SMALL DOLLAR SIGN
|
|||
FF04 ; Sc # FULLWIDTH DOLLAR SIGN
|
||||
FFE0..FFE1 ; Sc # [2] FULLWIDTH CENT SIGN..FULLWIDTH POUND SIGN
|
||||
FFE5..FFE6 ; Sc # [2] FULLWIDTH YEN SIGN..FULLWIDTH WON SIGN
|
||||
1ECB0 ; Sc # INDIC SIYAQ RUPEE MARK
|
||||
|
||||
# Total code points: 54
|
||||
# Total code points: 57
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -3793,10 +3858,8 @@ FFE3 ; Sk # FULLWIDTH MACRON
|
|||
2B45..2B46 ; So # [2] LEFTWARDS QUADRUPLE ARROW..RIGHTWARDS QUADRUPLE ARROW
|
||||
2B4D..2B73 ; So # [39] DOWNWARDS TRIANGLE-HEADED ZIGZAG ARROW..DOWNWARDS TRIANGLE-HEADED ARROW TO BAR
|
||||
2B76..2B95 ; So # [32] NORTH WEST TRIANGLE-HEADED ARROW TO BAR..RIGHTWARDS BLACK ARROW
|
||||
2B98..2BB9 ; So # [34] THREE-D TOP-LIGHTED LEFTWARDS EQUILATERAL ARROWHEAD..UP ARROWHEAD IN A RECTANGLE BOX
|
||||
2BBD..2BC8 ; So # [12] BALLOT BOX WITH LIGHT X..BLACK MEDIUM RIGHT-POINTING TRIANGLE CENTRED
|
||||
2BCA..2BD2 ; So # [9] TOP HALF BLACK CIRCLE..GROUP MARK
|
||||
2BEC..2BEF ; So # [4] LEFTWARDS TWO-HEADED ARROW WITH TRIANGLE ARROWHEADS..DOWNWARDS TWO-HEADED ARROW WITH TRIANGLE ARROWHEADS
|
||||
2B98..2BC8 ; So # [49] THREE-D TOP-LIGHTED LEFTWARDS EQUILATERAL ARROWHEAD..BLACK MEDIUM RIGHT-POINTING TRIANGLE CENTRED
|
||||
2BCA..2BFE ; So # [53] TOP HALF BLACK CIRCLE..REVERSED RIGHT ANGLE
|
||||
2CE5..2CEA ; So # [6] COPTIC SYMBOL MI RO..COPTIC SYMBOL SHIMA SIMA
|
||||
2E80..2E99 ; So # [26] CJK RADICAL REPEAT..CJK RADICAL RAP
|
||||
2E9B..2EF3 ; So # [89] CJK RADICAL CHOKE..CJK RADICAL C-SIMPLIFIED TURTLE
|
||||
|
@ -3855,14 +3918,14 @@ FFFC..FFFD ; So # [2] OBJECT REPLACEMENT CHARACTER..REPLACEMENT CHARACTER
|
|||
1DA6D..1DA74 ; So # [8] SIGNWRITING SHOULDER HIP SPINE..SIGNWRITING TORSO-FLOORPLANE TWISTING
|
||||
1DA76..1DA83 ; So # [14] SIGNWRITING LIMB COMBINATION..SIGNWRITING LOCATION DEPTH
|
||||
1DA85..1DA86 ; So # [2] SIGNWRITING LOCATION TORSO..SIGNWRITING LOCATION LIMBS DIGITS
|
||||
1ECAC ; So # INDIC SIYAQ PLACEHOLDER
|
||||
1F000..1F02B ; So # [44] MAHJONG TILE EAST WIND..MAHJONG TILE BACK
|
||||
1F030..1F093 ; So # [100] DOMINO TILE HORIZONTAL BACK..DOMINO TILE VERTICAL-06-06
|
||||
1F0A0..1F0AE ; So # [15] PLAYING CARD BACK..PLAYING CARD KING OF SPADES
|
||||
1F0B1..1F0BF ; So # [15] PLAYING CARD ACE OF HEARTS..PLAYING CARD RED JOKER
|
||||
1F0C1..1F0CF ; So # [15] PLAYING CARD ACE OF DIAMONDS..PLAYING CARD BLACK JOKER
|
||||
1F0D1..1F0F5 ; So # [37] PLAYING CARD ACE OF CLUBS..PLAYING CARD TRUMP-21
|
||||
1F110..1F12E ; So # [31] PARENTHESIZED LATIN CAPITAL LETTER A..CIRCLED WZ
|
||||
1F130..1F16B ; So # [60] SQUARED LATIN CAPITAL LETTER A..RAISED MD SIGN
|
||||
1F110..1F16B ; So # [92] PARENTHESIZED LATIN CAPITAL LETTER A..RAISED MD SIGN
|
||||
1F170..1F1AC ; So # [61] NEGATIVE SQUARED LATIN CAPITAL LETTER A..SQUARED VOD
|
||||
1F1E6..1F202 ; So # [29] REGIONAL INDICATOR SYMBOL LETTER A..SQUARED KATAKANA SA
|
||||
1F210..1F23B ; So # [44] SQUARED CJK UNIFIED IDEOGRAPH-624B..SQUARED CJK UNIFIED IDEOGRAPH-914D
|
||||
|
@ -3872,9 +3935,9 @@ FFFC..FFFD ; So # [2] OBJECT REPLACEMENT CHARACTER..REPLACEMENT CHARACTER
|
|||
1F300..1F3FA ; So # [251] CYCLONE..AMPHORA
|
||||
1F400..1F6D4 ; So # [725] RAT..PAGODA
|
||||
1F6E0..1F6EC ; So # [13] HAMMER AND WRENCH..AIRPLANE ARRIVING
|
||||
1F6F0..1F6F8 ; So # [9] SATELLITE..FLYING SAUCER
|
||||
1F6F0..1F6F9 ; So # [10] SATELLITE..SKATEBOARD
|
||||
1F700..1F773 ; So # [116] ALCHEMICAL SYMBOL FOR QUINTESSENCE..ALCHEMICAL SYMBOL FOR HALF OUNCE
|
||||
1F780..1F7D4 ; So # [85] BLACK LEFT-POINTING ISOSCELES RIGHT TRIANGLE..HEAVY TWELVE POINTED PINWHEEL STAR
|
||||
1F780..1F7D8 ; So # [89] BLACK LEFT-POINTING ISOSCELES RIGHT TRIANGLE..NEGATIVE CIRCLED SQUARE
|
||||
1F800..1F80B ; So # [12] LEFTWARDS ARROW WITH SMALL TRIANGLE ARROWHEAD..DOWNWARDS ARROW WITH LARGE TRIANGLE ARROWHEAD
|
||||
1F810..1F847 ; So # [56] LEFTWARDS ARROW WITH SMALL EQUILATERAL ARROWHEAD..DOWNWARDS HEAVY ARROW
|
||||
1F850..1F859 ; So # [10] LEFTWARDS SANS-SERIF ARROW..UP DOWN SANS-SERIF ARROW
|
||||
|
@ -3882,13 +3945,16 @@ FFFC..FFFD ; So # [2] OBJECT REPLACEMENT CHARACTER..REPLACEMENT CHARACTER
|
|||
1F890..1F8AD ; So # [30] LEFTWARDS TRIANGLE ARROWHEAD..WHITE ARROW SHAFT WIDTH TWO THIRDS
|
||||
1F900..1F90B ; So # [12] CIRCLED CROSS FORMEE WITH FOUR DOTS..DOWNWARD FACING NOTCHED HOOK WITH DOT
|
||||
1F910..1F93E ; So # [47] ZIPPER-MOUTH FACE..HANDBALL
|
||||
1F940..1F94C ; So # [13] WILTED FLOWER..CURLING STONE
|
||||
1F950..1F96B ; So # [28] CROISSANT..CANNED FOOD
|
||||
1F980..1F997 ; So # [24] CRAB..CRICKET
|
||||
1F9C0 ; So # CHEESE WEDGE
|
||||
1F9D0..1F9E6 ; So # [23] FACE WITH MONOCLE..SOCKS
|
||||
1F940..1F970 ; So # [49] WILTED FLOWER..SMILING FACE WITH SMILING EYES AND THREE HEARTS
|
||||
1F973..1F976 ; So # [4] FACE WITH PARTY HORN AND PARTY HAT..FREEZING FACE
|
||||
1F97A ; So # FACE WITH PLEADING EYES
|
||||
1F97C..1F9A2 ; So # [39] LAB COAT..SWAN
|
||||
1F9B0..1F9B9 ; So # [10] EMOJI COMPONENT RED HAIR..SUPERVILLAIN
|
||||
1F9C0..1F9C2 ; So # [3] CHEESE WEDGE..SALT SHAKER
|
||||
1F9D0..1F9FF ; So # [48] FACE WITH MONOCLE..NAZAR AMULET
|
||||
1FA60..1FA6D ; So # [14] XIANGQI RED GENERAL..XIANGQI BLACK SOLDIER
|
||||
|
||||
# Total code points: 5855
|
||||
# Total code points: 5984
|
||||
|
||||
# ================================================
|
||||
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
# GraphemeBreakProperty-10.0.0.txt
|
||||
# Date: 2017-03-12, 07:03:41 GMT
|
||||
# © 2017 Unicode®, Inc.
|
||||
# GraphemeBreakProperty-11.0.0.txt
|
||||
# Date: 2018-03-16, 20:34:02 GMT
|
||||
# © 2018 Unicode®, Inc.
|
||||
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
|
||||
# For terms of use, see http://www.unicode.org/terms_of_use.html
|
||||
#
|
||||
|
@ -24,12 +24,13 @@
|
|||
08E2 ; Prepend # Cf ARABIC DISPUTED END OF AYAH
|
||||
0D4E ; Prepend # Lo MALAYALAM LETTER DOT REPH
|
||||
110BD ; Prepend # Cf KAITHI NUMBER SIGN
|
||||
110CD ; Prepend # Cf KAITHI NUMBER SIGN ABOVE
|
||||
111C2..111C3 ; Prepend # Lo [2] SHARADA SIGN JIHVAMULIYA..SHARADA SIGN UPADHMANIYA
|
||||
11A3A ; Prepend # Lo ZANABAZAR SQUARE CLUSTER-INITIAL LETTER RA
|
||||
11A86..11A89 ; Prepend # Lo [4] SOYOMBO CLUSTER-INITIAL LETTER RA..SOYOMBO CLUSTER-INITIAL LETTER SA
|
||||
11D46 ; Prepend # Lo MASARAM GONDI REPHA
|
||||
|
||||
# Total code points: 19
|
||||
# Total code points: 20
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -95,12 +96,13 @@ E01F0..E0FFF ; Control # Cn [3600] <reserved-E01F0>..<reserved-E0FFF>
|
|||
0730..074A ; Extend # Mn [27] SYRIAC PTHAHA ABOVE..SYRIAC BARREKH
|
||||
07A6..07B0 ; Extend # Mn [11] THAANA ABAFILI..THAANA SUKUN
|
||||
07EB..07F3 ; Extend # Mn [9] NKO COMBINING SHORT HIGH TONE..NKO COMBINING DOUBLE DOT ABOVE
|
||||
07FD ; Extend # Mn NKO DANTAYALAN
|
||||
0816..0819 ; Extend # Mn [4] SAMARITAN MARK IN..SAMARITAN MARK DAGESH
|
||||
081B..0823 ; Extend # Mn [9] SAMARITAN MARK EPENTHETIC YUT..SAMARITAN VOWEL SIGN A
|
||||
0825..0827 ; Extend # Mn [3] SAMARITAN VOWEL SIGN SHORT A..SAMARITAN VOWEL SIGN U
|
||||
0829..082D ; Extend # Mn [5] SAMARITAN VOWEL SIGN LONG I..SAMARITAN MARK NEQUDAA
|
||||
0859..085B ; Extend # Mn [3] MANDAIC AFFRICATION MARK..MANDAIC GEMINATION MARK
|
||||
08D4..08E1 ; Extend # Mn [14] ARABIC SMALL HIGH WORD AR-RUB..ARABIC SMALL HIGH SIGN SAFHA
|
||||
08D3..08E1 ; Extend # Mn [15] ARABIC SMALL LOW WAW..ARABIC SMALL HIGH SIGN SAFHA
|
||||
08E3..0902 ; Extend # Mn [32] ARABIC TURNED DAMMA BELOW..DEVANAGARI SIGN ANUSVARA
|
||||
093A ; Extend # Mn DEVANAGARI VOWEL SIGN OE
|
||||
093C ; Extend # Mn DEVANAGARI SIGN NUKTA
|
||||
|
@ -115,6 +117,7 @@ E01F0..E0FFF ; Control # Cn [3600] <reserved-E01F0>..<reserved-E0FFF>
|
|||
09CD ; Extend # Mn BENGALI SIGN VIRAMA
|
||||
09D7 ; Extend # Mc BENGALI AU LENGTH MARK
|
||||
09E2..09E3 ; Extend # Mn [2] BENGALI VOWEL SIGN VOCALIC L..BENGALI VOWEL SIGN VOCALIC LL
|
||||
09FE ; Extend # Mn BENGALI SANDHI MARK
|
||||
0A01..0A02 ; Extend # Mn [2] GURMUKHI SIGN ADAK BINDI..GURMUKHI SIGN BINDI
|
||||
0A3C ; Extend # Mn GURMUKHI SIGN NUKTA
|
||||
0A41..0A42 ; Extend # Mn [2] GURMUKHI VOWEL SIGN U..GURMUKHI VOWEL SIGN UU
|
||||
|
@ -145,6 +148,7 @@ E01F0..E0FFF ; Control # Cn [3600] <reserved-E01F0>..<reserved-E0FFF>
|
|||
0BCD ; Extend # Mn TAMIL SIGN VIRAMA
|
||||
0BD7 ; Extend # Mc TAMIL AU LENGTH MARK
|
||||
0C00 ; Extend # Mn TELUGU SIGN COMBINING CANDRABINDU ABOVE
|
||||
0C04 ; Extend # Mn TELUGU SIGN COMBINING ANUSVARA ABOVE
|
||||
0C3E..0C40 ; Extend # Mn [3] TELUGU VOWEL SIGN AA..TELUGU VOWEL SIGN II
|
||||
0C46..0C48 ; Extend # Mn [3] TELUGU VOWEL SIGN E..TELUGU VOWEL SIGN AI
|
||||
0C4A..0C4D ; Extend # Mn [4] TELUGU VOWEL SIGN O..TELUGU SIGN VIRAMA
|
||||
|
@ -273,6 +277,7 @@ A80B ; Extend # Mn SYLOTI NAGRI SIGN ANUSVARA
|
|||
A825..A826 ; Extend # Mn [2] SYLOTI NAGRI VOWEL SIGN U..SYLOTI NAGRI VOWEL SIGN E
|
||||
A8C4..A8C5 ; Extend # Mn [2] SAURASHTRA SIGN VIRAMA..SAURASHTRA SIGN CANDRABINDU
|
||||
A8E0..A8F1 ; Extend # Mn [18] COMBINING DEVANAGARI DIGIT ZERO..COMBINING DEVANAGARI SIGN AVAGRAHA
|
||||
A8FF ; Extend # Mn DEVANAGARI VOWEL SIGN AY
|
||||
A926..A92D ; Extend # Mn [8] KAYAH LI VOWEL UE..KAYAH LI TONE CALYA PLOPHU
|
||||
A947..A951 ; Extend # Mn [11] REJANG VOWEL SIGN I..REJANG CONSONANT SIGN R
|
||||
A980..A982 ; Extend # Mn [3] JAVANESE SIGN PANYANGGA..JAVANESE SIGN LAYAR
|
||||
|
@ -309,6 +314,8 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
|
|||
10A38..10A3A ; Extend # Mn [3] KHAROSHTHI SIGN BAR ABOVE..KHAROSHTHI SIGN DOT BELOW
|
||||
10A3F ; Extend # Mn KHAROSHTHI VIRAMA
|
||||
10AE5..10AE6 ; Extend # Mn [2] MANICHAEAN ABBREVIATION MARK ABOVE..MANICHAEAN ABBREVIATION MARK BELOW
|
||||
10D24..10D27 ; Extend # Mn [4] HANIFI ROHINGYA SIGN HARBAHAY..HANIFI ROHINGYA SIGN TASSI
|
||||
10F46..10F50 ; Extend # Mn [11] SOGDIAN COMBINING DOT BELOW..SOGDIAN COMBINING STROKE BELOW
|
||||
11001 ; Extend # Mn BRAHMI SIGN ANUSVARA
|
||||
11038..11046 ; Extend # Mn [15] BRAHMI VOWEL SIGN AA..BRAHMI VIRAMA
|
||||
1107F..11081 ; Extend # Mn [3] BRAHMI NUMBER JOINER..KAITHI SIGN ANUSVARA
|
||||
|
@ -320,7 +327,7 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
|
|||
11173 ; Extend # Mn MAHAJANI SIGN NUKTA
|
||||
11180..11181 ; Extend # Mn [2] SHARADA SIGN CANDRABINDU..SHARADA SIGN ANUSVARA
|
||||
111B6..111BE ; Extend # Mn [9] SHARADA VOWEL SIGN U..SHARADA VOWEL SIGN O
|
||||
111CA..111CC ; Extend # Mn [3] SHARADA SIGN NUKTA..SHARADA EXTRA SHORT VOWEL MARK
|
||||
111C9..111CC ; Extend # Mn [4] SHARADA SANDHI MARK..SHARADA EXTRA SHORT VOWEL MARK
|
||||
1122F..11231 ; Extend # Mn [3] KHOJKI VOWEL SIGN U..KHOJKI VOWEL SIGN AI
|
||||
11234 ; Extend # Mn KHOJKI SIGN ANUSVARA
|
||||
11236..11237 ; Extend # Mn [2] KHOJKI SIGN NUKTA..KHOJKI SIGN SHADDA
|
||||
|
@ -328,7 +335,7 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
|
|||
112DF ; Extend # Mn KHUDAWADI SIGN ANUSVARA
|
||||
112E3..112EA ; Extend # Mn [8] KHUDAWADI VOWEL SIGN U..KHUDAWADI SIGN VIRAMA
|
||||
11300..11301 ; Extend # Mn [2] GRANTHA SIGN COMBINING ANUSVARA ABOVE..GRANTHA SIGN CANDRABINDU
|
||||
1133C ; Extend # Mn GRANTHA SIGN NUKTA
|
||||
1133B..1133C ; Extend # Mn [2] COMBINING BINDU BELOW..GRANTHA SIGN NUKTA
|
||||
1133E ; Extend # Mc GRANTHA VOWEL SIGN AA
|
||||
11340 ; Extend # Mn GRANTHA VOWEL SIGN II
|
||||
11357 ; Extend # Mc GRANTHA AU LENGTH MARK
|
||||
|
@ -337,6 +344,7 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
|
|||
11438..1143F ; Extend # Mn [8] NEWA VOWEL SIGN U..NEWA VOWEL SIGN AI
|
||||
11442..11444 ; Extend # Mn [3] NEWA SIGN VIRAMA..NEWA SIGN ANUSVARA
|
||||
11446 ; Extend # Mn NEWA SIGN NUKTA
|
||||
1145E ; Extend # Mn NEWA SANDHI MARK
|
||||
114B0 ; Extend # Mc TIRHUTA VOWEL SIGN AA
|
||||
114B3..114B8 ; Extend # Mn [6] TIRHUTA VOWEL SIGN U..TIRHUTA VOWEL SIGN VOCALIC LL
|
||||
114BA ; Extend # Mn TIRHUTA VOWEL SIGN SHORT E
|
||||
|
@ -358,8 +366,9 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
|
|||
1171D..1171F ; Extend # Mn [3] AHOM CONSONANT SIGN MEDIAL LA..AHOM CONSONANT SIGN MEDIAL LIGATING RA
|
||||
11722..11725 ; Extend # Mn [4] AHOM VOWEL SIGN I..AHOM VOWEL SIGN UU
|
||||
11727..1172B ; Extend # Mn [5] AHOM VOWEL SIGN AW..AHOM SIGN KILLER
|
||||
11A01..11A06 ; Extend # Mn [6] ZANABAZAR SQUARE VOWEL SIGN I..ZANABAZAR SQUARE VOWEL SIGN O
|
||||
11A09..11A0A ; Extend # Mn [2] ZANABAZAR SQUARE VOWEL SIGN REVERSED I..ZANABAZAR SQUARE VOWEL LENGTH MARK
|
||||
1182F..11837 ; Extend # Mn [9] DOGRA VOWEL SIGN U..DOGRA SIGN ANUSVARA
|
||||
11839..1183A ; Extend # Mn [2] DOGRA SIGN VIRAMA..DOGRA SIGN NUKTA
|
||||
11A01..11A0A ; Extend # Mn [10] ZANABAZAR SQUARE VOWEL SIGN I..ZANABAZAR SQUARE VOWEL LENGTH MARK
|
||||
11A33..11A38 ; Extend # Mn [6] ZANABAZAR SQUARE FINAL CONSONANT MARK..ZANABAZAR SQUARE SIGN ANUSVARA
|
||||
11A3B..11A3E ; Extend # Mn [4] ZANABAZAR SQUARE CLUSTER-FINAL LETTER YA..ZANABAZAR SQUARE CLUSTER-FINAL LETTER VA
|
||||
11A47 ; Extend # Mn ZANABAZAR SQUARE SUBJOINER
|
||||
|
@ -379,6 +388,10 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
|
|||
11D3C..11D3D ; Extend # Mn [2] MASARAM GONDI VOWEL SIGN AI..MASARAM GONDI VOWEL SIGN O
|
||||
11D3F..11D45 ; Extend # Mn [7] MASARAM GONDI VOWEL SIGN AU..MASARAM GONDI VIRAMA
|
||||
11D47 ; Extend # Mn MASARAM GONDI RA-KARA
|
||||
11D90..11D91 ; Extend # Mn [2] GUNJALA GONDI VOWEL SIGN EE..GUNJALA GONDI VOWEL SIGN AI
|
||||
11D95 ; Extend # Mn GUNJALA GONDI SIGN ANUSVARA
|
||||
11D97 ; Extend # Mn GUNJALA GONDI VIRAMA
|
||||
11EF3..11EF4 ; Extend # Mn [2] MAKASAR VOWEL SIGN I..MAKASAR VOWEL SIGN U
|
||||
16AF0..16AF4 ; Extend # Mn [5] BASSA VAH COMBINING HIGH TONE..BASSA VAH COMBINING HIGH-LOW TONE
|
||||
16B30..16B36 ; Extend # Mn [7] PAHAWH HMONG MARK CIM TUB..PAHAWH HMONG MARK CIM TAUM
|
||||
16F8F..16F92 ; Extend # Mn [4] MIAO TONE RIGHT..MIAO TONE BELOW
|
||||
|
@ -403,10 +416,11 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
|
|||
1E026..1E02A ; Extend # Mn [5] COMBINING GLAGOLITIC LETTER YO..COMBINING GLAGOLITIC LETTER FITA
|
||||
1E8D0..1E8D6 ; Extend # Mn [7] MENDE KIKAKUI COMBINING NUMBER TEENS..MENDE KIKAKUI COMBINING NUMBER MILLIONS
|
||||
1E944..1E94A ; Extend # Mn [7] ADLAM ALIF LENGTHENER..ADLAM NUKTA
|
||||
1F3FB..1F3FF ; Extend # Sk [5] EMOJI MODIFIER FITZPATRICK TYPE-1-2..EMOJI MODIFIER FITZPATRICK TYPE-6
|
||||
E0020..E007F ; Extend # Cf [96] TAG SPACE..CANCEL TAG
|
||||
E0100..E01EF ; Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
|
||||
|
||||
# Total code points: 1901
|
||||
# Total code points: 1948
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -517,6 +531,7 @@ ABEC ; SpacingMark # Mc MEETEI MAYEK LUM IYEK
|
|||
110B0..110B2 ; SpacingMark # Mc [3] KAITHI VOWEL SIGN AA..KAITHI VOWEL SIGN II
|
||||
110B7..110B8 ; SpacingMark # Mc [2] KAITHI VOWEL SIGN O..KAITHI VOWEL SIGN AU
|
||||
1112C ; SpacingMark # Mc CHAKMA VOWEL SIGN E
|
||||
11145..11146 ; SpacingMark # Mc [2] CHAKMA VOWEL SIGN AA..CHAKMA VOWEL SIGN EI
|
||||
11182 ; SpacingMark # Mc SHARADA SIGN VISARGA
|
||||
111B3..111B5 ; SpacingMark # Mc [3] SHARADA VOWEL SIGN AA..SHARADA VOWEL SIGN II
|
||||
111BF..111C0 ; SpacingMark # Mc [2] SHARADA VOWEL SIGN AU..SHARADA SIGN VIRAMA
|
||||
|
@ -549,7 +564,8 @@ ABEC ; SpacingMark # Mc MEETEI MAYEK LUM IYEK
|
|||
116B6 ; SpacingMark # Mc TAKRI SIGN VIRAMA
|
||||
11720..11721 ; SpacingMark # Mc [2] AHOM VOWEL SIGN A..AHOM VOWEL SIGN AA
|
||||
11726 ; SpacingMark # Mc AHOM VOWEL SIGN E
|
||||
11A07..11A08 ; SpacingMark # Mc [2] ZANABAZAR SQUARE VOWEL SIGN AI..ZANABAZAR SQUARE VOWEL SIGN AU
|
||||
1182C..1182E ; SpacingMark # Mc [3] DOGRA VOWEL SIGN AA..DOGRA VOWEL SIGN II
|
||||
11838 ; SpacingMark # Mc DOGRA SIGN VISARGA
|
||||
11A39 ; SpacingMark # Mc ZANABAZAR SQUARE SIGN VISARGA
|
||||
11A57..11A58 ; SpacingMark # Mc [2] SOYOMBO VOWEL SIGN AI..SOYOMBO VOWEL SIGN AU
|
||||
11A97 ; SpacingMark # Mc SOYOMBO SIGN VISARGA
|
||||
|
@ -558,11 +574,15 @@ ABEC ; SpacingMark # Mc MEETEI MAYEK LUM IYEK
|
|||
11CA9 ; SpacingMark # Mc MARCHEN SUBJOINED LETTER YA
|
||||
11CB1 ; SpacingMark # Mc MARCHEN VOWEL SIGN I
|
||||
11CB4 ; SpacingMark # Mc MARCHEN VOWEL SIGN O
|
||||
11D8A..11D8E ; SpacingMark # Mc [5] GUNJALA GONDI VOWEL SIGN AA..GUNJALA GONDI VOWEL SIGN UU
|
||||
11D93..11D94 ; SpacingMark # Mc [2] GUNJALA GONDI VOWEL SIGN OO..GUNJALA GONDI VOWEL SIGN AU
|
||||
11D96 ; SpacingMark # Mc GUNJALA GONDI SIGN VISARGA
|
||||
11EF5..11EF6 ; SpacingMark # Mc [2] MAKASAR VOWEL SIGN E..MAKASAR VOWEL SIGN O
|
||||
16F51..16F7E ; SpacingMark # Mc [46] MIAO SIGN ASPIRATION..MIAO VOWEL SIGN NG
|
||||
1D166 ; SpacingMark # Mc MUSICAL SYMBOL COMBINING SPRECHGESANG STEM
|
||||
1D16D ; SpacingMark # Mc MUSICAL SYMBOL COMBINING AUGMENTATION DOT
|
||||
|
||||
# Total code points: 348
|
||||
# Total code points: 362
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -1395,81 +1415,8 @@ D789..D7A3 ; LVT # Lo [27] HANGUL SYLLABLE HIG..HANGUL SYLLABLE HIH
|
|||
|
||||
# ================================================
|
||||
|
||||
261D ; E_Base # So WHITE UP POINTING INDEX
|
||||
26F9 ; E_Base # So PERSON WITH BALL
|
||||
270A..270D ; E_Base # So [4] RAISED FIST..WRITING HAND
|
||||
1F385 ; E_Base # So FATHER CHRISTMAS
|
||||
1F3C2..1F3C4 ; E_Base # So [3] SNOWBOARDER..SURFER
|
||||
1F3C7 ; E_Base # So HORSE RACING
|
||||
1F3CA..1F3CC ; E_Base # So [3] SWIMMER..GOLFER
|
||||
1F442..1F443 ; E_Base # So [2] EAR..NOSE
|
||||
1F446..1F450 ; E_Base # So [11] WHITE UP POINTING BACKHAND INDEX..OPEN HANDS SIGN
|
||||
1F46E ; E_Base # So POLICE OFFICER
|
||||
1F470..1F478 ; E_Base # So [9] BRIDE WITH VEIL..PRINCESS
|
||||
1F47C ; E_Base # So BABY ANGEL
|
||||
1F481..1F483 ; E_Base # So [3] INFORMATION DESK PERSON..DANCER
|
||||
1F485..1F487 ; E_Base # So [3] NAIL POLISH..HAIRCUT
|
||||
1F4AA ; E_Base # So FLEXED BICEPS
|
||||
1F574..1F575 ; E_Base # So [2] MAN IN BUSINESS SUIT LEVITATING..SLEUTH OR SPY
|
||||
1F57A ; E_Base # So MAN DANCING
|
||||
1F590 ; E_Base # So RAISED HAND WITH FINGERS SPLAYED
|
||||
1F595..1F596 ; E_Base # So [2] REVERSED HAND WITH MIDDLE FINGER EXTENDED..RAISED HAND WITH PART BETWEEN MIDDLE AND RING FINGERS
|
||||
1F645..1F647 ; E_Base # So [3] FACE WITH NO GOOD GESTURE..PERSON BOWING DEEPLY
|
||||
1F64B..1F64F ; E_Base # So [5] HAPPY PERSON RAISING ONE HAND..PERSON WITH FOLDED HANDS
|
||||
1F6A3 ; E_Base # So ROWBOAT
|
||||
1F6B4..1F6B6 ; E_Base # So [3] BICYCLIST..PEDESTRIAN
|
||||
1F6C0 ; E_Base # So BATH
|
||||
1F6CC ; E_Base # So SLEEPING ACCOMMODATION
|
||||
1F918..1F91C ; E_Base # So [5] SIGN OF THE HORNS..RIGHT-FACING FIST
|
||||
1F91E..1F91F ; E_Base # So [2] HAND WITH INDEX AND MIDDLE FINGERS CROSSED..I LOVE YOU HAND SIGN
|
||||
1F926 ; E_Base # So FACE PALM
|
||||
1F930..1F939 ; E_Base # So [10] PREGNANT WOMAN..JUGGLING
|
||||
1F93D..1F93E ; E_Base # So [2] WATER POLO..HANDBALL
|
||||
1F9D1..1F9DD ; E_Base # So [13] ADULT..ELF
|
||||
|
||||
# Total code points: 98
|
||||
|
||||
# ================================================
|
||||
|
||||
1F3FB..1F3FF ; E_Modifier # Sk [5] EMOJI MODIFIER FITZPATRICK TYPE-1-2..EMOJI MODIFIER FITZPATRICK TYPE-6
|
||||
|
||||
# Total code points: 5
|
||||
|
||||
# ================================================
|
||||
|
||||
200D ; ZWJ # Cf ZERO WIDTH JOINER
|
||||
|
||||
# Total code points: 1
|
||||
|
||||
# ================================================
|
||||
|
||||
2640 ; Glue_After_Zwj # So FEMALE SIGN
|
||||
2642 ; Glue_After_Zwj # So MALE SIGN
|
||||
2695..2696 ; Glue_After_Zwj # So [2] STAFF OF AESCULAPIUS..SCALES
|
||||
2708 ; Glue_After_Zwj # So AIRPLANE
|
||||
2764 ; Glue_After_Zwj # So HEAVY BLACK HEART
|
||||
1F308 ; Glue_After_Zwj # So RAINBOW
|
||||
1F33E ; Glue_After_Zwj # So EAR OF RICE
|
||||
1F373 ; Glue_After_Zwj # So COOKING
|
||||
1F393 ; Glue_After_Zwj # So GRADUATION CAP
|
||||
1F3A4 ; Glue_After_Zwj # So MICROPHONE
|
||||
1F3A8 ; Glue_After_Zwj # So ARTIST PALETTE
|
||||
1F3EB ; Glue_After_Zwj # So SCHOOL
|
||||
1F3ED ; Glue_After_Zwj # So FACTORY
|
||||
1F48B ; Glue_After_Zwj # So KISS MARK
|
||||
1F4BB..1F4BC ; Glue_After_Zwj # So [2] PERSONAL COMPUTER..BRIEFCASE
|
||||
1F527 ; Glue_After_Zwj # So WRENCH
|
||||
1F52C ; Glue_After_Zwj # So MICROSCOPE
|
||||
1F5E8 ; Glue_After_Zwj # So LEFT SPEECH BUBBLE
|
||||
1F680 ; Glue_After_Zwj # So ROCKET
|
||||
1F692 ; Glue_After_Zwj # So FIRE ENGINE
|
||||
|
||||
# Total code points: 22
|
||||
|
||||
# ================================================
|
||||
|
||||
1F466..1F469 ; E_Base_GAZ # So [4] BOY..WOMAN
|
||||
|
||||
# Total code points: 4
|
||||
|
||||
# EOF
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
# Scripts-10.0.0.txt
|
||||
# Date: 2017-03-11, 06:40:37 GMT
|
||||
# © 2017 Unicode®, Inc.
|
||||
# Scripts-11.0.0.txt
|
||||
# Date: 2018-02-21, 05:34:31 GMT
|
||||
# © 2018 Unicode®, Inc.
|
||||
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
|
||||
# For terms of use, see http://www.unicode.org/terms_of_use.html
|
||||
#
|
||||
|
@ -308,10 +308,8 @@
|
|||
2B47..2B4C ; Common # Sm [6] REVERSE TILDE OPERATOR ABOVE RIGHTWARDS ARROW..RIGHTWARDS ARROW ABOVE REVERSE TILDE OPERATOR
|
||||
2B4D..2B73 ; Common # So [39] DOWNWARDS TRIANGLE-HEADED ZIGZAG ARROW..DOWNWARDS TRIANGLE-HEADED ARROW TO BAR
|
||||
2B76..2B95 ; Common # So [32] NORTH WEST TRIANGLE-HEADED ARROW TO BAR..RIGHTWARDS BLACK ARROW
|
||||
2B98..2BB9 ; Common # So [34] THREE-D TOP-LIGHTED LEFTWARDS EQUILATERAL ARROWHEAD..UP ARROWHEAD IN A RECTANGLE BOX
|
||||
2BBD..2BC8 ; Common # So [12] BALLOT BOX WITH LIGHT X..BLACK MEDIUM RIGHT-POINTING TRIANGLE CENTRED
|
||||
2BCA..2BD2 ; Common # So [9] TOP HALF BLACK CIRCLE..GROUP MARK
|
||||
2BEC..2BEF ; Common # So [4] LEFTWARDS TWO-HEADED ARROW WITH TRIANGLE ARROWHEADS..DOWNWARDS TWO-HEADED ARROW WITH TRIANGLE ARROWHEADS
|
||||
2B98..2BC8 ; Common # So [49] THREE-D TOP-LIGHTED LEFTWARDS EQUILATERAL ARROWHEAD..BLACK MEDIUM RIGHT-POINTING TRIANGLE CENTRED
|
||||
2BCA..2BFE ; Common # So [53] TOP HALF BLACK CIRCLE..REVERSED RIGHT ANGLE
|
||||
2E00..2E01 ; Common # Po [2] RIGHT ANGLE SUBSTITUTION MARKER..RIGHT ANGLE DOTTED SUBSTITUTION MARKER
|
||||
2E02 ; Common # Pi LEFT SUBSTITUTION BRACKET
|
||||
2E03 ; Common # Pf RIGHT SUBSTITUTION BRACKET
|
||||
|
@ -349,7 +347,7 @@
|
|||
2E40 ; Common # Pd DOUBLE HYPHEN
|
||||
2E41 ; Common # Po REVERSED COMMA
|
||||
2E42 ; Common # Ps DOUBLE LOW-REVERSED-9 QUOTATION MARK
|
||||
2E43..2E49 ; Common # Po [7] DASH WITH LEFT UPTURN..DOUBLE STACKED COMMA
|
||||
2E43..2E4E ; Common # Po [12] DASH WITH LEFT UPTURN..PUNCTUS ELEVATUS MARK
|
||||
2FF0..2FFB ; Common # So [12] IDEOGRAPHIC DESCRIPTION CHARACTER LEFT TO RIGHT..IDEOGRAPHIC DESCRIPTION CHARACTER OVERLAID
|
||||
3000 ; Common # Zs IDEOGRAPHIC SPACE
|
||||
3001..3003 ; Common # Po [3] IDEOGRAPHIC COMMA..DITTO MARK
|
||||
|
@ -522,8 +520,9 @@ FFFC..FFFD ; Common # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEMENT CHAR
|
|||
1D183..1D184 ; Common # So [2] MUSICAL SYMBOL ARPEGGIATO UP..MUSICAL SYMBOL ARPEGGIATO DOWN
|
||||
1D18C..1D1A9 ; Common # So [30] MUSICAL SYMBOL RINFORZANDO..MUSICAL SYMBOL DEGREE SLASH
|
||||
1D1AE..1D1E8 ; Common # So [59] MUSICAL SYMBOL PEDAL MARK..MUSICAL SYMBOL KIEVAN FLAT SIGN
|
||||
1D2E0..1D2F3 ; Common # No [20] MAYAN NUMERAL ZERO..MAYAN NUMERAL NINETEEN
|
||||
1D300..1D356 ; Common # So [87] MONOGRAM FOR EARTH..TETRAGRAM FOR FOSTERING
|
||||
1D360..1D371 ; Common # No [18] COUNTING ROD UNIT DIGIT ONE..COUNTING ROD TENS DIGIT NINE
|
||||
1D360..1D378 ; Common # No [25] COUNTING ROD UNIT DIGIT ONE..TALLY MARK FIVE
|
||||
1D400..1D454 ; Common # L& [85] MATHEMATICAL BOLD CAPITAL A..MATHEMATICAL ITALIC SMALL G
|
||||
1D456..1D49C ; Common # L& [71] MATHEMATICAL ITALIC SMALL I..MATHEMATICAL SCRIPT CAPITAL A
|
||||
1D49E..1D49F ; Common # L& [2] MATHEMATICAL SCRIPT CAPITAL C..MATHEMATICAL SCRIPT CAPITAL D
|
||||
|
@ -565,6 +564,11 @@ FFFC..FFFD ; Common # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEMENT CHAR
|
|||
1D7C3 ; Common # Sm MATHEMATICAL SANS-SERIF BOLD ITALIC PARTIAL DIFFERENTIAL
|
||||
1D7C4..1D7CB ; Common # L& [8] MATHEMATICAL SANS-SERIF BOLD ITALIC EPSILON SYMBOL..MATHEMATICAL BOLD SMALL DIGAMMA
|
||||
1D7CE..1D7FF ; Common # Nd [50] MATHEMATICAL BOLD DIGIT ZERO..MATHEMATICAL MONOSPACE DIGIT NINE
|
||||
1EC71..1ECAB ; Common # No [59] INDIC SIYAQ NUMBER ONE..INDIC SIYAQ NUMBER PREFIXED NINE
|
||||
1ECAC ; Common # So INDIC SIYAQ PLACEHOLDER
|
||||
1ECAD..1ECAF ; Common # No [3] INDIC SIYAQ FRACTION ONE QUARTER..INDIC SIYAQ FRACTION THREE QUARTERS
|
||||
1ECB0 ; Common # Sc INDIC SIYAQ RUPEE MARK
|
||||
1ECB1..1ECB4 ; Common # No [4] INDIC SIYAQ NUMBER ALTERNATE ONE..INDIC SIYAQ ALTERNATE LAKH MARK
|
||||
1F000..1F02B ; Common # So [44] MAHJONG TILE EAST WIND..MAHJONG TILE BACK
|
||||
1F030..1F093 ; Common # So [100] DOMINO TILE HORIZONTAL BACK..DOMINO TILE VERTICAL-06-06
|
||||
1F0A0..1F0AE ; Common # So [15] PLAYING CARD BACK..PLAYING CARD KING OF SPADES
|
||||
|
@ -572,8 +576,7 @@ FFFC..FFFD ; Common # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEMENT CHAR
|
|||
1F0C1..1F0CF ; Common # So [15] PLAYING CARD ACE OF DIAMONDS..PLAYING CARD BLACK JOKER
|
||||
1F0D1..1F0F5 ; Common # So [37] PLAYING CARD ACE OF CLUBS..PLAYING CARD TRUMP-21
|
||||
1F100..1F10C ; Common # No [13] DIGIT ZERO FULL STOP..DINGBAT NEGATIVE CIRCLED SANS-SERIF DIGIT ZERO
|
||||
1F110..1F12E ; Common # So [31] PARENTHESIZED LATIN CAPITAL LETTER A..CIRCLED WZ
|
||||
1F130..1F16B ; Common # So [60] SQUARED LATIN CAPITAL LETTER A..RAISED MD SIGN
|
||||
1F110..1F16B ; Common # So [92] PARENTHESIZED LATIN CAPITAL LETTER A..RAISED MD SIGN
|
||||
1F170..1F1AC ; Common # So [61] NEGATIVE SQUARED LATIN CAPITAL LETTER A..SQUARED VOD
|
||||
1F1E6..1F1FF ; Common # So [26] REGIONAL INDICATOR SYMBOL LETTER A..REGIONAL INDICATOR SYMBOL LETTER Z
|
||||
1F201..1F202 ; Common # So [2] SQUARED KATAKANA KOKO..SQUARED KATAKANA SA
|
||||
|
@ -585,9 +588,9 @@ FFFC..FFFD ; Common # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEMENT CHAR
|
|||
1F3FB..1F3FF ; Common # Sk [5] EMOJI MODIFIER FITZPATRICK TYPE-1-2..EMOJI MODIFIER FITZPATRICK TYPE-6
|
||||
1F400..1F6D4 ; Common # So [725] RAT..PAGODA
|
||||
1F6E0..1F6EC ; Common # So [13] HAMMER AND WRENCH..AIRPLANE ARRIVING
|
||||
1F6F0..1F6F8 ; Common # So [9] SATELLITE..FLYING SAUCER
|
||||
1F6F0..1F6F9 ; Common # So [10] SATELLITE..SKATEBOARD
|
||||
1F700..1F773 ; Common # So [116] ALCHEMICAL SYMBOL FOR QUINTESSENCE..ALCHEMICAL SYMBOL FOR HALF OUNCE
|
||||
1F780..1F7D4 ; Common # So [85] BLACK LEFT-POINTING ISOSCELES RIGHT TRIANGLE..HEAVY TWELVE POINTED PINWHEEL STAR
|
||||
1F780..1F7D8 ; Common # So [89] BLACK LEFT-POINTING ISOSCELES RIGHT TRIANGLE..NEGATIVE CIRCLED SQUARE
|
||||
1F800..1F80B ; Common # So [12] LEFTWARDS ARROW WITH SMALL TRIANGLE ARROWHEAD..DOWNWARDS ARROW WITH LARGE TRIANGLE ARROWHEAD
|
||||
1F810..1F847 ; Common # So [56] LEFTWARDS ARROW WITH SMALL EQUILATERAL ARROWHEAD..DOWNWARDS HEAVY ARROW
|
||||
1F850..1F859 ; Common # So [10] LEFTWARDS SANS-SERIF ARROW..UP DOWN SANS-SERIF ARROW
|
||||
|
@ -595,15 +598,18 @@ FFFC..FFFD ; Common # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEMENT CHAR
|
|||
1F890..1F8AD ; Common # So [30] LEFTWARDS TRIANGLE ARROWHEAD..WHITE ARROW SHAFT WIDTH TWO THIRDS
|
||||
1F900..1F90B ; Common # So [12] CIRCLED CROSS FORMEE WITH FOUR DOTS..DOWNWARD FACING NOTCHED HOOK WITH DOT
|
||||
1F910..1F93E ; Common # So [47] ZIPPER-MOUTH FACE..HANDBALL
|
||||
1F940..1F94C ; Common # So [13] WILTED FLOWER..CURLING STONE
|
||||
1F950..1F96B ; Common # So [28] CROISSANT..CANNED FOOD
|
||||
1F980..1F997 ; Common # So [24] CRAB..CRICKET
|
||||
1F9C0 ; Common # So CHEESE WEDGE
|
||||
1F9D0..1F9E6 ; Common # So [23] FACE WITH MONOCLE..SOCKS
|
||||
1F940..1F970 ; Common # So [49] WILTED FLOWER..SMILING FACE WITH SMILING EYES AND THREE HEARTS
|
||||
1F973..1F976 ; Common # So [4] FACE WITH PARTY HORN AND PARTY HAT..FREEZING FACE
|
||||
1F97A ; Common # So FACE WITH PLEADING EYES
|
||||
1F97C..1F9A2 ; Common # So [39] LAB COAT..SWAN
|
||||
1F9B0..1F9B9 ; Common # So [10] EMOJI COMPONENT RED HAIR..SUPERVILLAIN
|
||||
1F9C0..1F9C2 ; Common # So [3] CHEESE WEDGE..SALT SHAKER
|
||||
1F9D0..1F9FF ; Common # So [48] FACE WITH MONOCLE..NAZAR AMULET
|
||||
1FA60..1FA6D ; Common # So [14] XIANGQI RED GENERAL..XIANGQI BLACK SOLDIER
|
||||
E0001 ; Common # Cf LANGUAGE TAG
|
||||
E0020..E007F ; Common # Cf [96] TAG SPACE..CANCEL TAG
|
||||
|
||||
# Total code points: 7363
|
||||
# Total code points: 7591
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -646,8 +652,7 @@ A770 ; Latin # Lm MODIFIER LETTER US
|
|||
A771..A787 ; Latin # L& [23] LATIN SMALL LETTER DUM..LATIN SMALL LETTER INSULAR T
|
||||
A78B..A78E ; Latin # L& [4] LATIN CAPITAL LETTER SALTILLO..LATIN SMALL LETTER L WITH RETROFLEX HOOK AND BELT
|
||||
A78F ; Latin # Lo LATIN LETTER SINOLOGICAL DOT
|
||||
A790..A7AE ; Latin # L& [31] LATIN CAPITAL LETTER N WITH DESCENDER..LATIN CAPITAL LETTER SMALL CAPITAL I
|
||||
A7B0..A7B7 ; Latin # L& [8] LATIN CAPITAL LETTER TURNED K..LATIN SMALL LETTER OMEGA
|
||||
A790..A7B9 ; Latin # L& [42] LATIN CAPITAL LETTER N WITH DESCENDER..LATIN SMALL LETTER U WITH STROKE
|
||||
A7F7 ; Latin # Lo LATIN EPIGRAPHIC LETTER SIDEWAYS I
|
||||
A7F8..A7F9 ; Latin # Lm [2] MODIFIER LETTER CAPITAL H WITH STROKE..MODIFIER LETTER SMALL LIGATURE OE
|
||||
A7FA ; Latin # L& LATIN LETTER SMALL CAPITAL TURNED M
|
||||
|
@ -659,7 +664,7 @@ FB00..FB06 ; Latin # L& [7] LATIN SMALL LIGATURE FF..LATIN SMALL LIGATURE S
|
|||
FF21..FF3A ; Latin # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LATIN CAPITAL LETTER Z
|
||||
FF41..FF5A ; Latin # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN SMALL LETTER Z
|
||||
|
||||
# Total code points: 1350
|
||||
# Total code points: 1353
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -753,13 +758,13 @@ FE2E..FE2F ; Cyrillic # Mn [2] COMBINING CYRILLIC TITLO LEFT HALF..COMBININ
|
|||
0531..0556 ; Armenian # L& [38] ARMENIAN CAPITAL LETTER AYB..ARMENIAN CAPITAL LETTER FEH
|
||||
0559 ; Armenian # Lm ARMENIAN MODIFIER LETTER LEFT HALF RING
|
||||
055A..055F ; Armenian # Po [6] ARMENIAN APOSTROPHE..ARMENIAN ABBREVIATION MARK
|
||||
0561..0587 ; Armenian # L& [39] ARMENIAN SMALL LETTER AYB..ARMENIAN SMALL LIGATURE ECH YIWN
|
||||
0560..0588 ; Armenian # L& [41] ARMENIAN SMALL LETTER TURNED AYB..ARMENIAN SMALL LETTER YI WITH STROKE
|
||||
058A ; Armenian # Pd ARMENIAN HYPHEN
|
||||
058D..058E ; Armenian # So [2] RIGHT-FACING ARMENIAN ETERNITY SIGN..LEFT-FACING ARMENIAN ETERNITY SIGN
|
||||
058F ; Armenian # Sc ARMENIAN DRAM SIGN
|
||||
FB13..FB17 ; Armenian # L& [5] ARMENIAN SMALL LIGATURE MEN NOW..ARMENIAN SMALL LIGATURE MEN XEH
|
||||
|
||||
# Total code points: 93
|
||||
# Total code points: 95
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -773,7 +778,7 @@ FB13..FB17 ; Armenian # L& [5] ARMENIAN SMALL LIGATURE MEN NOW..ARMENIAN SM
|
|||
05C6 ; Hebrew # Po HEBREW PUNCTUATION NUN HAFUKHA
|
||||
05C7 ; Hebrew # Mn HEBREW POINT QAMATS QATAN
|
||||
05D0..05EA ; Hebrew # Lo [27] HEBREW LETTER ALEF..HEBREW LETTER TAV
|
||||
05F0..05F2 ; Hebrew # Lo [3] HEBREW LIGATURE YIDDISH DOUBLE VAV..HEBREW LIGATURE YIDDISH DOUBLE YOD
|
||||
05EF..05F2 ; Hebrew # Lo [4] HEBREW YOD TRIANGLE..HEBREW LIGATURE YIDDISH DOUBLE YOD
|
||||
05F3..05F4 ; Hebrew # Po [2] HEBREW PUNCTUATION GERESH..HEBREW PUNCTUATION GERSHAYIM
|
||||
FB1D ; Hebrew # Lo HEBREW LETTER YOD WITH HIRIQ
|
||||
FB1E ; Hebrew # Mn HEBREW POINT JUDEO-SPANISH VARIKA
|
||||
|
@ -786,7 +791,7 @@ FB40..FB41 ; Hebrew # Lo [2] HEBREW LETTER NUN WITH DAGESH..HEBREW LETTER S
|
|||
FB43..FB44 ; Hebrew # Lo [2] HEBREW LETTER FINAL PE WITH DAGESH..HEBREW LETTER PE WITH DAGESH
|
||||
FB46..FB4F ; Hebrew # Lo [10] HEBREW LETTER TSADI WITH DAGESH..HEBREW LIGATURE ALEF LAMED
|
||||
|
||||
# Total code points: 133
|
||||
# Total code points: 134
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -823,7 +828,7 @@ FB46..FB4F ; Hebrew # Lo [10] HEBREW LETTER TSADI WITH DAGESH..HEBREW LIGATU
|
|||
0750..077F ; Arabic # Lo [48] ARABIC LETTER BEH WITH THREE DOTS HORIZONTALLY BELOW..ARABIC LETTER KAF WITH TWO DOTS ABOVE
|
||||
08A0..08B4 ; Arabic # Lo [21] ARABIC LETTER BEH WITH SMALL V BELOW..ARABIC LETTER KAF WITH DOT BELOW
|
||||
08B6..08BD ; Arabic # Lo [8] ARABIC LETTER BEH WITH SMALL MEEM ABOVE..ARABIC LETTER AFRICAN NOON
|
||||
08D4..08E1 ; Arabic # Mn [14] ARABIC SMALL HIGH WORD AR-RUB..ARABIC SMALL HIGH SIGN SAFHA
|
||||
08D3..08E1 ; Arabic # Mn [15] ARABIC SMALL LOW WAW..ARABIC SMALL HIGH SIGN SAFHA
|
||||
08E3..08FF ; Arabic # Mn [29] ARABIC TURNED DAMMA BELOW..ARABIC MARK SIDEWAYS NOON GHUNNA
|
||||
FB50..FBB1 ; Arabic # Lo [98] ARABIC LETTER ALEF WASLA ISOLATED FORM..ARABIC LETTER YEH BARREE WITH HAMZA ABOVE FINAL FORM
|
||||
FBB2..FBC1 ; Arabic # Sk [16] ARABIC SYMBOL DOT ABOVE..ARABIC SYMBOL SMALL TAH BELOW
|
||||
|
@ -871,7 +876,7 @@ FE76..FEFC ; Arabic # Lo [135] ARABIC FATHA ISOLATED FORM..ARABIC LIGATURE LA
|
|||
1EEAB..1EEBB ; Arabic # Lo [17] ARABIC MATHEMATICAL DOUBLE-STRUCK LAM..ARABIC MATHEMATICAL DOUBLE-STRUCK GHAIN
|
||||
1EEF0..1EEF1 ; Arabic # Sm [2] ARABIC MATHEMATICAL OPERATOR MEEM WITH HAH WITH TATWEEL..ARABIC MATHEMATICAL OPERATOR HAH WITH DAL
|
||||
|
||||
# Total code points: 1280
|
||||
# Total code points: 1281
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -921,9 +926,10 @@ A8F2..A8F7 ; Devanagari # Lo [6] DEVANAGARI SIGN SPACING CANDRABINDU..DEVAN
|
|||
A8F8..A8FA ; Devanagari # Po [3] DEVANAGARI SIGN PUSHPIKA..DEVANAGARI CARET
|
||||
A8FB ; Devanagari # Lo DEVANAGARI HEADSTROKE
|
||||
A8FC ; Devanagari # Po DEVANAGARI SIGN SIDDHAM
|
||||
A8FD ; Devanagari # Lo DEVANAGARI JAIN OM
|
||||
A8FD..A8FE ; Devanagari # Lo [2] DEVANAGARI JAIN OM..DEVANAGARI LETTER AY
|
||||
A8FF ; Devanagari # Mn DEVANAGARI VOWEL SIGN AY
|
||||
|
||||
# Total code points: 154
|
||||
# Total code points: 156
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -956,8 +962,9 @@ A8FD ; Devanagari # Lo DEVANAGARI JAIN OM
|
|||
09FB ; Bengali # Sc BENGALI GANDA MARK
|
||||
09FC ; Bengali # Lo BENGALI LETTER VEDIC ANUSVARA
|
||||
09FD ; Bengali # Po BENGALI ABBREVIATION SIGN
|
||||
09FE ; Bengali # Mn BENGALI SANDHI MARK
|
||||
|
||||
# Total code points: 95
|
||||
# Total code points: 96
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -982,8 +989,9 @@ A8FD ; Devanagari # Lo DEVANAGARI JAIN OM
|
|||
0A70..0A71 ; Gurmukhi # Mn [2] GURMUKHI TIPPI..GURMUKHI ADDAK
|
||||
0A72..0A74 ; Gurmukhi # Lo [3] GURMUKHI IRI..GURMUKHI EK ONKAR
|
||||
0A75 ; Gurmukhi # Mn GURMUKHI SIGN YAKASH
|
||||
0A76 ; Gurmukhi # Po GURMUKHI ABBREVIATION SIGN
|
||||
|
||||
# Total code points: 79
|
||||
# Total code points: 80
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -1078,6 +1086,7 @@ A8FD ; Devanagari # Lo DEVANAGARI JAIN OM
|
|||
|
||||
0C00 ; Telugu # Mn TELUGU SIGN COMBINING CANDRABINDU ABOVE
|
||||
0C01..0C03 ; Telugu # Mc [3] TELUGU SIGN CANDRABINDU..TELUGU SIGN VISARGA
|
||||
0C04 ; Telugu # Mn TELUGU SIGN COMBINING ANUSVARA ABOVE
|
||||
0C05..0C0C ; Telugu # Lo [8] TELUGU LETTER A..TELUGU LETTER VOCALIC L
|
||||
0C0E..0C10 ; Telugu # Lo [3] TELUGU LETTER E..TELUGU LETTER AI
|
||||
0C12..0C28 ; Telugu # Lo [23] TELUGU LETTER O..TELUGU LETTER NA
|
||||
|
@ -1095,13 +1104,14 @@ A8FD ; Devanagari # Lo DEVANAGARI JAIN OM
|
|||
0C78..0C7E ; Telugu # No [7] TELUGU FRACTION DIGIT ZERO FOR ODD POWERS OF FOUR..TELUGU FRACTION DIGIT THREE FOR EVEN POWERS OF FOUR
|
||||
0C7F ; Telugu # So TELUGU SIGN TUUMU
|
||||
|
||||
# Total code points: 96
|
||||
# Total code points: 97
|
||||
|
||||
# ================================================
|
||||
|
||||
0C80 ; Kannada # Lo KANNADA SIGN SPACING CANDRABINDU
|
||||
0C81 ; Kannada # Mn KANNADA SIGN CANDRABINDU
|
||||
0C82..0C83 ; Kannada # Mc [2] KANNADA SIGN ANUSVARA..KANNADA SIGN VISARGA
|
||||
0C84 ; Kannada # Po KANNADA SIGN SIDDHAM
|
||||
0C85..0C8C ; Kannada # Lo [8] KANNADA LETTER A..KANNADA LETTER VOCALIC L
|
||||
0C8E..0C90 ; Kannada # Lo [3] KANNADA LETTER E..KANNADA LETTER AI
|
||||
0C92..0CA8 ; Kannada # Lo [23] KANNADA LETTER O..KANNADA LETTER NA
|
||||
|
@ -1123,7 +1133,7 @@ A8FD ; Devanagari # Lo DEVANAGARI JAIN OM
|
|||
0CE6..0CEF ; Kannada # Nd [10] KANNADA DIGIT ZERO..KANNADA DIGIT NINE
|
||||
0CF1..0CF2 ; Kannada # Lo [2] KANNADA SIGN JIHVAMULIYA..KANNADA SIGN UPADHMANIYA
|
||||
|
||||
# Total code points: 88
|
||||
# Total code points: 89
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -1317,14 +1327,16 @@ AA7E..AA7F ; Myanmar # Lo [2] MYANMAR LETTER SHWE PALAUNG CHA..MYANMAR LETT
|
|||
10A0..10C5 ; Georgian # L& [38] GEORGIAN CAPITAL LETTER AN..GEORGIAN CAPITAL LETTER HOE
|
||||
10C7 ; Georgian # L& GEORGIAN CAPITAL LETTER YN
|
||||
10CD ; Georgian # L& GEORGIAN CAPITAL LETTER AEN
|
||||
10D0..10FA ; Georgian # Lo [43] GEORGIAN LETTER AN..GEORGIAN LETTER AIN
|
||||
10D0..10FA ; Georgian # L& [43] GEORGIAN LETTER AN..GEORGIAN LETTER AIN
|
||||
10FC ; Georgian # Lm MODIFIER LETTER GEORGIAN NAR
|
||||
10FD..10FF ; Georgian # Lo [3] GEORGIAN LETTER AEN..GEORGIAN LETTER LABIAL SIGN
|
||||
10FD..10FF ; Georgian # L& [3] GEORGIAN LETTER AEN..GEORGIAN LETTER LABIAL SIGN
|
||||
1C90..1CBA ; Georgian # L& [43] GEORGIAN MTAVRULI CAPITAL LETTER AN..GEORGIAN MTAVRULI CAPITAL LETTER AIN
|
||||
1CBD..1CBF ; Georgian # L& [3] GEORGIAN MTAVRULI CAPITAL LETTER AEN..GEORGIAN MTAVRULI CAPITAL LETTER LABIAL SIGN
|
||||
2D00..2D25 ; Georgian # L& [38] GEORGIAN SMALL LETTER AN..GEORGIAN SMALL LETTER HOE
|
||||
2D27 ; Georgian # L& GEORGIAN SMALL LETTER YN
|
||||
2D2D ; Georgian # L& GEORGIAN SMALL LETTER AEN
|
||||
|
||||
# Total code points: 127
|
||||
# Total code points: 173
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -1453,7 +1465,7 @@ AB70..ABBF ; Cherokee # L& [80] CHEROKEE SMALL LETTER A..CHEROKEE SMALL LETT
|
|||
1810..1819 ; Mongolian # Nd [10] MONGOLIAN DIGIT ZERO..MONGOLIAN DIGIT NINE
|
||||
1820..1842 ; Mongolian # Lo [35] MONGOLIAN LETTER A..MONGOLIAN LETTER CHI
|
||||
1843 ; Mongolian # Lm MONGOLIAN LETTER TODO LONG VOWEL SIGN
|
||||
1844..1877 ; Mongolian # Lo [52] MONGOLIAN LETTER TODO E..MONGOLIAN LETTER MANCHU ZHA
|
||||
1844..1878 ; Mongolian # Lo [53] MONGOLIAN LETTER TODO E..MONGOLIAN LETTER CHA WITH TWO DOTS
|
||||
1880..1884 ; Mongolian # Lo [5] MONGOLIAN LETTER ALI GALI ANUSVARA ONE..MONGOLIAN LETTER ALI GALI INVERTED UBADAMA
|
||||
1885..1886 ; Mongolian # Mn [2] MONGOLIAN LETTER ALI GALI BALUDA..MONGOLIAN LETTER ALI GALI THREE BALUDA
|
||||
1887..18A8 ; Mongolian # Lo [34] MONGOLIAN LETTER ALI GALI A..MONGOLIAN LETTER MANCHU ALI GALI BHA
|
||||
|
@ -1461,7 +1473,7 @@ AB70..ABBF ; Cherokee # L& [80] CHEROKEE SMALL LETTER A..CHEROKEE SMALL LETT
|
|||
18AA ; Mongolian # Lo MONGOLIAN LETTER MANCHU ALI GALI LHA
|
||||
11660..1166C ; Mongolian # Po [13] MONGOLIAN BIRGA WITH ORNAMENT..MONGOLIAN TURNED SWIRL BIRGA WITH DOUBLE ORNAMENT
|
||||
|
||||
# Total code points: 166
|
||||
# Total code points: 167
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -1490,10 +1502,10 @@ FF71..FF9D ; Katakana # Lo [45] HALFWIDTH KATAKANA LETTER A..HALFWIDTH KATAK
|
|||
# ================================================
|
||||
|
||||
02EA..02EB ; Bopomofo # Sk [2] MODIFIER LETTER YIN DEPARTING TONE MARK..MODIFIER LETTER YANG DEPARTING TONE MARK
|
||||
3105..312E ; Bopomofo # Lo [42] BOPOMOFO LETTER B..BOPOMOFO LETTER O WITH DOT ABOVE
|
||||
3105..312F ; Bopomofo # Lo [43] BOPOMOFO LETTER B..BOPOMOFO LETTER NN
|
||||
31A0..31BA ; Bopomofo # Lo [27] BOPOMOFO LETTER BU..BOPOMOFO LETTER ZY
|
||||
|
||||
# Total code points: 71
|
||||
# Total code points: 72
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -1506,7 +1518,7 @@ FF71..FF9D ; Katakana # Lo [45] HALFWIDTH KATAKANA LETTER A..HALFWIDTH KATAK
|
|||
3038..303A ; Han # Nl [3] HANGZHOU NUMERAL TEN..HANGZHOU NUMERAL THIRTY
|
||||
303B ; Han # Lm VERTICAL IDEOGRAPHIC ITERATION MARK
|
||||
3400..4DB5 ; Han # Lo [6582] CJK UNIFIED IDEOGRAPH-3400..CJK UNIFIED IDEOGRAPH-4DB5
|
||||
4E00..9FEA ; Han # Lo [20971] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FEA
|
||||
4E00..9FEF ; Han # Lo [20976] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FEF
|
||||
F900..FA6D ; Han # Lo [366] CJK COMPATIBILITY IDEOGRAPH-F900..CJK COMPATIBILITY IDEOGRAPH-FA6D
|
||||
FA70..FAD9 ; Han # Lo [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COMPATIBILITY IDEOGRAPH-FAD9
|
||||
20000..2A6D6 ; Han # Lo [42711] CJK UNIFIED IDEOGRAPH-20000..CJK UNIFIED IDEOGRAPH-2A6D6
|
||||
|
@ -1516,7 +1528,7 @@ FA70..FAD9 ; Han # Lo [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COMPATIBILI
|
|||
2CEB0..2EBE0 ; Han # Lo [7473] CJK UNIFIED IDEOGRAPH-2CEB0..CJK UNIFIED IDEOGRAPH-2EBE0
|
||||
2F800..2FA1D ; Han # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
|
||||
|
||||
# Total code points: 89228
|
||||
# Total code points: 89233
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -1579,13 +1591,14 @@ FE00..FE0F ; Inherited # Mn [16] VARIATION SELECTOR-1..VARIATION SELECTOR-16
|
|||
FE20..FE2D ; Inherited # Mn [14] COMBINING LIGATURE LEFT HALF..COMBINING CONJOINING MACRON BELOW
|
||||
101FD ; Inherited # Mn PHAISTOS DISC SIGN COMBINING OBLIQUE STROKE
|
||||
102E0 ; Inherited # Mn COPTIC EPACT THOUSANDS MARK
|
||||
1133B ; Inherited # Mn COMBINING BINDU BELOW
|
||||
1D167..1D169 ; Inherited # Mn [3] MUSICAL SYMBOL COMBINING TREMOLO-1..MUSICAL SYMBOL COMBINING TREMOLO-3
|
||||
1D17B..1D182 ; Inherited # Mn [8] MUSICAL SYMBOL COMBINING ACCENT..MUSICAL SYMBOL COMBINING LOURE
|
||||
1D185..1D18B ; Inherited # Mn [7] MUSICAL SYMBOL COMBINING DOIT..MUSICAL SYMBOL COMBINING TRIPLE TONGUE
|
||||
1D1AA..1D1AD ; Inherited # Mn [4] MUSICAL SYMBOL COMBINING DOWN BOW..MUSICAL SYMBOL COMBINING SNAP PIZZICATO
|
||||
E0100..E01EF ; Inherited # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
|
||||
|
||||
# Total code points: 568
|
||||
# Total code points: 569
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -1778,13 +1791,13 @@ A828..A82B ; Syloti_Nagri # So [4] SYLOTI NAGRI POETRY MARK-1..SYLOTI NAGRI
|
|||
10A0C..10A0F ; Kharoshthi # Mn [4] KHAROSHTHI VOWEL LENGTH MARK..KHAROSHTHI SIGN VISARGA
|
||||
10A10..10A13 ; Kharoshthi # Lo [4] KHAROSHTHI LETTER KA..KHAROSHTHI LETTER GHA
|
||||
10A15..10A17 ; Kharoshthi # Lo [3] KHAROSHTHI LETTER CA..KHAROSHTHI LETTER JA
|
||||
10A19..10A33 ; Kharoshthi # Lo [27] KHAROSHTHI LETTER NYA..KHAROSHTHI LETTER TTTHA
|
||||
10A19..10A35 ; Kharoshthi # Lo [29] KHAROSHTHI LETTER NYA..KHAROSHTHI LETTER VHA
|
||||
10A38..10A3A ; Kharoshthi # Mn [3] KHAROSHTHI SIGN BAR ABOVE..KHAROSHTHI SIGN DOT BELOW
|
||||
10A3F ; Kharoshthi # Mn KHAROSHTHI VIRAMA
|
||||
10A40..10A47 ; Kharoshthi # No [8] KHAROSHTHI DIGIT ONE..KHAROSHTHI NUMBER ONE THOUSAND
|
||||
10A40..10A48 ; Kharoshthi # No [9] KHAROSHTHI DIGIT ONE..KHAROSHTHI FRACTION ONE HALF
|
||||
10A50..10A58 ; Kharoshthi # Po [9] KHAROSHTHI PUNCTUATION DOT..KHAROSHTHI PUNCTUATION LINES
|
||||
|
||||
# Total code points: 65
|
||||
# Total code points: 68
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -1841,8 +1854,10 @@ A874..A877 ; Phags_Pa # Po [4] PHAGS-PA SINGLE HEAD MARK..PHAGS-PA MARK DOU
|
|||
07F6 ; Nko # So NKO SYMBOL OO DENNEN
|
||||
07F7..07F9 ; Nko # Po [3] NKO SYMBOL GBAKURUNEN..NKO EXCLAMATION MARK
|
||||
07FA ; Nko # Lm NKO LAJANYALAN
|
||||
07FD ; Nko # Mn NKO DANTAYALAN
|
||||
07FE..07FF ; Nko # Sc [2] NKO DOROME SIGN..NKO TAMAN SIGN
|
||||
|
||||
# Total code points: 59
|
||||
# Total code points: 62
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -2137,8 +2152,9 @@ ABF0..ABF9 ; Meetei_Mayek # Nd [10] MEETEI MAYEK DIGIT ZERO..MEETEI MAYEK DI
|
|||
110BB..110BC ; Kaithi # Po [2] KAITHI ABBREVIATION SIGN..KAITHI ENUMERATION SIGN
|
||||
110BD ; Kaithi # Cf KAITHI NUMBER SIGN
|
||||
110BE..110C1 ; Kaithi # Po [4] KAITHI SECTION MARK..KAITHI DOUBLE DANDA
|
||||
110CD ; Kaithi # Cf KAITHI NUMBER SIGN ABOVE
|
||||
|
||||
# Total code points: 66
|
||||
# Total code points: 67
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -2186,8 +2202,10 @@ ABF0..ABF9 ; Meetei_Mayek # Nd [10] MEETEI MAYEK DIGIT ZERO..MEETEI MAYEK DI
|
|||
1112D..11134 ; Chakma # Mn [8] CHAKMA VOWEL SIGN AI..CHAKMA MAAYYAA
|
||||
11136..1113F ; Chakma # Nd [10] CHAKMA DIGIT ZERO..CHAKMA DIGIT NINE
|
||||
11140..11143 ; Chakma # Po [4] CHAKMA SECTION MARK..CHAKMA QUESTION MARK
|
||||
11144 ; Chakma # Lo CHAKMA LETTER LHAA
|
||||
11145..11146 ; Chakma # Mc [2] CHAKMA VOWEL SIGN AA..CHAKMA VOWEL SIGN EI
|
||||
|
||||
# Total code points: 67
|
||||
# Total code points: 70
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -2224,8 +2242,8 @@ ABF0..ABF9 ; Meetei_Mayek # Nd [10] MEETEI MAYEK DIGIT ZERO..MEETEI MAYEK DI
|
|||
111B6..111BE ; Sharada # Mn [9] SHARADA VOWEL SIGN U..SHARADA VOWEL SIGN O
|
||||
111BF..111C0 ; Sharada # Mc [2] SHARADA VOWEL SIGN AU..SHARADA SIGN VIRAMA
|
||||
111C1..111C4 ; Sharada # Lo [4] SHARADA SIGN AVAGRAHA..SHARADA OM
|
||||
111C5..111C9 ; Sharada # Po [5] SHARADA DANDA..SHARADA SANDHI MARK
|
||||
111CA..111CC ; Sharada # Mn [3] SHARADA SIGN NUKTA..SHARADA EXTRA SHORT VOWEL MARK
|
||||
111C5..111C8 ; Sharada # Po [4] SHARADA DANDA..SHARADA SEPARATOR
|
||||
111C9..111CC ; Sharada # Mn [4] SHARADA SANDHI MARK..SHARADA EXTRA SHORT VOWEL MARK
|
||||
111CD ; Sharada # Po SHARADA SUTRA MARK
|
||||
111D0..111D9 ; Sharada # Nd [10] SHARADA DIGIT ZERO..SHARADA DIGIT NINE
|
||||
111DA ; Sharada # Lo SHARADA EKAM
|
||||
|
@ -2502,7 +2520,7 @@ ABF0..ABF9 ; Meetei_Mayek # Nd [10] MEETEI MAYEK DIGIT ZERO..MEETEI MAYEK DI
|
|||
|
||||
# ================================================
|
||||
|
||||
11700..11719 ; Ahom # Lo [26] AHOM LETTER KA..AHOM LETTER JHA
|
||||
11700..1171A ; Ahom # Lo [27] AHOM LETTER KA..AHOM LETTER ALTERNATE BA
|
||||
1171D..1171F ; Ahom # Mn [3] AHOM CONSONANT SIGN MEDIAL LA..AHOM CONSONANT SIGN MEDIAL LIGATING RA
|
||||
11720..11721 ; Ahom # Mc [2] AHOM VOWEL SIGN A..AHOM VOWEL SIGN AA
|
||||
11722..11725 ; Ahom # Mn [4] AHOM VOWEL SIGN I..AHOM VOWEL SIGN UU
|
||||
|
@ -2513,7 +2531,7 @@ ABF0..ABF9 ; Meetei_Mayek # Nd [10] MEETEI MAYEK DIGIT ZERO..MEETEI MAYEK DI
|
|||
1173C..1173E ; Ahom # Po [3] AHOM SIGN SMALL SECTION..AHOM SIGN RULAI
|
||||
1173F ; Ahom # So AHOM SYMBOL VI
|
||||
|
||||
# Total code points: 57
|
||||
# Total code points: 58
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -2618,8 +2636,9 @@ ABF0..ABF9 ; Meetei_Mayek # Nd [10] MEETEI MAYEK DIGIT ZERO..MEETEI MAYEK DI
|
|||
11450..11459 ; Newa # Nd [10] NEWA DIGIT ZERO..NEWA DIGIT NINE
|
||||
1145B ; Newa # Po NEWA PLACEHOLDER MARK
|
||||
1145D ; Newa # Po NEWA INSERTION SIGN
|
||||
1145E ; Newa # Mn NEWA SANDHI MARK
|
||||
|
||||
# Total code points: 92
|
||||
# Total code points: 93
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -2631,10 +2650,10 @@ ABF0..ABF9 ; Meetei_Mayek # Nd [10] MEETEI MAYEK DIGIT ZERO..MEETEI MAYEK DI
|
|||
# ================================================
|
||||
|
||||
16FE0 ; Tangut # Lm TANGUT ITERATION MARK
|
||||
17000..187EC ; Tangut # Lo [6125] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187EC
|
||||
17000..187F1 ; Tangut # Lo [6130] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187F1
|
||||
18800..18AF2 ; Tangut # Lo [755] TANGUT COMPONENT-001..TANGUT COMPONENT-755
|
||||
|
||||
# Total code points: 6881
|
||||
# Total code points: 6886
|
||||
|
||||
# ================================================
|
||||
|
||||
|
@ -2670,16 +2689,15 @@ ABF0..ABF9 ; Meetei_Mayek # Nd [10] MEETEI MAYEK DIGIT ZERO..MEETEI MAYEK DI
|
|||
11A97 ; Soyombo # Mc SOYOMBO SIGN VISARGA
|
||||
11A98..11A99 ; Soyombo # Mn [2] SOYOMBO GEMINATION MARK..SOYOMBO SUBJOINER
|
||||
11A9A..11A9C ; Soyombo # Po [3] SOYOMBO MARK TSHEG..SOYOMBO MARK DOUBLE SHAD
|
||||
11A9D ; Soyombo # Lo SOYOMBO MARK PLUTA
|
||||
11A9E..11AA2 ; Soyombo # Po [5] SOYOMBO HEAD MARK WITH MOON AND SUN AND TRIPLE FLAME..SOYOMBO TERMINAL MARK-2
|
||||
|
||||
# Total code points: 80
|
||||
# Total code points: 81
|
||||
|
||||
# ================================================
|
||||
|
||||
11A00 ; Zanabazar_Square # Lo ZANABAZAR SQUARE LETTER A
|
||||
11A01..11A06 ; Zanabazar_Square # Mn [6] ZANABAZAR SQUARE VOWEL SIGN I..ZANABAZAR SQUARE VOWEL SIGN O
|
||||
11A07..11A08 ; Zanabazar_Square # Mc [2] ZANABAZAR SQUARE VOWEL SIGN AI..ZANABAZAR SQUARE VOWEL SIGN AU
|
||||
11A09..11A0A ; Zanabazar_Square # Mn [2] ZANABAZAR SQUARE VOWEL SIGN REVERSED I..ZANABAZAR SQUARE VOWEL LENGTH MARK
|
||||
11A01..11A0A ; Zanabazar_Square # Mn [10] ZANABAZAR SQUARE VOWEL SIGN I..ZANABAZAR SQUARE VOWEL LENGTH MARK
|
||||
11A0B..11A32 ; Zanabazar_Square # Lo [40] ZANABAZAR SQUARE LETTER KA..ZANABAZAR SQUARE LETTER KSSA
|
||||
11A33..11A38 ; Zanabazar_Square # Mn [6] ZANABAZAR SQUARE FINAL CONSONANT MARK..ZANABAZAR SQUARE SIGN ANUSVARA
|
||||
11A39 ; Zanabazar_Square # Mc ZANABAZAR SQUARE SIGN VISARGA
|
||||
|
@ -2690,4 +2708,73 @@ ABF0..ABF9 ; Meetei_Mayek # Nd [10] MEETEI MAYEK DIGIT ZERO..MEETEI MAYEK DI
|
|||
|
||||
# Total code points: 72
|
||||
|
||||
# ================================================
|
||||
|
||||
11800..1182B ; Dogra # Lo [44] DOGRA LETTER A..DOGRA LETTER RRA
|
||||
1182C..1182E ; Dogra # Mc [3] DOGRA VOWEL SIGN AA..DOGRA VOWEL SIGN II
|
||||
1182F..11837 ; Dogra # Mn [9] DOGRA VOWEL SIGN U..DOGRA SIGN ANUSVARA
|
||||
11838 ; Dogra # Mc DOGRA SIGN VISARGA
|
||||
11839..1183A ; Dogra # Mn [2] DOGRA SIGN VIRAMA..DOGRA SIGN NUKTA
|
||||
1183B ; Dogra # Po DOGRA ABBREVIATION SIGN
|
||||
|
||||
# Total code points: 60
|
||||
|
||||
# ================================================
|
||||
|
||||
11D60..11D65 ; Gunjala_Gondi # Lo [6] GUNJALA GONDI LETTER A..GUNJALA GONDI LETTER UU
|
||||
11D67..11D68 ; Gunjala_Gondi # Lo [2] GUNJALA GONDI LETTER EE..GUNJALA GONDI LETTER AI
|
||||
11D6A..11D89 ; Gunjala_Gondi # Lo [32] GUNJALA GONDI LETTER OO..GUNJALA GONDI LETTER SA
|
||||
11D8A..11D8E ; Gunjala_Gondi # Mc [5] GUNJALA GONDI VOWEL SIGN AA..GUNJALA GONDI VOWEL SIGN UU
|
||||
11D90..11D91 ; Gunjala_Gondi # Mn [2] GUNJALA GONDI VOWEL SIGN EE..GUNJALA GONDI VOWEL SIGN AI
|
||||
11D93..11D94 ; Gunjala_Gondi # Mc [2] GUNJALA GONDI VOWEL SIGN OO..GUNJALA GONDI VOWEL SIGN AU
|
||||
11D95 ; Gunjala_Gondi # Mn GUNJALA GONDI SIGN ANUSVARA
|
||||
11D96 ; Gunjala_Gondi # Mc GUNJALA GONDI SIGN VISARGA
|
||||
11D97 ; Gunjala_Gondi # Mn GUNJALA GONDI VIRAMA
|
||||
11D98 ; Gunjala_Gondi # Lo GUNJALA GONDI OM
|
||||
11DA0..11DA9 ; Gunjala_Gondi # Nd [10] GUNJALA GONDI DIGIT ZERO..GUNJALA GONDI DIGIT NINE
|
||||
|
||||
# Total code points: 63
|
||||
|
||||
# ================================================
|
||||
|
||||
11EE0..11EF2 ; Makasar # Lo [19] MAKASAR LETTER KA..MAKASAR ANGKA
|
||||
11EF3..11EF4 ; Makasar # Mn [2] MAKASAR VOWEL SIGN I..MAKASAR VOWEL SIGN U
|
||||
11EF5..11EF6 ; Makasar # Mc [2] MAKASAR VOWEL SIGN E..MAKASAR VOWEL SIGN O
|
||||
11EF7..11EF8 ; Makasar # Po [2] MAKASAR PASSIMBANG..MAKASAR END OF SECTION
|
||||
|
||||
# Total code points: 25
|
||||
|
||||
# ================================================
|
||||
|
||||
16E40..16E7F ; Medefaidrin # L& [64] MEDEFAIDRIN CAPITAL LETTER M..MEDEFAIDRIN SMALL LETTER Y
|
||||
16E80..16E96 ; Medefaidrin # No [23] MEDEFAIDRIN DIGIT ZERO..MEDEFAIDRIN DIGIT THREE ALTERNATE FORM
|
||||
16E97..16E9A ; Medefaidrin # Po [4] MEDEFAIDRIN COMMA..MEDEFAIDRIN EXCLAMATION OH
|
||||
|
||||
# Total code points: 91
|
||||
|
||||
# ================================================
|
||||
|
||||
10D00..10D23 ; Hanifi_Rohingya # Lo [36] HANIFI ROHINGYA LETTER A..HANIFI ROHINGYA MARK NA KHONNA
|
||||
10D24..10D27 ; Hanifi_Rohingya # Mn [4] HANIFI ROHINGYA SIGN HARBAHAY..HANIFI ROHINGYA SIGN TASSI
|
||||
10D30..10D39 ; Hanifi_Rohingya # Nd [10] HANIFI ROHINGYA DIGIT ZERO..HANIFI ROHINGYA DIGIT NINE
|
||||
|
||||
# Total code points: 50
|
||||
|
||||
# ================================================
|
||||
|
||||
10F30..10F45 ; Sogdian # Lo [22] SOGDIAN LETTER ALEPH..SOGDIAN INDEPENDENT SHIN
|
||||
10F46..10F50 ; Sogdian # Mn [11] SOGDIAN COMBINING DOT BELOW..SOGDIAN COMBINING STROKE BELOW
|
||||
10F51..10F54 ; Sogdian # No [4] SOGDIAN NUMBER ONE..SOGDIAN NUMBER ONE HUNDRED
|
||||
10F55..10F59 ; Sogdian # Po [5] SOGDIAN PUNCTUATION TWO VERTICAL BARS..SOGDIAN PUNCTUATION HALF CIRCLE WITH DOT
|
||||
|
||||
# Total code points: 42
|
||||
|
||||
# ================================================
|
||||
|
||||
10F00..10F1C ; Old_Sogdian # Lo [29] OLD SOGDIAN LETTER ALEPH..OLD SOGDIAN LETTER FINAL TAW WITH VERTICAL TAIL
|
||||
10F1D..10F26 ; Old_Sogdian # No [10] OLD SOGDIAN NUMBER ONE..OLD SOGDIAN FRACTION ONE HALF
|
||||
10F27 ; Old_Sogdian # Lo OLD SOGDIAN LIGATURE AYIN-DALETH
|
||||
|
||||
# Total code points: 40
|
||||
|
||||
# EOF
|
||||
|
|
File diff suppressed because it is too large
|
@ -0,0 +1,714 @@
|
|||
# emoji-data.txt
|
||||
# Date: 2018-02-07, 07:55:18 GMT
|
||||
# © 2018 Unicode®, Inc.
|
||||
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
|
||||
# For terms of use, see http://www.unicode.org/terms_of_use.html
|
||||
#
|
||||
# Emoji Data for UTS #51
|
||||
# Version: 11.0
|
||||
#
|
||||
# For documentation and usage, see http://www.unicode.org/reports/tr51
|
||||
#
|
||||
# Format:
|
||||
# <codepoint(s)> ; <property> # <comments>
|
||||
# Note: there is no guarantee as to the structure of whitespace or comments
|
||||
#
|
||||
# Characters and sequences are listed in code point order. Users should be shown a more natural order.
|
||||
# See the CLDR collation order for Emoji.
|
||||
|
||||
|
||||
# ================================================
|
||||
|
||||
# All omitted code points have Emoji=No
|
||||
# @missing: 0000..10FFFF ; Emoji ; No
|
||||
|
||||
0023 ; Emoji # 1.1 [1] (#️) number sign
|
||||
002A ; Emoji # 1.1 [1] (*️) asterisk
|
||||
0030..0039 ; Emoji # 1.1 [10] (0️..9️) digit zero..digit nine
|
||||
00A9 ; Emoji # 1.1 [1] (©️) copyright
|
||||
00AE ; Emoji # 1.1 [1] (®️) registered
|
||||
203C ; Emoji # 1.1 [1] (‼️) double exclamation mark
|
||||
2049 ; Emoji # 3.0 [1] (⁉️) exclamation question mark
|
||||
2122 ; Emoji # 1.1 [1] (™️) trade mark
|
||||
2139 ; Emoji # 3.0 [1] (ℹ️) information
|
||||
2194..2199 ; Emoji # 1.1 [6] (↔️..↙️) left-right arrow..down-left arrow
|
||||
21A9..21AA ; Emoji # 1.1 [2] (↩️..↪️) right arrow curving left..left arrow curving right
|
||||
231A..231B ; Emoji # 1.1 [2] (⌚..⌛) watch..hourglass done
|
||||
2328 ; Emoji # 1.1 [1] (⌨️) keyboard
|
||||
23CF ; Emoji # 4.0 [1] (⏏️) eject button
|
||||
23E9..23F3 ; Emoji # 6.0 [11] (⏩..⏳) fast-forward button..hourglass not done
|
||||
23F8..23FA ; Emoji # 7.0 [3] (⏸️..⏺️) pause button..record button
|
||||
24C2 ; Emoji # 1.1 [1] (Ⓜ️) circled M
|
||||
25AA..25AB ; Emoji # 1.1 [2] (▪️..▫️) black small square..white small square
|
||||
25B6 ; Emoji # 1.1 [1] (▶️) play button
|
||||
25C0 ; Emoji # 1.1 [1] (◀️) reverse button
|
||||
25FB..25FE ; Emoji # 3.2 [4] (◻️..◾) white medium square..black medium-small square
|
||||
2600..2604 ; Emoji # 1.1 [5] (☀️..☄️) sun..comet
|
||||
260E ; Emoji # 1.1 [1] (☎️) telephone
|
||||
2611 ; Emoji # 1.1 [1] (☑️) ballot box with check
|
||||
2614..2615 ; Emoji # 4.0 [2] (☔..☕) umbrella with rain drops..hot beverage
|
||||
2618 ; Emoji # 4.1 [1] (☘️) shamrock
|
||||
261D ; Emoji # 1.1 [1] (☝️) index pointing up
|
||||
2620 ; Emoji # 1.1 [1] (☠️) skull and crossbones
|
||||
2622..2623 ; Emoji # 1.1 [2] (☢️..☣️) radioactive..biohazard
|
||||
2626 ; Emoji # 1.1 [1] (☦️) orthodox cross
|
||||
262A ; Emoji # 1.1 [1] (☪️) star and crescent
|
||||
262E..262F ; Emoji # 1.1 [2] (☮️..☯️) peace symbol..yin yang
|
||||
2638..263A ; Emoji # 1.1 [3] (☸️..☺️) wheel of dharma..smiling face
|
||||
2640 ; Emoji # 1.1 [1] (♀️) female sign
|
||||
2642 ; Emoji # 1.1 [1] (♂️) male sign
|
||||
2648..2653 ; Emoji # 1.1 [12] (♈..♓) Aries..Pisces
|
||||
265F..2660 ; Emoji # 1.1 [2] (♟️..♠️) chess pawn..spade suit
|
||||
2663 ; Emoji # 1.1 [1] (♣️) club suit
|
||||
2665..2666 ; Emoji # 1.1 [2] (♥️..♦️) heart suit..diamond suit
|
||||
2668 ; Emoji # 1.1 [1] (♨️) hot springs
|
||||
267B ; Emoji # 3.2 [1] (♻️) recycling symbol
|
||||
267E..267F ; Emoji # 4.1 [2] (♾️..♿) infinity..wheelchair symbol
|
||||
2692..2697 ; Emoji # 4.1 [6] (⚒️..⚗️) hammer and pick..alembic
|
||||
2699 ; Emoji # 4.1 [1] (⚙️) gear
|
||||
269B..269C ; Emoji # 4.1 [2] (⚛️..⚜️) atom symbol..fleur-de-lis
|
||||
26A0..26A1 ; Emoji # 4.0 [2] (⚠️..⚡) warning..high voltage
|
||||
26AA..26AB ; Emoji # 4.1 [2] (⚪..⚫) white circle..black circle
|
||||
26B0..26B1 ; Emoji # 4.1 [2] (⚰️..⚱️) coffin..funeral urn
|
||||
26BD..26BE ; Emoji # 5.2 [2] (⚽..⚾) soccer ball..baseball
|
||||
26C4..26C5 ; Emoji # 5.2 [2] (⛄..⛅) snowman without snow..sun behind cloud
|
||||
26C8 ; Emoji # 5.2 [1] (⛈️) cloud with lightning and rain
|
||||
26CE ; Emoji # 6.0 [1] (⛎) Ophiuchus
|
||||
26CF ; Emoji # 5.2 [1] (⛏️) pick
|
||||
26D1 ; Emoji # 5.2 [1] (⛑️) rescue worker’s helmet
|
||||
26D3..26D4 ; Emoji # 5.2 [2] (⛓️..⛔) chains..no entry
|
||||
26E9..26EA ; Emoji # 5.2 [2] (⛩️..⛪) shinto shrine..church
|
||||
26F0..26F5 ; Emoji # 5.2 [6] (⛰️..⛵) mountain..sailboat
|
||||
26F7..26FA ; Emoji # 5.2 [4] (⛷️..⛺) skier..tent
|
||||
26FD ; Emoji # 5.2 [1] (⛽) fuel pump
|
||||
2702 ; Emoji # 1.1 [1] (✂️) scissors
|
||||
2705 ; Emoji # 6.0 [1] (✅) white heavy check mark
|
||||
2708..2709 ; Emoji # 1.1 [2] (✈️..✉️) airplane..envelope
|
||||
270A..270B ; Emoji # 6.0 [2] (✊..✋) raised fist..raised hand
|
||||
270C..270D ; Emoji # 1.1 [2] (✌️..✍️) victory hand..writing hand
|
||||
270F ; Emoji # 1.1 [1] (✏️) pencil
|
||||
2712 ; Emoji # 1.1 [1] (✒️) black nib
|
||||
2714 ; Emoji # 1.1 [1] (✔️) heavy check mark
|
||||
2716 ; Emoji # 1.1 [1] (✖️) heavy multiplication x
|
||||
271D ; Emoji # 1.1 [1] (✝️) latin cross
|
||||
2721 ; Emoji # 1.1 [1] (✡️) star of David
|
||||
2728 ; Emoji # 6.0 [1] (✨) sparkles
|
||||
2733..2734 ; Emoji # 1.1 [2] (✳️..✴️) eight-spoked asterisk..eight-pointed star
|
||||
2744 ; Emoji # 1.1 [1] (❄️) snowflake
|
||||
2747 ; Emoji # 1.1 [1] (❇️) sparkle
|
||||
274C ; Emoji # 6.0 [1] (❌) cross mark
|
||||
274E ; Emoji # 6.0 [1] (❎) cross mark button
|
||||
2753..2755 ; Emoji # 6.0 [3] (❓..❕) question mark..white exclamation mark
|
||||
2757 ; Emoji # 5.2 [1] (❗) exclamation mark
|
||||
2763..2764 ; Emoji # 1.1 [2] (❣️..❤️) heavy heart exclamation..red heart
|
||||
2795..2797 ; Emoji # 6.0 [3] (➕..➗) heavy plus sign..heavy division sign
|
||||
27A1 ; Emoji # 1.1 [1] (➡️) right arrow
|
||||
27B0 ; Emoji # 6.0 [1] (➰) curly loop
|
||||
27BF ; Emoji # 6.0 [1] (➿) double curly loop
|
||||
2934..2935 ; Emoji # 3.2 [2] (⤴️..⤵️) right arrow curving up..right arrow curving down
|
||||
2B05..2B07 ; Emoji # 4.0 [3] (⬅️..⬇️) left arrow..down arrow
|
||||
2B1B..2B1C ; Emoji # 5.1 [2] (⬛..⬜) black large square..white large square
|
||||
2B50 ; Emoji # 5.1 [1] (⭐) star
|
||||
2B55 ; Emoji # 5.2 [1] (⭕) heavy large circle
|
||||
3030 ; Emoji # 1.1 [1] (〰️) wavy dash
|
||||
303D ; Emoji # 3.2 [1] (〽️) part alternation mark
|
||||
3297 ; Emoji # 1.1 [1] (㊗️) Japanese “congratulations” button
|
||||
3299 ; Emoji # 1.1 [1] (㊙️) Japanese “secret” button
|
||||
1F004 ; Emoji # 5.1 [1] (🀄) mahjong red dragon
|
||||
1F0CF ; Emoji # 6.0 [1] (🃏) joker
|
||||
1F170..1F171 ; Emoji # 6.0 [2] (🅰️..🅱️) A button (blood type)..B button (blood type)
|
||||
1F17E ; Emoji # 6.0 [1] (🅾️) O button (blood type)
|
||||
1F17F ; Emoji # 5.2 [1] (🅿️) P button
|
||||
1F18E ; Emoji # 6.0 [1] (🆎) AB button (blood type)
|
||||
1F191..1F19A ; Emoji # 6.0 [10] (🆑..🆚) CL button..VS button
|
||||
1F1E6..1F1FF ; Emoji # 6.0 [26] (🇦..🇿) regional indicator symbol letter a..regional indicator symbol letter z
|
||||
1F201..1F202 ; Emoji # 6.0 [2] (🈁..🈂️) Japanese “here” button..Japanese “service charge” button
|
||||
1F21A ; Emoji # 5.2 [1] (🈚) Japanese “free of charge” button
|
||||
1F22F ; Emoji # 5.2 [1] (🈯) Japanese “reserved” button
|
||||
1F232..1F23A ; Emoji # 6.0 [9] (🈲..🈺) Japanese “prohibited” button..Japanese “open for business” button
|
||||
1F250..1F251 ; Emoji # 6.0 [2] (🉐..🉑) Japanese “bargain” button..Japanese “acceptable” button
|
||||
1F300..1F320 ; Emoji # 6.0 [33] (🌀..🌠) cyclone..shooting star
|
||||
1F321 ; Emoji # 7.0 [1] (🌡️) thermometer
|
||||
1F324..1F32C ; Emoji # 7.0 [9] (🌤️..🌬️) sun behind small cloud..wind face
|
||||
1F32D..1F32F ; Emoji # 8.0 [3] (🌭..🌯) hot dog..burrito
|
||||
1F330..1F335 ; Emoji # 6.0 [6] (🌰..🌵) chestnut..cactus
|
||||
1F336 ; Emoji # 7.0 [1] (🌶️) hot pepper
|
||||
1F337..1F37C ; Emoji # 6.0 [70] (🌷..🍼) tulip..baby bottle
|
||||
1F37D ; Emoji # 7.0 [1] (🍽️) fork and knife with plate
|
||||
1F37E..1F37F ; Emoji # 8.0 [2] (🍾..🍿) bottle with popping cork..popcorn
|
||||
1F380..1F393 ; Emoji # 6.0 [20] (🎀..🎓) ribbon..graduation cap
|
||||
1F396..1F397 ; Emoji # 7.0 [2] (🎖️..🎗️) military medal..reminder ribbon
|
||||
1F399..1F39B ; Emoji # 7.0 [3] (🎙️..🎛️) studio microphone..control knobs
|
||||
1F39E..1F39F ; Emoji # 7.0 [2] (🎞️..🎟️) film frames..admission tickets
|
||||
1F3A0..1F3C4 ; Emoji # 6.0 [37] (🎠..🏄) carousel horse..person surfing
|
||||
1F3C5 ; Emoji # 7.0 [1] (🏅) sports medal
|
||||
1F3C6..1F3CA ; Emoji # 6.0 [5] (🏆..🏊) trophy..person swimming
|
||||
1F3CB..1F3CE ; Emoji # 7.0 [4] (🏋️..🏎️) person lifting weights..racing car
|
||||
1F3CF..1F3D3 ; Emoji # 8.0 [5] (🏏..🏓) cricket game..ping pong
|
||||
1F3D4..1F3DF ; Emoji # 7.0 [12] (🏔️..🏟️) snow-capped mountain..stadium
|
||||
1F3E0..1F3F0 ; Emoji # 6.0 [17] (🏠..🏰) house..castle
|
||||
1F3F3..1F3F5 ; Emoji # 7.0 [3] (🏳️..🏵️) white flag..rosette
|
||||
1F3F7 ; Emoji # 7.0 [1] (🏷️) label
|
||||
1F3F8..1F3FF ; Emoji # 8.0 [8] (🏸..🏿) badminton..dark skin tone
|
||||
1F400..1F43E ; Emoji # 6.0 [63] (🐀..🐾) rat..paw prints
|
||||
1F43F ; Emoji # 7.0 [1] (🐿️) chipmunk
|
||||
1F440 ; Emoji # 6.0 [1] (👀) eyes
|
||||
1F441 ; Emoji # 7.0 [1] (👁️) eye
|
||||
1F442..1F4F7 ; Emoji # 6.0[182] (👂..📷) ear..camera
|
||||
1F4F8 ; Emoji # 7.0 [1] (📸) camera with flash
|
||||
1F4F9..1F4FC ; Emoji # 6.0 [4] (📹..📼) video camera..videocassette
|
||||
1F4FD ; Emoji # 7.0 [1] (📽️) film projector
|
||||
1F4FF ; Emoji # 8.0 [1] (📿) prayer beads
|
||||
1F500..1F53D ; Emoji # 6.0 [62] (🔀..🔽) shuffle tracks button..downwards button
|
||||
1F549..1F54A ; Emoji # 7.0 [2] (🕉️..🕊️) om..dove
|
||||
1F54B..1F54E ; Emoji # 8.0 [4] (🕋..🕎) kaaba..menorah
|
||||
1F550..1F567 ; Emoji # 6.0 [24] (🕐..🕧) one o’clock..twelve-thirty
|
||||
1F56F..1F570 ; Emoji # 7.0 [2] (🕯️..🕰️) candle..mantelpiece clock
|
||||
1F573..1F579 ; Emoji # 7.0 [7] (🕳️..🕹️) hole..joystick
|
||||
1F57A ; Emoji # 9.0 [1] (🕺) man dancing
|
||||
1F587 ; Emoji # 7.0 [1] (🖇️) linked paperclips
|
||||
1F58A..1F58D ; Emoji # 7.0 [4] (🖊️..🖍️) pen..crayon
|
||||
1F590 ; Emoji # 7.0 [1] (🖐️) hand with fingers splayed
|
||||
1F595..1F596 ; Emoji # 7.0 [2] (🖕..🖖) middle finger..vulcan salute
|
||||
1F5A4 ; Emoji # 9.0 [1] (🖤) black heart
|
||||
1F5A5 ; Emoji # 7.0 [1] (🖥️) desktop computer
|
||||
1F5A8 ; Emoji # 7.0 [1] (🖨️) printer
|
||||
1F5B1..1F5B2 ; Emoji # 7.0 [2] (🖱️..🖲️) computer mouse..trackball
|
||||
1F5BC ; Emoji # 7.0 [1] (🖼️) framed picture
|
||||
1F5C2..1F5C4 ; Emoji # 7.0 [3] (🗂️..🗄️) card index dividers..file cabinet
|
||||
1F5D1..1F5D3 ; Emoji # 7.0 [3] (🗑️..🗓️) wastebasket..spiral calendar
|
||||
1F5DC..1F5DE ; Emoji # 7.0 [3] (🗜️..🗞️) clamp..rolled-up newspaper
|
||||
1F5E1 ; Emoji # 7.0 [1] (🗡️) dagger
|
||||
1F5E3 ; Emoji # 7.0 [1] (🗣️) speaking head
|
||||
1F5E8 ; Emoji # 7.0 [1] (🗨️) left speech bubble
|
||||
1F5EF ; Emoji # 7.0 [1] (🗯️) right anger bubble
|
||||
1F5F3 ; Emoji # 7.0 [1] (🗳️) ballot box with ballot
|
||||
1F5FA ; Emoji # 7.0 [1] (🗺️) world map
|
||||
1F5FB..1F5FF ; Emoji # 6.0 [5] (🗻..🗿) mount fuji..moai
|
||||
1F600 ; Emoji # 6.1 [1] (😀) grinning face
|
||||
1F601..1F610 ; Emoji # 6.0 [16] (😁..😐) beaming face with smiling eyes..neutral face
|
||||
1F611 ; Emoji # 6.1 [1] (😑) expressionless face
|
||||
1F612..1F614 ; Emoji # 6.0 [3] (😒..😔) unamused face..pensive face
|
||||
1F615 ; Emoji # 6.1 [1] (😕) confused face
|
||||
1F616 ; Emoji # 6.0 [1] (😖) confounded face
|
||||
1F617 ; Emoji # 6.1 [1] (😗) kissing face
|
||||
1F618 ; Emoji # 6.0 [1] (😘) face blowing a kiss
|
||||
1F619 ; Emoji # 6.1 [1] (😙) kissing face with smiling eyes
|
||||
1F61A ; Emoji # 6.0 [1] (😚) kissing face with closed eyes
|
||||
1F61B ; Emoji # 6.1 [1] (😛) face with tongue
|
||||
1F61C..1F61E ; Emoji # 6.0 [3] (😜..😞) winking face with tongue..disappointed face
|
||||
1F61F ; Emoji # 6.1 [1] (😟) worried face
|
||||
1F620..1F625 ; Emoji # 6.0 [6] (😠..😥) angry face..sad but relieved face
|
||||
1F626..1F627 ; Emoji # 6.1 [2] (😦..😧) frowning face with open mouth..anguished face
|
||||
1F628..1F62B ; Emoji # 6.0 [4] (😨..😫) fearful face..tired face
|
||||
1F62C ; Emoji # 6.1 [1] (😬) grimacing face
|
||||
1F62D ; Emoji # 6.0 [1] (😭) loudly crying face
|
||||
1F62E..1F62F ; Emoji # 6.1 [2] (😮..😯) face with open mouth..hushed face
|
||||
1F630..1F633 ; Emoji # 6.0 [4] (😰..😳) anxious face with sweat..flushed face
|
||||
1F634 ; Emoji # 6.1 [1] (😴) sleeping face
|
||||
1F635..1F640 ; Emoji # 6.0 [12] (😵..🙀) dizzy face..weary cat face
|
||||
1F641..1F642 ; Emoji # 7.0 [2] (🙁..🙂) slightly frowning face..slightly smiling face
|
||||
1F643..1F644 ; Emoji # 8.0 [2] (🙃..🙄) upside-down face..face with rolling eyes
|
||||
1F645..1F64F ; Emoji # 6.0 [11] (🙅..🙏) person gesturing NO..folded hands
|
||||
1F680..1F6C5 ; Emoji # 6.0 [70] (🚀..🛅) rocket..left luggage
|
||||
1F6CB..1F6CF ; Emoji # 7.0 [5] (🛋️..🛏️) couch and lamp..bed
|
||||
1F6D0 ; Emoji # 8.0 [1] (🛐) place of worship
|
||||
1F6D1..1F6D2 ; Emoji # 9.0 [2] (🛑..🛒) stop sign..shopping cart
|
||||
1F6E0..1F6E5 ; Emoji # 7.0 [6] (🛠️..🛥️) hammer and wrench..motor boat
|
||||
1F6E9 ; Emoji # 7.0 [1] (🛩️) small airplane
|
||||
1F6EB..1F6EC ; Emoji # 7.0 [2] (🛫..🛬) airplane departure..airplane arrival
|
||||
1F6F0 ; Emoji # 7.0 [1] (🛰️) satellite
|
||||
1F6F3 ; Emoji # 7.0 [1] (🛳️) passenger ship
|
||||
1F6F4..1F6F6 ; Emoji # 9.0 [3] (🛴..🛶) kick scooter..canoe
|
||||
1F6F7..1F6F8 ; Emoji # 10.0 [2] (🛷..🛸) sled..flying saucer
|
||||
1F6F9 ; Emoji # 11.0 [1] (🛹) skateboard
|
||||
1F910..1F918 ; Emoji # 8.0 [9] (🤐..🤘) zipper-mouth face..sign of the horns
|
||||
1F919..1F91E ; Emoji # 9.0 [6] (🤙..🤞) call me hand..crossed fingers
|
||||
1F91F ; Emoji # 10.0 [1] (🤟) love-you gesture
|
||||
1F920..1F927 ; Emoji # 9.0 [8] (🤠..🤧) cowboy hat face..sneezing face
|
||||
1F928..1F92F ; Emoji # 10.0 [8] (🤨..🤯) face with raised eyebrow..exploding head
|
||||
1F930 ; Emoji # 9.0 [1] (🤰) pregnant woman
|
||||
1F931..1F932 ; Emoji # 10.0 [2] (🤱..🤲) breast-feeding..palms up together
|
||||
1F933..1F93A ; Emoji # 9.0 [8] (🤳..🤺) selfie..person fencing
|
||||
1F93C..1F93E ; Emoji # 9.0 [3] (🤼..🤾) people wrestling..person playing handball
|
||||
1F940..1F945 ; Emoji # 9.0 [6] (🥀..🥅) wilted flower..goal net
|
||||
1F947..1F94B ; Emoji # 9.0 [5] (🥇..🥋) 1st place medal..martial arts uniform
|
||||
1F94C ; Emoji # 10.0 [1] (🥌) curling stone
|
||||
1F94D..1F94F ; Emoji # 11.0 [3] (🥍..🥏) lacrosse..flying disc
|
||||
1F950..1F95E ; Emoji # 9.0 [15] (🥐..🥞) croissant..pancakes
|
||||
1F95F..1F96B ; Emoji # 10.0 [13] (🥟..🥫) dumpling..canned food
|
||||
1F96C..1F970 ; Emoji # 11.0 [5] (🥬..🥰) leafy green..smiling face with 3 hearts
|
||||
1F973..1F976 ; Emoji # 11.0 [4] (🥳..🥶) partying face..cold face
|
||||
1F97A ; Emoji # 11.0 [1] (🥺) pleading face
|
||||
1F97C..1F97F ; Emoji # 11.0 [4] (🥼..🥿) lab coat..woman’s flat shoe
|
||||
1F980..1F984 ; Emoji # 8.0 [5] (🦀..🦄) crab..unicorn face
|
||||
1F985..1F991 ; Emoji # 9.0 [13] (🦅..🦑) eagle..squid
|
||||
1F992..1F997 ; Emoji # 10.0 [6] (🦒..🦗) giraffe..cricket
|
||||
1F998..1F9A2 ; Emoji # 11.0 [11] (🦘..🦢) kangaroo..swan
|
||||
1F9B0..1F9B9 ; Emoji # 11.0 [10] (🦰..🦹) red-haired..supervillain
|
||||
1F9C0 ; Emoji # 8.0 [1] (🧀) cheese wedge
|
||||
1F9C1..1F9C2 ; Emoji # 11.0 [2] (🧁..🧂) cupcake..salt
|
||||
1F9D0..1F9E6 ; Emoji # 10.0 [23] (🧐..🧦) face with monocle..socks
|
||||
1F9E7..1F9FF ; Emoji # 11.0 [25] (🧧..🧿) red envelope..nazar amulet
|
||||
|
||||
# Total elements: 1250
|
||||
|
||||
# ================================================
|
||||
|
||||
# All omitted code points have Emoji_Presentation=No
|
||||
# @missing: 0000..10FFFF ; Emoji_Presentation ; No
|
||||
|
||||
231A..231B ; Emoji_Presentation # 1.1 [2] (⌚..⌛) watch..hourglass done
|
||||
23E9..23EC ; Emoji_Presentation # 6.0 [4] (⏩..⏬) fast-forward button..fast down button
|
||||
23F0 ; Emoji_Presentation # 6.0 [1] (⏰) alarm clock
|
||||
23F3 ; Emoji_Presentation # 6.0 [1] (⏳) hourglass not done
|
||||
25FD..25FE ; Emoji_Presentation # 3.2 [2] (◽..◾) white medium-small square..black medium-small square
|
||||
2614..2615 ; Emoji_Presentation # 4.0 [2] (☔..☕) umbrella with rain drops..hot beverage
|
||||
2648..2653 ; Emoji_Presentation # 1.1 [12] (♈..♓) Aries..Pisces
|
||||
267F ; Emoji_Presentation # 4.1 [1] (♿) wheelchair symbol
|
||||
2693 ; Emoji_Presentation # 4.1 [1] (⚓) anchor
|
||||
26A1 ; Emoji_Presentation # 4.0 [1] (⚡) high voltage
|
||||
26AA..26AB ; Emoji_Presentation # 4.1 [2] (⚪..⚫) white circle..black circle
|
||||
26BD..26BE ; Emoji_Presentation # 5.2 [2] (⚽..⚾) soccer ball..baseball
|
||||
26C4..26C5 ; Emoji_Presentation # 5.2 [2] (⛄..⛅) snowman without snow..sun behind cloud
|
||||
26CE ; Emoji_Presentation # 6.0 [1] (⛎) Ophiuchus
|
||||
26D4 ; Emoji_Presentation # 5.2 [1] (⛔) no entry
|
||||
26EA ; Emoji_Presentation # 5.2 [1] (⛪) church
|
||||
26F2..26F3 ; Emoji_Presentation # 5.2 [2] (⛲..⛳) fountain..flag in hole
|
||||
26F5 ; Emoji_Presentation # 5.2 [1] (⛵) sailboat
|
||||
26FA ; Emoji_Presentation # 5.2 [1] (⛺) tent
|
||||
26FD ; Emoji_Presentation # 5.2 [1] (⛽) fuel pump
|
||||
2705 ; Emoji_Presentation # 6.0 [1] (✅) white heavy check mark
|
||||
270A..270B ; Emoji_Presentation # 6.0 [2] (✊..✋) raised fist..raised hand
|
||||
2728 ; Emoji_Presentation # 6.0 [1] (✨) sparkles
|
||||
274C ; Emoji_Presentation # 6.0 [1] (❌) cross mark
|
||||
274E ; Emoji_Presentation # 6.0 [1] (❎) cross mark button
|
||||
2753..2755 ; Emoji_Presentation # 6.0 [3] (❓..❕) question mark..white exclamation mark
|
||||
2757 ; Emoji_Presentation # 5.2 [1] (❗) exclamation mark
|
||||
2795..2797 ; Emoji_Presentation # 6.0 [3] (➕..➗) heavy plus sign..heavy division sign
|
||||
27B0 ; Emoji_Presentation # 6.0 [1] (➰) curly loop
|
||||
27BF ; Emoji_Presentation # 6.0 [1] (➿) double curly loop
|
||||
2B1B..2B1C ; Emoji_Presentation # 5.1 [2] (⬛..⬜) black large square..white large square
|
||||
2B50 ; Emoji_Presentation # 5.1 [1] (⭐) star
|
||||
2B55 ; Emoji_Presentation # 5.2 [1] (⭕) heavy large circle
|
||||
1F004 ; Emoji_Presentation # 5.1 [1] (🀄) mahjong red dragon
|
||||
1F0CF ; Emoji_Presentation # 6.0 [1] (🃏) joker
|
||||
1F18E ; Emoji_Presentation # 6.0 [1] (🆎) AB button (blood type)
|
||||
1F191..1F19A ; Emoji_Presentation # 6.0 [10] (🆑..🆚) CL button..VS button
|
||||
1F1E6..1F1FF ; Emoji_Presentation # 6.0 [26] (🇦..🇿) regional indicator symbol letter a..regional indicator symbol letter z
|
||||
1F201 ; Emoji_Presentation # 6.0 [1] (🈁) Japanese “here” button
|
||||
1F21A ; Emoji_Presentation # 5.2 [1] (🈚) Japanese “free of charge” button
|
||||
1F22F ; Emoji_Presentation # 5.2 [1] (🈯) Japanese “reserved” button
|
||||
1F232..1F236 ; Emoji_Presentation # 6.0 [5] (🈲..🈶) Japanese “prohibited” button..Japanese “not free of charge” button
|
||||
1F238..1F23A ; Emoji_Presentation # 6.0 [3] (🈸..🈺) Japanese “application” button..Japanese “open for business” button
|
||||
1F250..1F251 ; Emoji_Presentation # 6.0 [2] (🉐..🉑) Japanese “bargain” button..Japanese “acceptable” button
|
||||
1F300..1F320 ; Emoji_Presentation # 6.0 [33] (🌀..🌠) cyclone..shooting star
|
||||
1F32D..1F32F ; Emoji_Presentation # 8.0 [3] (🌭..🌯) hot dog..burrito
|
||||
1F330..1F335 ; Emoji_Presentation # 6.0 [6] (🌰..🌵) chestnut..cactus
|
||||
1F337..1F37C ; Emoji_Presentation # 6.0 [70] (🌷..🍼) tulip..baby bottle
|
||||
1F37E..1F37F ; Emoji_Presentation # 8.0 [2] (🍾..🍿) bottle with popping cork..popcorn
|
||||
1F380..1F393 ; Emoji_Presentation # 6.0 [20] (🎀..🎓) ribbon..graduation cap
|
||||
1F3A0..1F3C4 ; Emoji_Presentation # 6.0 [37] (🎠..🏄) carousel horse..person surfing
|
||||
1F3C5 ; Emoji_Presentation # 7.0 [1] (🏅) sports medal
|
||||
1F3C6..1F3CA ; Emoji_Presentation # 6.0 [5] (🏆..🏊) trophy..person swimming
|
||||
1F3CF..1F3D3 ; Emoji_Presentation # 8.0 [5] (🏏..🏓) cricket game..ping pong
|
||||
1F3E0..1F3F0 ; Emoji_Presentation # 6.0 [17] (🏠..🏰) house..castle
|
||||
1F3F4 ; Emoji_Presentation # 7.0 [1] (🏴) black flag
|
||||
1F3F8..1F3FF ; Emoji_Presentation # 8.0 [8] (🏸..🏿) badminton..dark skin tone
|
||||
1F400..1F43E ; Emoji_Presentation # 6.0 [63] (🐀..🐾) rat..paw prints
|
||||
1F440 ; Emoji_Presentation # 6.0 [1] (👀) eyes
|
||||
1F442..1F4F7 ; Emoji_Presentation # 6.0[182] (👂..📷) ear..camera
|
||||
1F4F8 ; Emoji_Presentation # 7.0 [1] (📸) camera with flash
|
||||
1F4F9..1F4FC ; Emoji_Presentation # 6.0 [4] (📹..📼) video camera..videocassette
|
||||
1F4FF ; Emoji_Presentation # 8.0 [1] (📿) prayer beads
|
||||
1F500..1F53D ; Emoji_Presentation # 6.0 [62] (🔀..🔽) shuffle tracks button..downwards button
|
||||
1F54B..1F54E ; Emoji_Presentation # 8.0 [4] (🕋..🕎) kaaba..menorah
|
||||
1F550..1F567 ; Emoji_Presentation # 6.0 [24] (🕐..🕧) one o’clock..twelve-thirty
|
||||
1F57A ; Emoji_Presentation # 9.0 [1] (🕺) man dancing
|
||||
1F595..1F596 ; Emoji_Presentation # 7.0 [2] (🖕..🖖) middle finger..vulcan salute
|
||||
1F5A4 ; Emoji_Presentation # 9.0 [1] (🖤) black heart
|
||||
1F5FB..1F5FF ; Emoji_Presentation # 6.0 [5] (🗻..🗿) mount fuji..moai
|
||||
1F600 ; Emoji_Presentation # 6.1 [1] (😀) grinning face
|
||||
1F601..1F610 ; Emoji_Presentation # 6.0 [16] (😁..😐) beaming face with smiling eyes..neutral face
|
||||
1F611 ; Emoji_Presentation # 6.1 [1] (😑) expressionless face
|
||||
1F612..1F614 ; Emoji_Presentation # 6.0 [3] (😒..😔) unamused face..pensive face
|
||||
1F615 ; Emoji_Presentation # 6.1 [1] (😕) confused face
|
||||
1F616 ; Emoji_Presentation # 6.0 [1] (😖) confounded face
|
||||
1F617 ; Emoji_Presentation # 6.1 [1] (😗) kissing face
|
||||
1F618 ; Emoji_Presentation # 6.0 [1] (😘) face blowing a kiss
|
||||
1F619 ; Emoji_Presentation # 6.1 [1] (😙) kissing face with smiling eyes
|
||||
1F61A ; Emoji_Presentation # 6.0 [1] (😚) kissing face with closed eyes
|
||||
1F61B ; Emoji_Presentation # 6.1 [1] (😛) face with tongue
|
||||
1F61C..1F61E ; Emoji_Presentation # 6.0 [3] (😜..😞) winking face with tongue..disappointed face
|
||||
1F61F ; Emoji_Presentation # 6.1 [1] (😟) worried face
|
||||
1F620..1F625 ; Emoji_Presentation # 6.0 [6] (😠..😥) angry face..sad but relieved face
|
||||
1F626..1F627 ; Emoji_Presentation # 6.1 [2] (😦..😧) frowning face with open mouth..anguished face
|
||||
1F628..1F62B ; Emoji_Presentation # 6.0 [4] (😨..😫) fearful face..tired face
|
||||
1F62C ; Emoji_Presentation # 6.1 [1] (😬) grimacing face
|
||||
1F62D ; Emoji_Presentation # 6.0 [1] (😭) loudly crying face
|
||||
1F62E..1F62F ; Emoji_Presentation # 6.1 [2] (😮..😯) face with open mouth..hushed face
|
||||
1F630..1F633 ; Emoji_Presentation # 6.0 [4] (😰..😳) anxious face with sweat..flushed face
|
||||
1F634 ; Emoji_Presentation # 6.1 [1] (😴) sleeping face
|
||||
1F635..1F640 ; Emoji_Presentation # 6.0 [12] (😵..🙀) dizzy face..weary cat face
|
||||
1F641..1F642 ; Emoji_Presentation # 7.0 [2] (🙁..🙂) slightly frowning face..slightly smiling face
|
||||
1F643..1F644 ; Emoji_Presentation # 8.0 [2] (🙃..🙄) upside-down face..face with rolling eyes
|
||||
1F645..1F64F ; Emoji_Presentation # 6.0 [11] (🙅..🙏) person gesturing NO..folded hands
|
||||
1F680..1F6C5 ; Emoji_Presentation # 6.0 [70] (🚀..🛅) rocket..left luggage
|
||||
1F6CC ; Emoji_Presentation # 7.0 [1] (🛌) person in bed
|
||||
1F6D0 ; Emoji_Presentation # 8.0 [1] (🛐) place of worship
|
||||
1F6D1..1F6D2 ; Emoji_Presentation # 9.0 [2] (🛑..🛒) stop sign..shopping cart
|
||||
1F6EB..1F6EC ; Emoji_Presentation # 7.0 [2] (🛫..🛬) airplane departure..airplane arrival
|
||||
1F6F4..1F6F6 ; Emoji_Presentation # 9.0 [3] (🛴..🛶) kick scooter..canoe
|
||||
1F6F7..1F6F8 ; Emoji_Presentation # 10.0 [2] (🛷..🛸) sled..flying saucer
|
||||
1F6F9 ; Emoji_Presentation # 11.0 [1] (🛹) skateboard
|
||||
1F910..1F918 ; Emoji_Presentation # 8.0 [9] (🤐..🤘) zipper-mouth face..sign of the horns
|
||||
1F919..1F91E ; Emoji_Presentation # 9.0 [6] (🤙..🤞) call me hand..crossed fingers
|
||||
1F91F ; Emoji_Presentation # 10.0 [1] (🤟) love-you gesture
|
||||
1F920..1F927 ; Emoji_Presentation # 9.0 [8] (🤠..🤧) cowboy hat face..sneezing face
|
||||
1F928..1F92F ; Emoji_Presentation # 10.0 [8] (🤨..🤯) face with raised eyebrow..exploding head
|
||||
1F930 ; Emoji_Presentation # 9.0 [1] (🤰) pregnant woman
|
||||
1F931..1F932 ; Emoji_Presentation # 10.0 [2] (🤱..🤲) breast-feeding..palms up together
|
||||
1F933..1F93A ; Emoji_Presentation # 9.0 [8] (🤳..🤺) selfie..person fencing
|
||||
1F93C..1F93E ; Emoji_Presentation # 9.0 [3] (🤼..🤾) people wrestling..person playing handball
|
||||
1F940..1F945 ; Emoji_Presentation # 9.0 [6] (🥀..🥅) wilted flower..goal net
|
||||
1F947..1F94B ; Emoji_Presentation # 9.0 [5] (🥇..🥋) 1st place medal..martial arts uniform
|
||||
1F94C ; Emoji_Presentation # 10.0 [1] (🥌) curling stone
|
||||
1F94D..1F94F ; Emoji_Presentation # 11.0 [3] (🥍..🥏) lacrosse..flying disc
|
||||
1F950..1F95E ; Emoji_Presentation # 9.0 [15] (🥐..🥞) croissant..pancakes
|
||||
1F95F..1F96B ; Emoji_Presentation # 10.0 [13] (🥟..🥫) dumpling..canned food
|
||||
1F96C..1F970 ; Emoji_Presentation # 11.0 [5] (🥬..🥰) leafy green..smiling face with 3 hearts
|
||||
1F973..1F976 ; Emoji_Presentation # 11.0 [4] (🥳..🥶) partying face..cold face
|
||||
1F97A ; Emoji_Presentation # 11.0 [1] (🥺) pleading face
|
||||
1F97C..1F97F ; Emoji_Presentation # 11.0 [4] (🥼..🥿) lab coat..woman’s flat shoe
|
||||
1F980..1F984 ; Emoji_Presentation # 8.0 [5] (🦀..🦄) crab..unicorn face
|
||||
1F985..1F991 ; Emoji_Presentation # 9.0 [13] (🦅..🦑) eagle..squid
|
||||
1F992..1F997 ; Emoji_Presentation # 10.0 [6] (🦒..🦗) giraffe..cricket
|
||||
1F998..1F9A2 ; Emoji_Presentation # 11.0 [11] (🦘..🦢) kangaroo..swan
|
||||
1F9B0..1F9B9 ; Emoji_Presentation # 11.0 [10] (🦰..🦹) red-haired..supervillain
|
||||
1F9C0 ; Emoji_Presentation # 8.0 [1] (🧀) cheese wedge
|
||||
1F9C1..1F9C2 ; Emoji_Presentation # 11.0 [2] (🧁..🧂) cupcake..salt
|
||||
1F9D0..1F9E6 ; Emoji_Presentation # 10.0 [23] (🧐..🧦) face with monocle..socks
|
||||
1F9E7..1F9FF ; Emoji_Presentation # 11.0 [25] (🧧..🧿) red envelope..nazar amulet
|
||||
|
||||
# Total elements: 1032
|
||||
|
||||
# ================================================
|
||||
|
||||
# All omitted code points have Emoji_Modifier=No
|
||||
# @missing: 0000..10FFFF ; Emoji_Modifier ; No
|
||||
|
||||
1F3FB..1F3FF ; Emoji_Modifier # 8.0 [5] (🏻..🏿) light skin tone..dark skin tone
|
||||
|
||||
# Total elements: 5
|
||||
|
||||
# ================================================
|
||||
|
||||
# All omitted code points have Emoji_Modifier_Base=No
|
||||
# @missing: 0000..10FFFF ; Emoji_Modifier_Base ; No
|
||||
|
||||
261D ; Emoji_Modifier_Base # 1.1 [1] (☝️) index pointing up
|
||||
26F9 ; Emoji_Modifier_Base # 5.2 [1] (⛹️) person bouncing ball
|
||||
270A..270B ; Emoji_Modifier_Base # 6.0 [2] (✊..✋) raised fist..raised hand
|
||||
270C..270D ; Emoji_Modifier_Base # 1.1 [2] (✌️..✍️) victory hand..writing hand
|
||||
1F385 ; Emoji_Modifier_Base # 6.0 [1] (🎅) Santa Claus
|
||||
1F3C2..1F3C4 ; Emoji_Modifier_Base # 6.0 [3] (🏂..🏄) snowboarder..person surfing
|
||||
1F3C7 ; Emoji_Modifier_Base # 6.0 [1] (🏇) horse racing
|
||||
1F3CA ; Emoji_Modifier_Base # 6.0 [1] (🏊) person swimming
|
||||
1F3CB..1F3CC ; Emoji_Modifier_Base # 7.0 [2] (🏋️..🏌️) person lifting weights..person golfing
|
||||
1F442..1F443 ; Emoji_Modifier_Base # 6.0 [2] (👂..👃) ear..nose
|
||||
1F446..1F450 ; Emoji_Modifier_Base # 6.0 [11] (👆..👐) backhand index pointing up..open hands
|
||||
1F466..1F469 ; Emoji_Modifier_Base # 6.0 [4] (👦..👩) boy..woman
|
||||
1F46E ; Emoji_Modifier_Base # 6.0 [1] (👮) police officer
|
||||
1F470..1F478 ; Emoji_Modifier_Base # 6.0 [9] (👰..👸) bride with veil..princess
|
||||
1F47C ; Emoji_Modifier_Base # 6.0 [1] (👼) baby angel
|
||||
1F481..1F483 ; Emoji_Modifier_Base # 6.0 [3] (💁..💃) person tipping hand..woman dancing
|
||||
1F485..1F487 ; Emoji_Modifier_Base # 6.0 [3] (💅..💇) nail polish..person getting haircut
|
||||
1F4AA ; Emoji_Modifier_Base # 6.0 [1] (💪) flexed biceps
|
||||
1F574..1F575 ; Emoji_Modifier_Base # 7.0 [2] (🕴️..🕵️) man in suit levitating..detective
|
||||
1F57A ; Emoji_Modifier_Base # 9.0 [1] (🕺) man dancing
|
||||
1F590 ; Emoji_Modifier_Base # 7.0 [1] (🖐️) hand with fingers splayed
|
||||
1F595..1F596 ; Emoji_Modifier_Base # 7.0 [2] (🖕..🖖) middle finger..vulcan salute
|
||||
1F645..1F647 ; Emoji_Modifier_Base # 6.0 [3] (🙅..🙇) person gesturing NO..person bowing
|
||||
1F64B..1F64F ; Emoji_Modifier_Base # 6.0 [5] (🙋..🙏) person raising hand..folded hands
|
||||
1F6A3 ; Emoji_Modifier_Base # 6.0 [1] (🚣) person rowing boat
|
||||
1F6B4..1F6B6 ; Emoji_Modifier_Base # 6.0 [3] (🚴..🚶) person biking..person walking
|
||||
1F6C0 ; Emoji_Modifier_Base # 6.0 [1] (🛀) person taking bath
|
||||
1F6CC ; Emoji_Modifier_Base # 7.0 [1] (🛌) person in bed
|
||||
1F918 ; Emoji_Modifier_Base # 8.0 [1] (🤘) sign of the horns
|
||||
1F919..1F91C ; Emoji_Modifier_Base # 9.0 [4] (🤙..🤜) call me hand..right-facing fist
|
||||
1F91E ; Emoji_Modifier_Base # 9.0 [1] (🤞) crossed fingers
|
||||
1F91F ; Emoji_Modifier_Base # 10.0 [1] (🤟) love-you gesture
|
||||
1F926 ; Emoji_Modifier_Base # 9.0 [1] (🤦) person facepalming
|
||||
1F930 ; Emoji_Modifier_Base # 9.0 [1] (🤰) pregnant woman
|
||||
1F931..1F932 ; Emoji_Modifier_Base # 10.0 [2] (🤱..🤲) breast-feeding..palms up together
|
||||
1F933..1F939 ; Emoji_Modifier_Base # 9.0 [7] (🤳..🤹) selfie..person juggling
|
||||
1F93D..1F93E ; Emoji_Modifier_Base # 9.0 [2] (🤽..🤾) person playing water polo..person playing handball
|
||||
1F9B5..1F9B6 ; Emoji_Modifier_Base # 11.0 [2] (🦵..🦶) leg..foot
|
||||
1F9B8..1F9B9 ; Emoji_Modifier_Base # 11.0 [2] (🦸..🦹) superhero..supervillain
|
||||
1F9D1..1F9DD ; Emoji_Modifier_Base # 10.0 [13] (🧑..🧝) adult..elf
|
||||
|
||||
# Total elements: 106
|
||||
|
||||
# ================================================
|
||||
|
||||
# All omitted code points have Emoji_Component=No
|
||||
# @missing: 0000..10FFFF ; Emoji_Component ; No
|
||||
|
||||
0023 ; Emoji_Component # 1.1 [1] (#️) number sign
|
||||
002A ; Emoji_Component # 1.1 [1] (*️) asterisk
|
||||
0030..0039 ; Emoji_Component # 1.1 [10] (0️..9️) digit zero..digit nine
|
||||
200D ; Emoji_Component # 1.1 [1] () zero width joiner
|
||||
20E3 ; Emoji_Component # 3.0 [1] (⃣) combining enclosing keycap
|
||||
FE0F ; Emoji_Component # 3.2 [1] () VARIATION SELECTOR-16
|
||||
1F1E6..1F1FF ; Emoji_Component # 6.0 [26] (🇦..🇿) regional indicator symbol letter a..regional indicator symbol letter z
|
||||
1F3FB..1F3FF ; Emoji_Component # 8.0 [5] (🏻..🏿) light skin tone..dark skin tone
|
||||
1F9B0..1F9B3 ; Emoji_Component # 11.0 [4] (🦰..🦳) red-haired..white-haired
|
||||
E0020..E007F ; Emoji_Component # 3.1 [96] (..) tag space..cancel tag
|
||||
|
||||
# Total elements: 146
|
||||
|
||||
# ================================================
|
||||
|
||||
# All omitted code points have Extended_Pictographic=No
|
||||
# @missing: 0000..10FFFF ; Extended_Pictographic ; No
|
||||
|
||||
00A9 ; Extended_Pictographic# 1.1 [1] (©️) copyright
|
||||
00AE ; Extended_Pictographic# 1.1 [1] (®️) registered
|
||||
203C ; Extended_Pictographic# 1.1 [1] (‼️) double exclamation mark
|
||||
2049 ; Extended_Pictographic# 3.0 [1] (⁉️) exclamation question mark
|
||||
2122 ; Extended_Pictographic# 1.1 [1] (™️) trade mark
|
||||
2139 ; Extended_Pictographic# 3.0 [1] (ℹ️) information
|
||||
2194..2199 ; Extended_Pictographic# 1.1 [6] (↔️..↙️) left-right arrow..down-left arrow
|
||||
21A9..21AA ; Extended_Pictographic# 1.1 [2] (↩️..↪️) right arrow curving left..left arrow curving right
|
||||
231A..231B ; Extended_Pictographic# 1.1 [2] (⌚..⌛) watch..hourglass done
|
||||
2328 ; Extended_Pictographic# 1.1 [1] (⌨️) keyboard
|
||||
2388 ; Extended_Pictographic# 3.0 [1] (⎈️) HELM SYMBOL
|
||||
23CF ; Extended_Pictographic# 4.0 [1] (⏏️) eject button
|
||||
23E9..23F3 ; Extended_Pictographic# 6.0 [11] (⏩..⏳) fast-forward button..hourglass not done
|
||||
23F8..23FA ; Extended_Pictographic# 7.0 [3] (⏸️..⏺️) pause button..record button
|
||||
24C2 ; Extended_Pictographic# 1.1 [1] (Ⓜ️) circled M
|
||||
25AA..25AB ; Extended_Pictographic# 1.1 [2] (▪️..▫️) black small square..white small square
|
||||
25B6 ; Extended_Pictographic# 1.1 [1] (▶️) play button
|
||||
25C0 ; Extended_Pictographic# 1.1 [1] (◀️) reverse button
|
||||
25FB..25FE ; Extended_Pictographic# 3.2 [4] (◻️..◾) white medium square..black medium-small square
|
||||
2600..2605 ; Extended_Pictographic# 1.1 [6] (☀️..★️) sun..BLACK STAR
|
||||
2607..2612 ; Extended_Pictographic# 1.1 [12] (☇️..☒️) LIGHTNING..BALLOT BOX WITH X
|
||||
2614..2615 ; Extended_Pictographic# 4.0 [2] (☔..☕) umbrella with rain drops..hot beverage
|
||||
2616..2617 ; Extended_Pictographic# 3.2 [2] (☖️..☗️) WHITE SHOGI PIECE..BLACK SHOGI PIECE
|
||||
2618 ; Extended_Pictographic# 4.1 [1] (☘️) shamrock
|
||||
2619 ; Extended_Pictographic# 3.0 [1] (☙️) REVERSED ROTATED FLORAL HEART BULLET
|
||||
261A..266F ; Extended_Pictographic# 1.1 [86] (☚️..♯️) BLACK LEFT POINTING INDEX..MUSIC SHARP SIGN
|
||||
2670..2671 ; Extended_Pictographic# 3.0 [2] (♰️..♱️) WEST SYRIAC CROSS..EAST SYRIAC CROSS
|
||||
2672..267D ; Extended_Pictographic# 3.2 [12] (♲️..♽️) UNIVERSAL RECYCLING SYMBOL..PARTIALLY-RECYCLED PAPER SYMBOL
|
||||
267E..267F ; Extended_Pictographic# 4.1 [2] (♾️..♿) infinity..wheelchair symbol
|
||||
2680..2685 ; Extended_Pictographic# 3.2 [6] (⚀️..⚅️) DIE FACE-1..DIE FACE-6
|
||||
2690..2691 ; Extended_Pictographic# 4.0 [2] (⚐️..⚑️) WHITE FLAG..BLACK FLAG
|
||||
2692..269C ; Extended_Pictographic# 4.1 [11] (⚒️..⚜️) hammer and pick..fleur-de-lis
|
||||
269D ; Extended_Pictographic# 5.1 [1] (⚝️) OUTLINED WHITE STAR
|
||||
269E..269F ; Extended_Pictographic# 5.2 [2] (⚞️..⚟️) THREE LINES CONVERGING RIGHT..THREE LINES CONVERGING LEFT
|
||||
26A0..26A1 ; Extended_Pictographic# 4.0 [2] (⚠️..⚡) warning..high voltage
|
||||
26A2..26B1 ; Extended_Pictographic# 4.1 [16] (⚢️..⚱️) DOUBLED FEMALE SIGN..funeral urn
|
||||
26B2 ; Extended_Pictographic# 5.0 [1] (⚲️) NEUTER
|
||||
26B3..26BC ; Extended_Pictographic# 5.1 [10] (⚳️..⚼️) CERES..SESQUIQUADRATE
|
||||
26BD..26BF ; Extended_Pictographic# 5.2 [3] (⚽..⚿️) soccer ball..SQUARED KEY
|
||||
26C0..26C3 ; Extended_Pictographic# 5.1 [4] (⛀️..⛃️) WHITE DRAUGHTS MAN..BLACK DRAUGHTS KING
|
||||
26C4..26CD ; Extended_Pictographic# 5.2 [10] (⛄..⛍️) snowman without snow..DISABLED CAR
|
||||
26CE ; Extended_Pictographic# 6.0 [1] (⛎) Ophiuchus
|
||||
26CF..26E1 ; Extended_Pictographic# 5.2 [19] (⛏️..⛡️) pick..RESTRICTED LEFT ENTRY-2
|
||||
26E2 ; Extended_Pictographic# 6.0 [1] (⛢️) ASTRONOMICAL SYMBOL FOR URANUS
|
||||
26E3 ; Extended_Pictographic# 5.2 [1] (⛣️) HEAVY CIRCLE WITH STROKE AND TWO DOTS ABOVE
|
||||
26E4..26E7 ; Extended_Pictographic# 6.0 [4] (⛤️..⛧️) PENTAGRAM..INVERTED PENTAGRAM
|
||||
26E8..26FF ; Extended_Pictographic# 5.2 [24] (⛨️..⛿️) BLACK CROSS ON SHIELD..WHITE FLAG WITH HORIZONTAL MIDDLE BLACK STRIPE
|
||||
2700 ; Extended_Pictographic# 7.0 [1] (✀️) BLACK SAFETY SCISSORS
|
||||
2701..2704 ; Extended_Pictographic# 1.1 [4] (✁️..✄️) UPPER BLADE SCISSORS..WHITE SCISSORS
|
||||
2705 ; Extended_Pictographic# 6.0 [1] (✅) white heavy check mark
|
||||
2708..2709 ; Extended_Pictographic# 1.1 [2] (✈️..✉️) airplane..envelope
|
||||
270A..270B ; Extended_Pictographic# 6.0 [2] (✊..✋) raised fist..raised hand
|
||||
270C..2712 ; Extended_Pictographic# 1.1 [7] (✌️..✒️) victory hand..black nib
|
||||
2714 ; Extended_Pictographic# 1.1 [1] (✔️) heavy check mark
|
||||
2716 ; Extended_Pictographic# 1.1 [1] (✖️) heavy multiplication x
|
||||
271D ; Extended_Pictographic# 1.1 [1] (✝️) latin cross
|
||||
2721 ; Extended_Pictographic# 1.1 [1] (✡️) star of David
|
||||
2728 ; Extended_Pictographic# 6.0 [1] (✨) sparkles
|
||||
2733..2734 ; Extended_Pictographic# 1.1 [2] (✳️..✴️) eight-spoked asterisk..eight-pointed star
|
||||
2744 ; Extended_Pictographic# 1.1 [1] (❄️) snowflake
|
||||
2747 ; Extended_Pictographic# 1.1 [1] (❇️) sparkle
|
||||
274C ; Extended_Pictographic# 6.0 [1] (❌) cross mark
|
||||
274E ; Extended_Pictographic# 6.0 [1] (❎) cross mark button
|
||||
2753..2755 ; Extended_Pictographic# 6.0 [3] (❓..❕) question mark..white exclamation mark
|
||||
2757 ; Extended_Pictographic# 5.2 [1] (❗) exclamation mark
|
||||
2763..2767 ; Extended_Pictographic# 1.1 [5] (❣️..❧️) heavy heart exclamation..ROTATED FLORAL HEART BULLET
|
||||
2795..2797 ; Extended_Pictographic# 6.0 [3] (➕..➗) heavy plus sign..heavy division sign
|
||||
27A1 ; Extended_Pictographic# 1.1 [1] (➡️) right arrow
|
||||
27B0 ; Extended_Pictographic# 6.0 [1] (➰) curly loop
|
||||
27BF ; Extended_Pictographic# 6.0 [1] (➿) double curly loop
|
||||
2934..2935 ; Extended_Pictographic# 3.2 [2] (⤴️..⤵️) right arrow curving up..right arrow curving down
|
||||
2B05..2B07 ; Extended_Pictographic# 4.0 [3] (⬅️..⬇️) left arrow..down arrow
|
||||
2B1B..2B1C ; Extended_Pictographic# 5.1 [2] (⬛..⬜) black large square..white large square
|
||||
2B50 ; Extended_Pictographic# 5.1 [1] (⭐) star
|
||||
2B55 ; Extended_Pictographic# 5.2 [1] (⭕) heavy large circle
|
||||
3030 ; Extended_Pictographic# 1.1 [1] (〰️) wavy dash
|
||||
303D ; Extended_Pictographic# 3.2 [1] (〽️) part alternation mark
|
||||
3297 ; Extended_Pictographic# 1.1 [1] (㊗️) Japanese “congratulations” button
|
||||
3299 ; Extended_Pictographic# 1.1 [1] (㊙️) Japanese “secret” button
|
||||
1F000..1F02B ; Extended_Pictographic# 5.1 [44] (🀀️..🀫️) MAHJONG TILE EAST WIND..MAHJONG TILE BACK
|
||||
1F02C..1F02F ; Extended_Pictographic# NA [4] (️..️) <reserved-1F02C>..<reserved-1F02F>
|
||||
1F030..1F093 ; Extended_Pictographic# 5.1[100] (🀰️..🂓️) DOMINO TILE HORIZONTAL BACK..DOMINO TILE VERTICAL-06-06
|
||||
1F094..1F09F ; Extended_Pictographic# NA [12] (️..️) <reserved-1F094>..<reserved-1F09F>
|
||||
1F0A0..1F0AE ; Extended_Pictographic# 6.0 [15] (🂠️..🂮️) PLAYING CARD BACK..PLAYING CARD KING OF SPADES
|
||||
1F0AF..1F0B0 ; Extended_Pictographic# NA [2] (️..️) <reserved-1F0AF>..<reserved-1F0B0>
|
||||
1F0B1..1F0BE ; Extended_Pictographic# 6.0 [14] (🂱️..🂾️) PLAYING CARD ACE OF HEARTS..PLAYING CARD KING OF HEARTS
|
||||
1F0BF ; Extended_Pictographic# 7.0 [1] (🂿️) PLAYING CARD RED JOKER
|
||||
1F0C0 ; Extended_Pictographic# NA [1] (️) <reserved-1F0C0>
|
||||
1F0C1..1F0CF ; Extended_Pictographic# 6.0 [15] (🃁️..🃏) PLAYING CARD ACE OF DIAMONDS..joker
|
||||
1F0D0 ; Extended_Pictographic# NA [1] (️) <reserved-1F0D0>
|
||||
1F0D1..1F0DF ; Extended_Pictographic# 6.0 [15] (🃑️..🃟️) PLAYING CARD ACE OF CLUBS..PLAYING CARD WHITE JOKER
|
||||
1F0E0..1F0F5 ; Extended_Pictographic# 7.0 [22] (🃠️..🃵️) PLAYING CARD FOOL..PLAYING CARD TRUMP-21
|
||||
1F0F6..1F0FF ; Extended_Pictographic# NA [10] (️..️) <reserved-1F0F6>..<reserved-1F0FF>
|
||||
1F10D..1F10F ; Extended_Pictographic# NA [3] (🄍️..🄏️) <reserved-1F10D>..<reserved-1F10F>
|
||||
1F12F ; Extended_Pictographic# 11.0 [1] (🄯️) COPYLEFT SYMBOL
|
||||
1F16C..1F16F ; Extended_Pictographic# NA [4] (🅬️..🅯️) <reserved-1F16C>..<reserved-1F16F>
|
||||
1F170..1F171 ; Extended_Pictographic# 6.0 [2] (🅰️..🅱️) A button (blood type)..B button (blood type)
|
||||
1F17E ; Extended_Pictographic# 6.0 [1] (🅾️) O button (blood type)
|
||||
1F17F ; Extended_Pictographic# 5.2 [1] (🅿️) P button
|
||||
1F18E ; Extended_Pictographic# 6.0 [1] (🆎) AB button (blood type)
|
||||
1F191..1F19A ; Extended_Pictographic# 6.0 [10] (🆑..🆚) CL button..VS button
|
||||
1F1AD..1F1E5 ; Extended_Pictographic# NA [57] (🆭️..️) <reserved-1F1AD>..<reserved-1F1E5>
|
||||
1F201..1F202 ; Extended_Pictographic# 6.0 [2] (🈁..🈂️) Japanese “here” button..Japanese “service charge” button
|
||||
1F203..1F20F ; Extended_Pictographic# NA [13] (️..️) <reserved-1F203>..<reserved-1F20F>
|
||||
1F21A ; Extended_Pictographic# 5.2 [1] (🈚) Japanese “free of charge” button
|
||||
1F22F ; Extended_Pictographic# 5.2 [1] (🈯) Japanese “reserved” button
|
||||
1F232..1F23A ; Extended_Pictographic# 6.0 [9] (🈲..🈺) Japanese “prohibited” button..Japanese “open for business” button
|
||||
1F23C..1F23F ; Extended_Pictographic# NA [4] (️..️) <reserved-1F23C>..<reserved-1F23F>
|
||||
1F249..1F24F ; Extended_Pictographic# NA [7] (️..️) <reserved-1F249>..<reserved-1F24F>
|
||||
1F250..1F251 ; Extended_Pictographic# 6.0 [2] (🉐..🉑) Japanese “bargain” button..Japanese “acceptable” button
|
||||
1F252..1F25F ; Extended_Pictographic# NA [14] (️..️) <reserved-1F252>..<reserved-1F25F>
|
||||
1F260..1F265 ; Extended_Pictographic# 10.0 [6] (🉠️..🉥️) ROUNDED SYMBOL FOR FU..ROUNDED SYMBOL FOR CAI
|
||||
1F266..1F2FF ; Extended_Pictographic# NA[154] (️..️) <reserved-1F266>..<reserved-1F2FF>
|
||||
1F300..1F320 ; Extended_Pictographic# 6.0 [33] (🌀..🌠) cyclone..shooting star
|
||||
1F321..1F32C ; Extended_Pictographic# 7.0 [12] (🌡️..🌬️) thermometer..wind face
|
||||
1F32D..1F32F ; Extended_Pictographic# 8.0 [3] (🌭..🌯) hot dog..burrito
|
||||
1F330..1F335 ; Extended_Pictographic# 6.0 [6] (🌰..🌵) chestnut..cactus
|
||||
1F336 ; Extended_Pictographic# 7.0 [1] (🌶️) hot pepper
|
||||
1F337..1F37C ; Extended_Pictographic# 6.0 [70] (🌷..🍼) tulip..baby bottle
|
||||
1F37D ; Extended_Pictographic# 7.0 [1] (🍽️) fork and knife with plate
|
||||
1F37E..1F37F ; Extended_Pictographic# 8.0 [2] (🍾..🍿) bottle with popping cork..popcorn
|
||||
1F380..1F393 ; Extended_Pictographic# 6.0 [20] (🎀..🎓) ribbon..graduation cap
|
||||
1F394..1F39F ; Extended_Pictographic# 7.0 [12] (🎔️..🎟️) HEART WITH TIP ON THE LEFT..admission tickets
|
||||
1F3A0..1F3C4 ; Extended_Pictographic# 6.0 [37] (🎠..🏄) carousel horse..person surfing
|
||||
1F3C5 ; Extended_Pictographic# 7.0 [1] (🏅) sports medal
|
||||
1F3C6..1F3CA ; Extended_Pictographic# 6.0 [5] (🏆..🏊) trophy..person swimming
|
||||
1F3CB..1F3CE ; Extended_Pictographic# 7.0 [4] (🏋️..🏎️) person lifting weights..racing car
|
||||
1F3CF..1F3D3 ; Extended_Pictographic# 8.0 [5] (🏏..🏓) cricket game..ping pong
|
||||
1F3D4..1F3DF ; Extended_Pictographic# 7.0 [12] (🏔️..🏟️) snow-capped mountain..stadium
|
||||
1F3E0..1F3F0 ; Extended_Pictographic# 6.0 [17] (🏠..🏰) house..castle
|
||||
1F3F1..1F3F7 ; Extended_Pictographic# 7.0 [7] (🏱️..🏷️) WHITE PENNANT..label
|
||||
1F3F8..1F3FA ; Extended_Pictographic# 8.0 [3] (🏸..🏺) badminton..amphora
|
||||
1F400..1F43E ; Extended_Pictographic# 6.0 [63] (🐀..🐾) rat..paw prints
|
||||
1F43F ; Extended_Pictographic# 7.0 [1] (🐿️) chipmunk
|
||||
1F440 ; Extended_Pictographic# 6.0 [1] (👀) eyes
|
||||
1F441 ; Extended_Pictographic# 7.0 [1] (👁️) eye
|
||||
1F442..1F4F7 ; Extended_Pictographic# 6.0[182] (👂..📷) ear..camera
|
||||
1F4F8 ; Extended_Pictographic# 7.0 [1] (📸) camera with flash
|
||||
1F4F9..1F4FC ; Extended_Pictographic# 6.0 [4] (📹..📼) video camera..videocassette
|
||||
1F4FD..1F4FE ; Extended_Pictographic# 7.0 [2] (📽️..📾️) film projector..PORTABLE STEREO
|
||||
1F4FF ; Extended_Pictographic# 8.0 [1] (📿) prayer beads
|
||||
1F500..1F53D ; Extended_Pictographic# 6.0 [62] (🔀..🔽) shuffle tracks button..downwards button
|
||||
1F546..1F54A ; Extended_Pictographic# 7.0 [5] (🕆️..🕊️) WHITE LATIN CROSS..dove
|
||||
1F54B..1F54F ; Extended_Pictographic# 8.0 [5] (🕋..🕏️) kaaba..BOWL OF HYGIEIA
|
||||
1F550..1F567 ; Extended_Pictographic# 6.0 [24] (🕐..🕧) one o’clock..twelve-thirty
|
||||
1F568..1F579 ; Extended_Pictographic# 7.0 [18] (🕨️..🕹️) RIGHT SPEAKER..joystick
|
||||
1F57A ; Extended_Pictographic# 9.0 [1] (🕺) man dancing
|
||||
1F57B..1F5A3 ; Extended_Pictographic# 7.0 [41] (🕻️..🖣️) LEFT HAND TELEPHONE RECEIVER..BLACK DOWN POINTING BACKHAND INDEX
|
||||
1F5A4 ; Extended_Pictographic# 9.0 [1] (🖤) black heart
|
||||
1F5A5..1F5FA ; Extended_Pictographic# 7.0 [86] (🖥️..🗺️) desktop computer..world map
|
||||
1F5FB..1F5FF ; Extended_Pictographic# 6.0 [5] (🗻..🗿) mount fuji..moai
|
||||
1F600 ; Extended_Pictographic# 6.1 [1] (😀) grinning face
|
||||
1F601..1F610 ; Extended_Pictographic# 6.0 [16] (😁..😐) beaming face with smiling eyes..neutral face
|
||||
1F611 ; Extended_Pictographic# 6.1 [1] (😑) expressionless face
|
||||
1F612..1F614 ; Extended_Pictographic# 6.0 [3] (😒..😔) unamused face..pensive face
|
||||
1F615 ; Extended_Pictographic# 6.1 [1] (😕) confused face
|
||||
1F616 ; Extended_Pictographic# 6.0 [1] (😖) confounded face
|
||||
1F617 ; Extended_Pictographic# 6.1 [1] (😗) kissing face
|
||||
1F618 ; Extended_Pictographic# 6.0 [1] (😘) face blowing a kiss
|
||||
1F619 ; Extended_Pictographic# 6.1 [1] (😙) kissing face with smiling eyes
|
||||
1F61A ; Extended_Pictographic# 6.0 [1] (😚) kissing face with closed eyes
|
||||
1F61B ; Extended_Pictographic# 6.1 [1] (😛) face with tongue
|
||||
1F61C..1F61E ; Extended_Pictographic# 6.0 [3] (😜..😞) winking face with tongue..disappointed face
|
||||
1F61F ; Extended_Pictographic# 6.1 [1] (😟) worried face
|
||||
1F620..1F625 ; Extended_Pictographic# 6.0 [6] (😠..😥) angry face..sad but relieved face
|
||||
1F626..1F627 ; Extended_Pictographic# 6.1 [2] (😦..😧) frowning face with open mouth..anguished face
|
||||
1F628..1F62B ; Extended_Pictographic# 6.0 [4] (😨..😫) fearful face..tired face
|
||||
1F62C ; Extended_Pictographic# 6.1 [1] (😬) grimacing face
|
||||
1F62D ; Extended_Pictographic# 6.0 [1] (😭) loudly crying face
|
||||
1F62E..1F62F ; Extended_Pictographic# 6.1 [2] (😮..😯) face with open mouth..hushed face
|
||||
1F630..1F633 ; Extended_Pictographic# 6.0 [4] (😰..😳) anxious face with sweat..flushed face
|
||||
1F634 ; Extended_Pictographic# 6.1 [1] (😴) sleeping face
|
||||
1F635..1F640 ; Extended_Pictographic# 6.0 [12] (😵..🙀) dizzy face..weary cat face
|
||||
1F641..1F642 ; Extended_Pictographic# 7.0 [2] (🙁..🙂) slightly frowning face..slightly smiling face
|
||||
1F643..1F644 ; Extended_Pictographic# 8.0 [2] (🙃..🙄) upside-down face..face with rolling eyes
|
||||
1F645..1F64F ; Extended_Pictographic# 6.0 [11] (🙅..🙏) person gesturing NO..folded hands
|
||||
1F680..1F6C5 ; Extended_Pictographic# 6.0 [70] (🚀..🛅) rocket..left luggage
|
||||
1F6C6..1F6CF ; Extended_Pictographic# 7.0 [10] (🛆️..🛏️) TRIANGLE WITH ROUNDED CORNERS..bed
|
||||
1F6D0 ; Extended_Pictographic# 8.0 [1] (🛐) place of worship
|
||||
1F6D1..1F6D2 ; Extended_Pictographic# 9.0 [2] (🛑..🛒) stop sign..shopping cart
|
||||
1F6D3..1F6D4 ; Extended_Pictographic# 10.0 [2] (🛓️..🛔️) STUPA..PAGODA
|
||||
1F6D5..1F6DF ; Extended_Pictographic# NA [11] (🛕️..🛟️) <reserved-1F6D5>..<reserved-1F6DF>
|
||||
1F6E0..1F6EC ; Extended_Pictographic# 7.0 [13] (🛠️..🛬) hammer and wrench..airplane arrival
|
||||
1F6ED..1F6EF ; Extended_Pictographic# NA [3] (️..️) <reserved-1F6ED>..<reserved-1F6EF>
|
||||
1F6F0..1F6F3 ; Extended_Pictographic# 7.0 [4] (🛰️..🛳️) satellite..passenger ship
|
||||
1F6F4..1F6F6 ; Extended_Pictographic# 9.0 [3] (🛴..🛶) kick scooter..canoe
|
||||
1F6F7..1F6F8 ; Extended_Pictographic# 10.0 [2] (🛷..🛸) sled..flying saucer
|
||||
1F6F9 ; Extended_Pictographic# 11.0 [1] (🛹) skateboard
|
||||
1F6FA..1F6FF ; Extended_Pictographic# NA [6] (🛺️..️) <reserved-1F6FA>..<reserved-1F6FF>
|
||||
1F774..1F77F ; Extended_Pictographic# NA [12] (🝴️..🝿️) <reserved-1F774>..<reserved-1F77F>
|
||||
1F7D5..1F7D8 ; Extended_Pictographic# 11.0 [4] (🟕️..🟘️) CIRCLED TRIANGLE..NEGATIVE CIRCLED SQUARE
|
||||
1F7D9..1F7FF ; Extended_Pictographic# NA [39] (🟙️..️) <reserved-1F7D9>..<reserved-1F7FF>
|
||||
1F80C..1F80F ; Extended_Pictographic# NA [4] (️..️) <reserved-1F80C>..<reserved-1F80F>
|
||||
1F848..1F84F ; Extended_Pictographic# NA [8] (️..️) <reserved-1F848>..<reserved-1F84F>
|
||||
1F85A..1F85F ; Extended_Pictographic# NA [6] (️..️) <reserved-1F85A>..<reserved-1F85F>
|
||||
1F888..1F88F ; Extended_Pictographic# NA [8] (️..️) <reserved-1F888>..<reserved-1F88F>
|
||||
1F8AE..1F8FF ; Extended_Pictographic# NA [82] (️..️) <reserved-1F8AE>..<reserved-1F8FF>
|
||||
1F90C..1F90F ; Extended_Pictographic# NA [4] (🤌️..🤏️) <reserved-1F90C>..<reserved-1F90F>
|
||||
1F910..1F918 ; Extended_Pictographic# 8.0 [9] (🤐..🤘) zipper-mouth face..sign of the horns
|
||||
1F919..1F91E ; Extended_Pictographic# 9.0 [6] (🤙..🤞) call me hand..crossed fingers
|
||||
1F91F ; Extended_Pictographic# 10.0 [1] (🤟) love-you gesture
|
||||
1F920..1F927 ; Extended_Pictographic# 9.0 [8] (🤠..🤧) cowboy hat face..sneezing face
|
||||
1F928..1F92F ; Extended_Pictographic# 10.0 [8] (🤨..🤯) face with raised eyebrow..exploding head
|
||||
1F930 ; Extended_Pictographic# 9.0 [1] (🤰) pregnant woman
|
||||
1F931..1F932 ; Extended_Pictographic# 10.0 [2] (🤱..🤲) breast-feeding..palms up together
|
||||
1F933..1F93A ; Extended_Pictographic# 9.0 [8] (🤳..🤺) selfie..person fencing
|
||||
1F93C..1F93E ; Extended_Pictographic# 9.0 [3] (🤼..🤾) people wrestling..person playing handball
|
||||
1F93F ; Extended_Pictographic# NA [1] (🤿️) <reserved-1F93F>
|
||||
1F940..1F945 ; Extended_Pictographic# 9.0 [6] (🥀..🥅) wilted flower..goal net
|
||||
1F947..1F94B ; Extended_Pictographic# 9.0 [5] (🥇..🥋) 1st place medal..martial arts uniform
|
||||
1F94C ; Extended_Pictographic# 10.0 [1] (🥌) curling stone
|
||||
1F94D..1F94F ; Extended_Pictographic# 11.0 [3] (🥍..🥏) lacrosse..flying disc
|
||||
1F950..1F95E ; Extended_Pictographic# 9.0 [15] (🥐..🥞) croissant..pancakes
|
||||
1F95F..1F96B ; Extended_Pictographic# 10.0 [13] (🥟..🥫) dumpling..canned food
|
||||
1F96C..1F970 ; Extended_Pictographic# 11.0 [5] (🥬..🥰) leafy green..smiling face with 3 hearts
|
||||
1F971..1F972 ; Extended_Pictographic# NA [2] (🥱️..🥲️) <reserved-1F971>..<reserved-1F972>
|
||||
1F973..1F976 ; Extended_Pictographic# 11.0 [4] (🥳..🥶) partying face..cold face
|
||||
1F977..1F979 ; Extended_Pictographic# NA [3] (🥷️..🥹️) <reserved-1F977>..<reserved-1F979>
|
||||
1F97A ; Extended_Pictographic# 11.0 [1] (🥺) pleading face
|
||||
1F97B ; Extended_Pictographic# NA [1] (🥻️) <reserved-1F97B>
|
||||
1F97C..1F97F ; Extended_Pictographic# 11.0 [4] (🥼..🥿) lab coat..woman’s flat shoe
|
||||
1F980..1F984 ; Extended_Pictographic# 8.0 [5] (🦀..🦄) crab..unicorn face
|
||||
1F985..1F991 ; Extended_Pictographic# 9.0 [13] (🦅..🦑) eagle..squid
|
||||
1F992..1F997 ; Extended_Pictographic# 10.0 [6] (🦒..🦗) giraffe..cricket
|
||||
1F998..1F9A2 ; Extended_Pictographic# 11.0 [11] (🦘..🦢) kangaroo..swan
|
||||
1F9A3..1F9AF ; Extended_Pictographic# NA [13] (🦣️..🦯️) <reserved-1F9A3>..<reserved-1F9AF>
|
||||
1F9B0..1F9B9 ; Extended_Pictographic# 11.0 [10] (🦰..🦹) red-haired..supervillain
|
||||
1F9BA..1F9BF ; Extended_Pictographic# NA [6] (🦺️..🦿️) <reserved-1F9BA>..<reserved-1F9BF>
|
||||
1F9C0 ; Extended_Pictographic# 8.0 [1] (🧀) cheese wedge
|
||||
1F9C1..1F9C2 ; Extended_Pictographic# 11.0 [2] (🧁..🧂) cupcake..salt
|
||||
1F9C3..1F9CF ; Extended_Pictographic# NA [13] (🧃️..🧏️) <reserved-1F9C3>..<reserved-1F9CF>
|
||||
1F9D0..1F9E6 ; Extended_Pictographic# 10.0 [23] (🧐..🧦) face with monocle..socks
|
||||
1F9E7..1F9FF ; Extended_Pictographic# 11.0 [25] (🧧..🧿) red envelope..nazar amulet
|
||||
1FA00..1FA5F ; Extended_Pictographic# NA [96] (🨀️..️) <reserved-1FA00>..<reserved-1FA5F>
|
||||
1FA60..1FA6D ; Extended_Pictographic# 11.0 [14] (🩠️..🩭️) XIANGQI RED GENERAL..XIANGQI BLACK SOLDIER
|
||||
1FA6E..1FFFD ; Extended_Pictographic# NA[1424] (️..️) <reserved-1FA6E>..<reserved-1FFFD>
|
||||
|
||||
# Total elements: 3793
|
||||
|
||||
#EOF
|
|
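Reviewer note: the data lines above all share one shape, a single code point or a `lo..hi` range, then `;`, the property name, and a `#` comment holding the Unicode version, the count in brackets, and the character names. The following is a minimal sketch of reading one such line, assuming that shape holds for every entry. The helper names `parse_extpict_line` and `range_count` are hypothetical and are not part of PCRE2 or of its table-generation script.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse one line of the form
     "XXXX ; Extended_Pictographic# ..."  or
     "XXXX..YYYY ; Extended_Pictographic# ..."
   Returns 1 and fills *lo and *hi when the line assigns the property;
   returns 0 for comments, blank lines, "#EOF", or other properties. */
static int parse_extpict_line(const char *line, unsigned int *lo,
                              unsigned int *hi)
{
  unsigned int a, b;
  int n = sscanf(line, "%x..%x", &a, &b);

  if (n < 1) return 0;            /* no leading hexadecimal code point */
  if (n == 1) b = a;              /* single code point rather than a range */
  if (strstr(line, "Extended_Pictographic") == NULL) return 0;

  *lo = a;
  *hi = b;
  return 1;
}

/* Number of code points in an inclusive range; this should agree with the
   bracketed count in the comment field of the same line. */
static unsigned int range_count(unsigned int lo, unsigned int hi)
{
  return hi - lo + 1;
}
```

For the MAHJONG TILE line, the range `1F000..1F02B` gives a count of 44, which matches the `[44]` in its comment field.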
@ -2,7 +2,7 @@
|
|||
* A program for testing the Unicode property table *
|
||||
***************************************************/
|
||||
|
||||
/* Copyright (c) University of Cambridge 2008 - 2014 */
|
||||
/* Copyright (c) University of Cambridge 2008 - 2018 */
|
||||
|
||||
/* Compile thus:
|
||||
gcc -DHAVE_CONFIG_H -DPCRE2_CODE_UNIT_WIDTH=8 -o ucptest \
|
||||
|
@ -123,7 +123,13 @@ switch(gbprop)
|
|||
case ucp_gbT: graphbreak = US"Hangul syllable type T"; break;
|
||||
case ucp_gbLV: graphbreak = US"Hangul syllable type LV"; break;
|
||||
case ucp_gbLVT: graphbreak = US"Hangul syllable type LVT"; break;
|
||||
case ucp_gbRegionalIndicator:
|
||||
graphbreak = US"Regional Indicator"; break;
|
||||
case ucp_gbOther: graphbreak = US"Other"; break;
|
||||
case ucp_gbZWJ: graphbreak = US"Zero Width Joiner"; break;
|
||||
case ucp_gbExtended_Pictographic:
|
||||
graphbreak = US"Extended Pictographic"; break;
|
||||
default: graphbreak = US"Unknown"; break;
|
||||
}
|
||||
|
||||
switch(script)
|
||||
|
@ -268,6 +274,27 @@ switch(script)
|
|||
case ucp_Multani: scriptname = US"Multani"; break;
|
||||
case ucp_Old_Hungarian: scriptname = US"Old_Hungarian"; break;
|
||||
case ucp_SignWriting: scriptname = US"SignWriting"; break;
|
||||
|
||||
/* New for Unicode 10.0.0 (no update since 8.0.0) */
|
||||
case ucp_Adlam: scriptname = US"Adlam"; break;
|
||||
case ucp_Bhaiksuki: scriptname = US"Bhaiksuki"; break;
|
||||
case ucp_Marchen: scriptname = US"Marchen"; break;
|
||||
case ucp_Newa: scriptname = US"Newa"; break;
|
||||
case ucp_Osage: scriptname = US"Osage"; break;
|
||||
case ucp_Tangut: scriptname = US"Tangut"; break;
|
||||
case ucp_Masaram_Gondi: scriptname = US"Masaram_Gondi"; break;
|
||||
case ucp_Nushu: scriptname = US"Nushu"; break;
|
||||
case ucp_Soyombo: scriptname = US"Soyombo"; break;
|
||||
case ucp_Zanabazar_Square: scriptname = US"Zanabazar_Square"; break;
|
||||
|
||||
/* New for Unicode 11.0.0 */
|
||||
case ucp_Dogra: scriptname = US"Dogra"; break;
|
||||
case ucp_Gunjala_Gondi: scriptname = US"Gunjala_Gondi"; break;
|
||||
case ucp_Hanifi_Rohingya: scriptname = US"Hanifi_Rohingya"; break;
|
||||
case ucp_Makasar: scriptname = US"Makasar"; break;
|
||||
case ucp_Medefaidrin: scriptname = US"Medefaidrin"; break;
|
||||
case ucp_Old_Sogdian: scriptname = US"Old_Sogdian"; break;
|
||||
case ucp_Sogdian: scriptname = US"Sogdian"; break;
|
||||
}
|
||||
|
||||
printf("%04x %s: %s, %s, %s", c, typename, fulltypename, scriptname, graphbreak);
|
||||
|
|
|
@ -36,3 +36,5 @@ findprop 0d 0a 0e 0711 1b04 1111 1169 11fe ae4c ad89
|
|||
findprop 118a0 11ac7 16ad0
|
||||
|
||||
findprop 11700 14400 108e0 11280 1d800
|
||||
|
||||
findprop 11800 1e903 11da9 10d27 11ee0 16e48 10f27 10f30
|
||||
|
|
|
@ -179,12 +179,12 @@ findprop a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab ac ad ae af
|
|||
00a6 Symbol: Other symbol, Common, Other
|
||||
00a7 Punctuation: Other punctuation, Common, Other
|
||||
00a8 Symbol: Modifier symbol, Common, Other
|
||||
00a9 Symbol: Other symbol, Common, Other
|
||||
00a9 Symbol: Other symbol, Common, Extended Pictographic
|
||||
00aa Letter: Other letter, Latin, Other
|
||||
00ab Punctuation: Initial punctuation, Common, Other
|
||||
00ac Symbol: Mathematical symbol, Common, Other
|
||||
00ad Control: Format, Common, Control
|
||||
00ae Symbol: Other symbol, Common, Other
|
||||
00ae Symbol: Other symbol, Common, Extended Pictographic
|
||||
00af Symbol: Modifier symbol, Common, Other
|
||||
findprop b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf
|
||||
00b0 Symbol: Other symbol, Common, Other
|
||||
|
@ -369,3 +369,13 @@ findprop 11700 14400 108e0 11280 1d800
|
|||
108e0 Letter: Other letter, Hatran, Other
|
||||
11280 Letter: Other letter, Multani, Other
|
||||
1d800 Symbol: Other symbol, SignWriting, Other
|
||||
|
||||
findprop 11800 1e903 11da9 10d27 11ee0 16e48 10f27 10f30
|
||||
11800 Letter: Other letter, Dogra, Other
|
||||
1e903 Letter: Upper case letter, Adlam, Other, 1e925
|
||||
11da9 Number: Decimal number, Gunjala_Gondi, Other
|
||||
10d27 Mark: Non-spacing mark, Hanifi_Rohingya, Extend
|
||||
11ee0 Letter: Other letter, Makasar, Other
|
||||
16e48 Letter: Upper case letter, Medefaidrin, Other, 16e68
|
||||
10f27 Letter: Other letter, Old_Sogdian, Other
|
||||
10f30 Letter: Other letter, Sogdian, Other
|
||||
|
|
|
@ -129,11 +129,11 @@ while (eptr < end_subject)
|
|||
if ((ricount & 1) != 0) break; /* Grapheme break required */
|
||||
}
|
||||
|
||||
/* If Extend follows E_Base[_GAZ] do not update lgb; this allows
|
||||
any number of Extend before a following E_Modifier. */
|
||||
/* If Extend or ZWJ follows Extended_Pictographic, do not update lgb; this
|
||||
allows any number of them before a following Extended_Pictographic. */
|
||||
|
||||
if (rgb != ucp_gbExtend ||
|
||||
(lgb != ucp_gbE_Base && lgb != ucp_gbE_Base_GAZ))
|
||||
if ((rgb != ucp_gbExtend && rgb != ucp_gbZWJ) ||
|
||||
lgb != ucp_gbExtended_Pictographic)
|
||||
lgb = rgb;
|
||||
|
||||
eptr += len;
|
||||
|
|
|
@ -1901,7 +1901,7 @@ extern const ucd_record PRIV(ucd_records)[];
|
|||
#if PCRE2_CODE_UNIT_WIDTH == 32
|
||||
extern const ucd_record PRIV(dummy_ucd_record)[];
|
||||
#endif
|
||||
extern const uint8_t PRIV(ucd_stage1)[];
|
||||
extern const uint16_t PRIV(ucd_stage1)[];
|
||||
extern const uint16_t PRIV(ucd_stage2)[];
|
||||
extern const uint32_t PRIV(ucp_gbtable)[];
|
||||
extern const uint32_t PRIV(ucp_gentype)[];
|
||||
|
|
|
@ -3666,7 +3666,8 @@ if (!common->utf)
|
|||
#endif
|
||||
|
||||
OP2(SLJIT_LSHR, TMP2, 0, TMP1, 0, SLJIT_IMM, UCD_BLOCK_SHIFT);
|
||||
OP1(SLJIT_MOV_U8, TMP2, 0, SLJIT_MEM1(TMP2), (sljit_sw)PRIV(ucd_stage1));
|
||||
OP2(SLJIT_ADD, TMP2, 0, TMP2, 0, TMP2, 0);
|
||||
OP1(SLJIT_MOV_U16, TMP2, 0, SLJIT_MEM1(TMP2), (sljit_sw)PRIV(ucd_stage1));
|
||||
OP2(SLJIT_AND, TMP1, 0, TMP1, 0, SLJIT_IMM, UCD_BLOCK_MASK);
|
||||
OP2(SLJIT_SHL, TMP2, 0, TMP2, 0, SLJIT_IMM, UCD_BLOCK_SHIFT);
|
||||
OP2(SLJIT_ADD, TMP1, 0, TMP1, 0, TMP2, 0);
|
||||
|
@ -6627,7 +6628,8 @@ if (needstype || needsscript)
|
|||
#endif
|
||||
|
||||
OP2(SLJIT_LSHR, TMP2, 0, TMP1, 0, SLJIT_IMM, UCD_BLOCK_SHIFT);
|
||||
OP1(SLJIT_MOV_U8, TMP2, 0, SLJIT_MEM1(TMP2), (sljit_sw)PRIV(ucd_stage1));
|
||||
OP2(SLJIT_ADD, TMP2, 0, TMP2, 0, TMP2, 0);
|
||||
OP1(SLJIT_MOV_U16, TMP2, 0, SLJIT_MEM1(TMP2), (sljit_sw)PRIV(ucd_stage1));
|
||||
OP2(SLJIT_AND, TMP1, 0, TMP1, 0, SLJIT_IMM, UCD_BLOCK_MASK);
|
||||
OP2(SLJIT_SHL, TMP2, 0, TMP2, 0, SLJIT_IMM, UCD_BLOCK_SHIFT);
|
||||
OP2(SLJIT_ADD, TMP1, 0, TMP1, 0, TMP2, 0);
|
||||
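Reviewer note: the two JIT hunks above change the stage-1 load from an 8-bit to a 16-bit read and add one `SLJIT_ADD` that doubles the index, which looks like the usual conversion from element index to byte offset for 2-byte entries. Presumably the widening is needed because the number of distinct stage-2 blocks no longer fits in a byte, but the diff does not say so. The sketch below mirrors the instruction sequence in plain C under toy assumptions: `TOY_BLOCK_SHIFT`, `toy_stage1`, and `toy_stage2_index` are made-up names and values, not PCRE2's real constants or tables.

```c
#include <stdint.h>
#include <string.h>

/* Toy parameters chosen for illustration; not PCRE2's real values. */
#define TOY_BLOCK_SHIFT 2u
#define TOY_BLOCK_MASK  ((1u << TOY_BLOCK_SHIFT) - 1u)

/* Hypothetical stage-1 table: one block number per group of 4 code points.
   The entry 300 would not fit in a uint8_t table. */
static const uint16_t toy_stage1[] = { 0, 1, 0, 300 };

/* Compute the stage-2 index for code point c, step by step in the same
   order as the SLJIT operations in the hunks above. */
static uint32_t toy_stage2_index(const uint16_t *stage1, uint32_t c)
{
  uint32_t tmp1 = c;
  uint32_t tmp2 = tmp1 >> TOY_BLOCK_SHIFT;   /* LSHR: stage-1 element index */
  uint16_t block;

  tmp2 = tmp2 + tmp2;                        /* ADD: index -> byte offset */
  memcpy(&block, (const unsigned char *)stage1 + tmp2,
         sizeof block);                      /* MOV_U16: 16-bit load */
  tmp2 = block;

  tmp1 &= TOY_BLOCK_MASK;                    /* AND: offset inside block */
  tmp2 <<= TOY_BLOCK_SHIFT;                  /* SHL: block start in stage 2 */
  return tmp1 + tmp2;                        /* ADD: final stage-2 index */
}
```

With the toy table, code point 13 falls in stage-1 slot 3, whose block number 300 gives a stage-2 index of 300 * 4 + 1.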
|
@ -7254,10 +7256,11 @@ while (cc < end_subject)
|
|||
if ((ricount & 1) != 0) break; /* Grapheme break required */
|
||||
}
|
||||
|
||||
/* If Extend follows E_Base[_GAZ] do not update lgb; this allows
|
||||
any number of Extend before a following E_Modifier. */
|
||||
/* If Extend or ZWJ follows Extended_Pictographic, do not update lgb; this
|
||||
allows any number of them before a following Extended_Pictographic. */
|
||||
|
||||
if (rgb != ucp_gbExtend || (lgb != ucp_gbE_Base && lgb != ucp_gbE_Base_GAZ))
|
||||
if ((rgb != ucp_gbExtend && rgb != ucp_gbZWJ) ||
|
||||
lgb != ucp_gbExtended_Pictographic)
|
||||
lgb = rgb;
|
||||
|
||||
prevcc = cc;
|
||||
|
@ -7309,10 +7312,11 @@ while (cc < end_subject)
|
|||
if ((ricount & 1) != 0) break; /* Grapheme break required */
|
||||
}
|
||||
|
||||
/* If Extend follows E_Base[_GAZ] do not update lgb; this allows
|
||||
any number of Extend before a following E_Modifier. */
|
||||
/* If Extend or ZWJ follows Extended_Pictographic, do not update lgb; this
|
||||
allows any number of them before a following Extended_Pictographic. */
|
||||
|
||||
if (rgb != ucp_gbExtend || (lgb != ucp_gbE_Base && lgb != ucp_gbE_Base_GAZ))
|
||||
if ((rgb != ucp_gbExtend && rgb != ucp_gbZWJ) ||
|
||||
lgb != ucp_gbExtended_Pictographic)
|
||||
lgb = rgb;
|
||||
|
||||
cc++;
|
||||
|
|
|
@ -7,7 +7,7 @@ and semantics are as close as possible to those of the Perl 5 language.
|
|||
|
||||
Written by Philip Hazel
|
||||
Original API code Copyright (c) 1997-2012 University of Cambridge
|
||||
New API code Copyright (c) 2016-2017 University of Cambridge
|
||||
New API code Copyright (c) 2016-2018 University of Cambridge
|
||||
|
||||
-----------------------------------------------------------------------------
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
|
@ -137,9 +137,10 @@ const uint32_t PRIV(ucp_gentype)[] = {
|
|||
|
||||
/* This table encodes the rules for finding the end of an extended grapheme
|
||||
cluster. Every code point has a grapheme break property which is one of the
|
||||
ucp_gbXX values defined in pcre2_ucp.h. The 2-dimensional table is indexed by
|
||||
the properties of two adjacent code points. The left property selects a word
|
||||
from the table, and the right property selects a bit from that word like this:
|
||||
ucp_gbXX values defined in pcre2_ucp.h. These changed between Unicode versions
|
||||
10 and 11. The 2-dimensional table is indexed by the properties of two adjacent
|
||||
code points. The left property selects a word from the table, and the right
|
||||
property selects a bit from that word like this:
|
||||
|
||||
PRIV(ucp_gbtable)[left-property] & (1 << right-property)
|
||||
|
||||
|
@ -166,49 +167,41 @@ are implementing).
|
|||
|
||||
6. Do not break after Prepend characters.
|
||||
|
||||
7. Do not break within emoji modifier sequences (E_Base or E_Base_GAZ followed
|
||||
by E_Modifier). Extend characters are allowed before the modifier; this
|
||||
cannot be represented in this table, the code has to deal with it.
|
||||
7. Do not break within emoji modifier sequences or emoji zwj sequences. That
|
||||
is, do not break between characters with the Extended_Pictographic property.
|
||||
Extend and ZWJ characters are allowed between the characters; this cannot be
|
||||
represented in this table, the code has to deal with it.
|
||||
|
||||
8. Do not break within emoji zwj sequences (ZWJ followed by Glue_After_Zwj or
|
||||
E_Base_GAZ).
|
||||
|
||||
9. Do not break within emoji flag sequences. That is, do not break between
|
||||
8. Do not break within emoji flag sequences. That is, do not break between
|
||||
regional indicator (RI) symbols if there are an odd number of RI characters
|
||||
before the break point. This table encodes "join RI characters"; the code
|
||||
has to deal with checking for previous adjoining RIs.
|
||||
|
||||
10. Otherwise, break everywhere.
|
||||
9. Otherwise, break everywhere.
|
||||
*/
|
||||
|
||||
#define ESZ (1<<ucp_gbExtend)|(1<<ucp_gbSpacingMark)|(1<<ucp_gbZWJ)
|
||||
|
||||
const uint32_t PRIV(ucp_gbtable)[] = {
|
||||
(1<<ucp_gbLF), /* 0 CR */
|
||||
0, /* 1 LF */
|
||||
0, /* 2 Control */
|
||||
ESZ, /* 3 Extend */
|
||||
ESZ|(1<<ucp_gbPrepend)| /* 4 Prepend */
|
||||
(1<<ucp_gbLF), /* 0 CR */
|
||||
0, /* 1 LF */
|
||||
0, /* 2 Control */
|
||||
ESZ, /* 3 Extend */
|
||||
ESZ|(1<<ucp_gbPrepend)| /* 4 Prepend */
|
||||
(1<<ucp_gbL)|(1<<ucp_gbV)|(1<<ucp_gbT)|
|
||||
(1<<ucp_gbLV)|(1<<ucp_gbLVT)|(1<<ucp_gbOther)|
|
||||
(1<<ucp_gbRegionalIndicator)|
|
||||
(1<<ucp_gbE_Base)|(1<<ucp_gbE_Modifier)|
|
||||
(1<<ucp_gbE_Base_GAZ)|
|
||||
(1<<ucp_gbZWJ)|(1<<ucp_gbGlue_After_Zwj),
|
||||
ESZ, /* 5 SpacingMark */
|
||||
ESZ|(1<<ucp_gbL)|(1<<ucp_gbV)|(1<<ucp_gbLV)| /* 6 L */
|
||||
(1<<ucp_gbRegionalIndicator),
|
||||
ESZ, /* 5 SpacingMark */
|
||||
ESZ|(1<<ucp_gbL)|(1<<ucp_gbV)|(1<<ucp_gbLV)| /* 6 L */
|
||||
(1<<ucp_gbLVT),
|
||||
ESZ|(1<<ucp_gbV)|(1<<ucp_gbT), /* 7 V */
|
||||
ESZ|(1<<ucp_gbT), /* 8 T */
|
||||
ESZ|(1<<ucp_gbV)|(1<<ucp_gbT), /* 9 LV */
|
||||
ESZ|(1<<ucp_gbT), /* 10 LVT */
|
||||
(1<<ucp_gbRegionalIndicator), /* 11 RegionalIndicator */
|
||||
ESZ, /* 12 Other */
|
||||
ESZ|(1<<ucp_gbE_Modifier), /* 13 E_Base */
|
||||
ESZ, /* 14 E_Modifier */
|
||||
ESZ|(1<<ucp_gbE_Modifier), /* 15 E_Base_GAZ */
|
||||
ESZ|(1<<ucp_gbGlue_After_Zwj)|(1<<ucp_gbE_Base_GAZ), /* 16 ZWJ */
|
||||
ESZ /* 12 Glue_After_Zwj */
|
||||
ESZ|(1<<ucp_gbV)|(1<<ucp_gbT), /* 7 V */
|
||||
ESZ|(1<<ucp_gbT), /* 8 T */
|
||||
ESZ|(1<<ucp_gbV)|(1<<ucp_gbT), /* 9 LV */
|
||||
ESZ|(1<<ucp_gbT), /* 10 LVT */
|
||||
(1<<ucp_gbRegionalIndicator), /* 11 RegionalIndicator */
|
||||
ESZ, /* 12 Other */
|
||||
ESZ, /* 13 ZWJ */
|
||||
ESZ|(1<<ucp_gbExtended_Pictographic) /* 14 Extended Pictographic */
|
||||
};
|
||||
|
||||
#undef ESZ
|
||||
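Reviewer note: the table comment above and the `lgb` update in the matcher hunks work together, so it may help to see them in one place. The sketch below transcribes the new table rows from this hunk into stand-in names (`GB_*`, `toy_gbtable`, and `cluster_length` are hypothetical, as is the idea of passing an array of properties instead of code points) and applies the same three checks as the loop in the diff: table lookup, the odd-RI-count test, and the rule that Extend or ZWJ after Extended_Pictographic leaves `lgb` unchanged. It is an illustration of the logic shown in this commit, not a claim of conformance to the Unicode segmentation rules.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ucp_gbXX values; the numbering follows
   the row comments in the new table in this hunk. */
enum {
  GB_CR, GB_LF, GB_Control, GB_Extend, GB_Prepend, GB_SpacingMark,
  GB_L, GB_V, GB_T, GB_LV, GB_LVT, GB_RegionalIndicator, GB_Other,
  GB_ZWJ, GB_Extended_Pictographic
};

#define BIT(p)  (UINT32_C(1) << (p))
#define TOY_ESZ (BIT(GB_Extend) | BIT(GB_SpacingMark) | BIT(GB_ZWJ))

/* Left property selects a word, right property selects a bit. */
static const uint32_t toy_gbtable[] = {
  BIT(GB_LF),                                          /*  0 CR */
  0,                                                   /*  1 LF */
  0,                                                   /*  2 Control */
  TOY_ESZ,                                             /*  3 Extend */
  TOY_ESZ | BIT(GB_Prepend) | BIT(GB_L) | BIT(GB_V) |  /*  4 Prepend */
    BIT(GB_T) | BIT(GB_LV) | BIT(GB_LVT) | BIT(GB_Other) |
    BIT(GB_RegionalIndicator),
  TOY_ESZ,                                             /*  5 SpacingMark */
  TOY_ESZ | BIT(GB_L) | BIT(GB_V) | BIT(GB_LV) | BIT(GB_LVT), /* 6 L */
  TOY_ESZ | BIT(GB_V) | BIT(GB_T),                     /*  7 V */
  TOY_ESZ | BIT(GB_T),                                 /*  8 T */
  TOY_ESZ | BIT(GB_V) | BIT(GB_T),                     /*  9 LV */
  TOY_ESZ | BIT(GB_T),                                 /* 10 LVT */
  BIT(GB_RegionalIndicator),                           /* 11 RegionalIndicator */
  TOY_ESZ,                                             /* 12 Other */
  TOY_ESZ,                                             /* 13 ZWJ */
  TOY_ESZ | BIT(GB_Extended_Pictographic)              /* 14 Extended Pictographic */
};

/* Return how many leading entries of props[0..n) form one cluster. */
static size_t cluster_length(const int *props, size_t n)
{
  size_t i = 1;
  int lgb;

  if (n == 0) return 0;
  lgb = props[0];

  while (i < n) {
    int rgb = props[i];
    if ((toy_gbtable[lgb] & BIT(rgb)) == 0) break;     /* table says break */

    /* Joining two RIs is allowed only after an even number of earlier RIs. */
    if (lgb == GB_RegionalIndicator && rgb == GB_RegionalIndicator) {
      size_t ricount = 0;
      size_t b = i - 1;                                /* left-hand RI */
      while (b > 0 && props[b - 1] == GB_RegionalIndicator) { b--; ricount++; }
      if ((ricount & 1) != 0) break;
    }

    /* Extend or ZWJ after Extended_Pictographic leaves lgb unchanged. */
    if ((rgb != GB_Extend && rgb != GB_ZWJ) ||
        lgb != GB_Extended_Pictographic)
      lgb = rgb;

    i++;
  }
  return i;
}
```

Under these rows, the sequence Extended_Pictographic, Extend, ZWJ, Extended_Pictographic stays in one cluster because `lgb` is still Extended_Pictographic when the second pictograph arrives, whereas Other, ZWJ, Extended_Pictographic breaks before the pictograph because the ZWJ row has no Extended_Pictographic bit.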
|
@ -282,6 +275,7 @@ strings to make sure that UTF-8 support works on EBCDIC platforms. */
|
|||
#define STRING_Cyrillic0 STR_C STR_y STR_r STR_i STR_l STR_l STR_i STR_c "\0"
|
||||
#define STRING_Deseret0 STR_D STR_e STR_s STR_e STR_r STR_e STR_t "\0"
|
||||
#define STRING_Devanagari0 STR_D STR_e STR_v STR_a STR_n STR_a STR_g STR_a STR_r STR_i "\0"
|
||||
#define STRING_Dogra0 STR_D STR_o STR_g STR_r STR_a "\0"
|
||||
#define STRING_Duployan0 STR_D STR_u STR_p STR_l STR_o STR_y STR_a STR_n "\0"
|
||||
#define STRING_Egyptian_Hieroglyphs0 STR_E STR_g STR_y STR_p STR_t STR_i STR_a STR_n STR_UNDERSCORE STR_H STR_i STR_e STR_r STR_o STR_g STR_l STR_y STR_p STR_h STR_s "\0"
|
||||
#define STRING_Elbasan0 STR_E STR_l STR_b STR_a STR_s STR_a STR_n "\0"
|
||||
|
@ -292,9 +286,11 @@ strings to make sure that UTF-8 support works on EBCDIC platforms. */
|
|||
#define STRING_Grantha0 STR_G STR_r STR_a STR_n STR_t STR_h STR_a "\0"
|
||||
#define STRING_Greek0 STR_G STR_r STR_e STR_e STR_k "\0"
|
||||
#define STRING_Gujarati0 STR_G STR_u STR_j STR_a STR_r STR_a STR_t STR_i "\0"
|
||||
#define STRING_Gunjala_Gondi0 STR_G STR_u STR_n STR_j STR_a STR_l STR_a STR_UNDERSCORE STR_G STR_o STR_n STR_d STR_i "\0"
|
||||
#define STRING_Gurmukhi0 STR_G STR_u STR_r STR_m STR_u STR_k STR_h STR_i "\0"
|
||||
#define STRING_Han0 STR_H STR_a STR_n "\0"
|
||||
#define STRING_Hangul0 STR_H STR_a STR_n STR_g STR_u STR_l "\0"
|
||||
#define STRING_Hanifi_Rohingya0 STR_H STR_a STR_n STR_i STR_f STR_i STR_UNDERSCORE STR_R STR_o STR_h STR_i STR_n STR_g STR_y STR_a "\0"
|
||||
#define STRING_Hanunoo0 STR_H STR_a STR_n STR_u STR_n STR_o STR_o "\0"
|
||||
#define STRING_Hatran0 STR_H STR_a STR_t STR_r STR_a STR_n "\0"
|
||||
#define STRING_Hebrew0 STR_H STR_e STR_b STR_r STR_e STR_w "\0"
|
||||
|
@ -330,6 +326,7 @@ strings to make sure that UTF-8 support works on EBCDIC platforms. */
|
|||
#define STRING_Lydian0 STR_L STR_y STR_d STR_i STR_a STR_n "\0"
|
||||
#define STRING_M0 STR_M "\0"
|
||||
#define STRING_Mahajani0 STR_M STR_a STR_h STR_a STR_j STR_a STR_n STR_i "\0"
|
||||
#define STRING_Makasar0 STR_M STR_a STR_k STR_a STR_s STR_a STR_r "\0"
|
||||
#define STRING_Malayalam0 STR_M STR_a STR_l STR_a STR_y STR_a STR_l STR_a STR_m "\0"
|
||||
#define STRING_Mandaic0 STR_M STR_a STR_n STR_d STR_a STR_i STR_c "\0"
|
||||
#define STRING_Manichaean0 STR_M STR_a STR_n STR_i STR_c STR_h STR_a STR_e STR_a STR_n "\0"
|
||||
|
@ -337,6 +334,7 @@ strings to make sure that UTF-8 support works on EBCDIC platforms. */
|
|||
#define STRING_Masaram_Gondi0 STR_M STR_a STR_s STR_a STR_r STR_a STR_m STR_UNDERSCORE STR_G STR_o STR_n STR_d STR_i "\0"
|
||||
#define STRING_Mc0 STR_M STR_c "\0"
|
||||
#define STRING_Me0 STR_M STR_e "\0"
|
||||
#define STRING_Medefaidrin0 STR_M STR_e STR_d STR_e STR_f STR_a STR_i STR_d STR_r STR_i STR_n "\0"
|
||||
#define STRING_Meetei_Mayek0 STR_M STR_e STR_e STR_t STR_e STR_i STR_UNDERSCORE STR_M STR_a STR_y STR_e STR_k "\0"
|
||||
#define STRING_Mende_Kikakui0 STR_M STR_e STR_n STR_d STR_e STR_UNDERSCORE STR_K STR_i STR_k STR_a STR_k STR_u STR_i "\0"
|
||||
#define STRING_Meroitic_Cursive0 STR_M STR_e STR_r STR_o STR_i STR_t STR_i STR_c STR_UNDERSCORE STR_C STR_u STR_r STR_s STR_i STR_v STR_e "\0"
|
||||
|
@ -364,6 +362,7 @@ strings to make sure that UTF-8 support works on EBCDIC platforms. */
|
|||
#define STRING_Old_North_Arabian0 STR_O STR_l STR_d STR_UNDERSCORE STR_N STR_o STR_r STR_t STR_h STR_UNDERSCORE STR_A STR_r STR_a STR_b STR_i STR_a STR_n "\0"
|
||||
#define STRING_Old_Permic0 STR_O STR_l STR_d STR_UNDERSCORE STR_P STR_e STR_r STR_m STR_i STR_c "\0"
|
||||
#define STRING_Old_Persian0 STR_O STR_l STR_d STR_UNDERSCORE STR_P STR_e STR_r STR_s STR_i STR_a STR_n "\0"
|
||||
#define STRING_Old_Sogdian0 STR_O STR_l STR_d STR_UNDERSCORE STR_S STR_o STR_g STR_d STR_i STR_a STR_n "\0"
|
||||
#define STRING_Old_South_Arabian0 STR_O STR_l STR_d STR_UNDERSCORE STR_S STR_o STR_u STR_t STR_h STR_UNDERSCORE STR_A STR_r STR_a STR_b STR_i STR_a STR_n "\0"
|
||||
#define STRING_Old_Turkic0 STR_O STR_l STR_d STR_UNDERSCORE STR_T STR_u STR_r STR_k STR_i STR_c "\0"
|
||||
#define STRING_Oriya0 STR_O STR_r STR_i STR_y STR_a "\0"
|
||||
|
@ -397,6 +396,7 @@ strings to make sure that UTF-8 support works on EBCDIC platforms. */
|
|||
#define STRING_Sk0 STR_S STR_k "\0"
|
||||
#define STRING_Sm0 STR_S STR_m "\0"
|
||||
#define STRING_So0 STR_S STR_o "\0"
|
||||
#define STRING_Sogdian0 STR_S STR_o STR_g STR_d STR_i STR_a STR_n "\0"
|
||||
#define STRING_Sora_Sompeng0 STR_S STR_o STR_r STR_a STR_UNDERSCORE STR_S STR_o STR_m STR_p STR_e STR_n STR_g "\0"
|
||||
#define STRING_Soyombo0 STR_S STR_o STR_y STR_o STR_m STR_b STR_o "\0"
|
||||
#define STRING_Sundanese0 STR_S STR_u STR_n STR_d STR_a STR_n STR_e STR_s STR_e "\0"
|
||||
|
@ -469,6 +469,7 @@ const char PRIV(utt_names)[] =
|
|||
STRING_Cyrillic0
|
||||
STRING_Deseret0
|
||||
STRING_Devanagari0
|
||||
STRING_Dogra0
|
||||
STRING_Duployan0
|
||||
STRING_Egyptian_Hieroglyphs0
|
||||
STRING_Elbasan0
|
||||
|
@ -479,9 +480,11 @@ const char PRIV(utt_names)[] =
|
|||
STRING_Grantha0
|
||||
STRING_Greek0
|
||||
STRING_Gujarati0
|
||||
STRING_Gunjala_Gondi0
|
||||
STRING_Gurmukhi0
|
||||
STRING_Han0
|
||||
STRING_Hangul0
|
||||
STRING_Hanifi_Rohingya0
|
||||
STRING_Hanunoo0
|
||||
STRING_Hatran0
|
||||
STRING_Hebrew0
|
||||
|
@ -517,6 +520,7 @@ const char PRIV(utt_names)[] =
|
|||
STRING_Lydian0
|
||||
STRING_M0
|
||||
STRING_Mahajani0
|
||||
STRING_Makasar0
|
||||
STRING_Malayalam0
|
||||
STRING_Mandaic0
|
||||
STRING_Manichaean0
|
||||
|
@ -524,6 +528,7 @@ const char PRIV(utt_names)[] =
|
|||
STRING_Masaram_Gondi0
|
||||
STRING_Mc0
|
||||
STRING_Me0
|
||||
STRING_Medefaidrin0
|
||||
STRING_Meetei_Mayek0
|
||||
STRING_Mende_Kikakui0
|
||||
STRING_Meroitic_Cursive0
|
||||
|
@ -551,6 +556,7 @@ const char PRIV(utt_names)[] =
|
|||
STRING_Old_North_Arabian0
|
||||
STRING_Old_Permic0
|
||||
STRING_Old_Persian0
|
||||
STRING_Old_Sogdian0
|
||||
STRING_Old_South_Arabian0
|
||||
STRING_Old_Turkic0
|
||||
STRING_Oriya0
|
||||
|
@ -584,6 +590,7 @@ const char PRIV(utt_names)[] =
|
|||
STRING_Sk0
|
||||
STRING_Sm0
|
||||
STRING_So0
|
||||
STRING_Sogdian0
|
||||
STRING_Sora_Sompeng0
|
||||
STRING_Soyombo0
|
||||
STRING_Sundanese0
|
||||
|
@ -656,154 +663,161 @@ const ucp_type_table PRIV(utt)[] = {
|
|||
{ 265, PT_SC, ucp_Cyrillic },
|
||||
{ 274, PT_SC, ucp_Deseret },
|
||||
{ 282, PT_SC, ucp_Devanagari },
|
||||
{ 293, PT_SC, ucp_Duployan },
|
||||
{ 302, PT_SC, ucp_Egyptian_Hieroglyphs },
|
||||
{ 323, PT_SC, ucp_Elbasan },
|
||||
{ 331, PT_SC, ucp_Ethiopic },
|
||||
{ 340, PT_SC, ucp_Georgian },
|
||||
{ 349, PT_SC, ucp_Glagolitic },
|
||||
{ 360, PT_SC, ucp_Gothic },
|
||||
{ 367, PT_SC, ucp_Grantha },
|
||||
{ 375, PT_SC, ucp_Greek },
|
||||
{ 381, PT_SC, ucp_Gujarati },
|
||||
{ 390, PT_SC, ucp_Gurmukhi },
|
||||
{ 399, PT_SC, ucp_Han },
|
||||
{ 403, PT_SC, ucp_Hangul },
|
||||
{ 410, PT_SC, ucp_Hanunoo },
|
||||
{ 418, PT_SC, ucp_Hatran },
|
||||
{ 425, PT_SC, ucp_Hebrew },
|
||||
{ 432, PT_SC, ucp_Hiragana },
|
||||
{ 441, PT_SC, ucp_Imperial_Aramaic },
|
||||
{ 458, PT_SC, ucp_Inherited },
|
||||
{ 468, PT_SC, ucp_Inscriptional_Pahlavi },
|
||||
{ 490, PT_SC, ucp_Inscriptional_Parthian },
|
||||
{ 513, PT_SC, ucp_Javanese },
|
||||
{ 522, PT_SC, ucp_Kaithi },
|
||||
{ 529, PT_SC, ucp_Kannada },
|
||||
{ 537, PT_SC, ucp_Katakana },
|
||||
{ 546, PT_SC, ucp_Kayah_Li },
|
||||
{ 555, PT_SC, ucp_Kharoshthi },
|
||||
{ 566, PT_SC, ucp_Khmer },
|
||||
{ 572, PT_SC, ucp_Khojki },
|
||||
{ 579, PT_SC, ucp_Khudawadi },
|
||||
{ 589, PT_GC, ucp_L },
|
||||
{ 591, PT_LAMP, 0 },
|
||||
{ 594, PT_SC, ucp_Lao },
|
||||
{ 598, PT_SC, ucp_Latin },
|
||||
{ 604, PT_SC, ucp_Lepcha },
|
||||
{ 611, PT_SC, ucp_Limbu },
|
||||
{ 617, PT_SC, ucp_Linear_A },
|
||||
{ 626, PT_SC, ucp_Linear_B },
|
||||
{ 635, PT_SC, ucp_Lisu },
|
||||
{ 640, PT_PC, ucp_Ll },
|
||||
{ 643, PT_PC, ucp_Lm },
|
||||
{ 646, PT_PC, ucp_Lo },
|
||||
{ 649, PT_PC, ucp_Lt },
|
||||
{ 652, PT_PC, ucp_Lu },
|
||||
{ 655, PT_SC, ucp_Lycian },
|
||||
{ 662, PT_SC, ucp_Lydian },
|
||||
{ 669, PT_GC, ucp_M },
|
||||
{ 671, PT_SC, ucp_Mahajani },
|
||||
{ 680, PT_SC, ucp_Malayalam },
|
||||
{ 690, PT_SC, ucp_Mandaic },
|
||||
{ 698, PT_SC, ucp_Manichaean },
|
||||
{ 709, PT_SC, ucp_Marchen },
|
||||
{ 717, PT_SC, ucp_Masaram_Gondi },
|
||||
{ 731, PT_PC, ucp_Mc },
|
||||
{ 734, PT_PC, ucp_Me },
|
||||
{ 737, PT_SC, ucp_Meetei_Mayek },
|
||||
{ 750, PT_SC, ucp_Mende_Kikakui },
|
||||
{ 764, PT_SC, ucp_Meroitic_Cursive },
|
||||
{ 781, PT_SC, ucp_Meroitic_Hieroglyphs },
|
||||
{ 802, PT_SC, ucp_Miao },
|
||||
{ 807, PT_PC, ucp_Mn },
|
||||
{ 810, PT_SC, ucp_Modi },
|
||||
{ 815, PT_SC, ucp_Mongolian },
|
||||
{ 825, PT_SC, ucp_Mro },
|
||||
{ 829, PT_SC, ucp_Multani },
|
||||
{ 837, PT_SC, ucp_Myanmar },
|
||||
{ 845, PT_GC, ucp_N },
|
||||
{ 847, PT_SC, ucp_Nabataean },
|
||||
{ 857, PT_PC, ucp_Nd },
|
||||
{ 860, PT_SC, ucp_New_Tai_Lue },
|
||||
{ 872, PT_SC, ucp_Newa },
|
||||
{ 877, PT_SC, ucp_Nko },
|
||||
{ 881, PT_PC, ucp_Nl },
|
||||
{ 884, PT_PC, ucp_No },
|
||||
{ 887, PT_SC, ucp_Nushu },
|
||||
{ 893, PT_SC, ucp_Ogham },
|
||||
{ 899, PT_SC, ucp_Ol_Chiki },
|
||||
{ 908, PT_SC, ucp_Old_Hungarian },
|
||||
{ 922, PT_SC, ucp_Old_Italic },
|
||||
{ 933, PT_SC, ucp_Old_North_Arabian },
|
||||
{ 951, PT_SC, ucp_Old_Permic },
|
||||
{ 962, PT_SC, ucp_Old_Persian },
|
||||
{ 974, PT_SC, ucp_Old_South_Arabian },
|
||||
{ 992, PT_SC, ucp_Old_Turkic },
|
||||
{ 1003, PT_SC, ucp_Oriya },
|
||||
{ 1009, PT_SC, ucp_Osage },
|
||||
{ 1015, PT_SC, ucp_Osmanya },
|
||||
{ 1023, PT_GC, ucp_P },
|
||||
{ 1025, PT_SC, ucp_Pahawh_Hmong },
|
||||
{ 1038, PT_SC, ucp_Palmyrene },
|
||||
{ 1048, PT_SC, ucp_Pau_Cin_Hau },
|
||||
{ 1060, PT_PC, ucp_Pc },
|
||||
{ 1063, PT_PC, ucp_Pd },
|
||||
{ 1066, PT_PC, ucp_Pe },
|
||||
{ 1069, PT_PC, ucp_Pf },
|
||||
{ 1072, PT_SC, ucp_Phags_Pa },
|
||||
{ 1081, PT_SC, ucp_Phoenician },
|
||||
{ 1092, PT_PC, ucp_Pi },
|
||||
{ 1095, PT_PC, ucp_Po },
|
||||
{ 1098, PT_PC, ucp_Ps },
|
||||
{ 1101, PT_SC, ucp_Psalter_Pahlavi },
|
||||
{ 1117, PT_SC, ucp_Rejang },
|
||||
{ 1124, PT_SC, ucp_Runic },
|
||||
{ 1130, PT_GC, ucp_S },
|
||||
{ 1132, PT_SC, ucp_Samaritan },
|
||||
{ 1142, PT_SC, ucp_Saurashtra },
|
||||
{ 1153, PT_PC, ucp_Sc },
|
||||
{ 1156, PT_SC, ucp_Sharada },
|
||||
{ 1164, PT_SC, ucp_Shavian },
|
||||
{ 1172, PT_SC, ucp_Siddham },
|
||||
{ 1180, PT_SC, ucp_SignWriting },
|
||||
{ 1192, PT_SC, ucp_Sinhala },
|
||||
{ 1200, PT_PC, ucp_Sk },
|
||||
{ 1203, PT_PC, ucp_Sm },
|
||||
{ 1206, PT_PC, ucp_So },
|
||||
{ 1209, PT_SC, ucp_Sora_Sompeng },
|
||||
{ 1222, PT_SC, ucp_Soyombo },
|
||||
{ 1230, PT_SC, ucp_Sundanese },
|
||||
{ 1240, PT_SC, ucp_Syloti_Nagri },
|
||||
{ 1253, PT_SC, ucp_Syriac },
|
||||
{ 1260, PT_SC, ucp_Tagalog },
|
||||
{ 1268, PT_SC, ucp_Tagbanwa },
|
||||
{ 1277, PT_SC, ucp_Tai_Le },
|
||||
{ 1284, PT_SC, ucp_Tai_Tham },
|
||||
{ 1293, PT_SC, ucp_Tai_Viet },
|
||||
{ 1302, PT_SC, ucp_Takri },
|
||||
{ 1308, PT_SC, ucp_Tamil },
|
||||
{ 1314, PT_SC, ucp_Tangut },
|
||||
{ 1321, PT_SC, ucp_Telugu },
|
||||
{ 1328, PT_SC, ucp_Thaana },
|
||||
{ 1335, PT_SC, ucp_Thai },
|
||||
{ 1340, PT_SC, ucp_Tibetan },
|
||||
{ 1348, PT_SC, ucp_Tifinagh },
|
||||
{ 1357, PT_SC, ucp_Tirhuta },
|
||||
{ 1365, PT_SC, ucp_Ugaritic },
|
||||
{ 1374, PT_SC, ucp_Vai },
|
||||
{ 1378, PT_SC, ucp_Warang_Citi },
|
||||
{ 1390, PT_ALNUM, 0 },
|
||||
{ 1394, PT_PXSPACE, 0 },
|
||||
{ 1398, PT_SPACE, 0 },
|
||||
{ 1402, PT_UCNC, 0 },
|
||||
{ 1406, PT_WORD, 0 },
|
||||
{ 1410, PT_SC, ucp_Yi },
|
||||
{ 1413, PT_GC, ucp_Z },
|
||||
{ 1415, PT_SC, ucp_Zanabazar_Square },
|
||||
{ 1432, PT_PC, ucp_Zl },
|
||||
{ 1435, PT_PC, ucp_Zp },
|
||||
{ 1438, PT_PC, ucp_Zs }
|
||||
{ 293, PT_SC, ucp_Dogra },
|
||||
{ 299, PT_SC, ucp_Duployan },
|
||||
{ 308, PT_SC, ucp_Egyptian_Hieroglyphs },
|
||||
{ 329, PT_SC, ucp_Elbasan },
|
||||
{ 337, PT_SC, ucp_Ethiopic },
|
||||
{ 346, PT_SC, ucp_Georgian },
|
||||
{ 355, PT_SC, ucp_Glagolitic },
|
||||
{ 366, PT_SC, ucp_Gothic },
|
||||
{ 373, PT_SC, ucp_Grantha },
|
||||
{ 381, PT_SC, ucp_Greek },
|
||||
{ 387, PT_SC, ucp_Gujarati },
|
||||
{ 396, PT_SC, ucp_Gunjala_Gondi },
|
||||
{ 410, PT_SC, ucp_Gurmukhi },
|
||||
{ 419, PT_SC, ucp_Han },
|
||||
{ 423, PT_SC, ucp_Hangul },
|
||||
{ 430, PT_SC, ucp_Hanifi_Rohingya },
|
||||
{ 446, PT_SC, ucp_Hanunoo },
|
||||
{ 454, PT_SC, ucp_Hatran },
|
||||
{ 461, PT_SC, ucp_Hebrew },
|
||||
{ 468, PT_SC, ucp_Hiragana },
|
||||
{ 477, PT_SC, ucp_Imperial_Aramaic },
|
||||
{ 494, PT_SC, ucp_Inherited },
|
||||
{ 504, PT_SC, ucp_Inscriptional_Pahlavi },
|
||||
{ 526, PT_SC, ucp_Inscriptional_Parthian },
|
||||
{ 549, PT_SC, ucp_Javanese },
|
||||
{ 558, PT_SC, ucp_Kaithi },
|
||||
{ 565, PT_SC, ucp_Kannada },
|
||||
{ 573, PT_SC, ucp_Katakana },
|
||||
{ 582, PT_SC, ucp_Kayah_Li },
|
||||
{ 591, PT_SC, ucp_Kharoshthi },
|
||||
{ 602, PT_SC, ucp_Khmer },
|
||||
{ 608, PT_SC, ucp_Khojki },
|
||||
{ 615, PT_SC, ucp_Khudawadi },
|
||||
{ 625, PT_GC, ucp_L },
|
||||
{ 627, PT_LAMP, 0 },
|
||||
{ 630, PT_SC, ucp_Lao },
|
||||
{ 634, PT_SC, ucp_Latin },
|
||||
{ 640, PT_SC, ucp_Lepcha },
|
||||
{ 647, PT_SC, ucp_Limbu },
|
||||
{ 653, PT_SC, ucp_Linear_A },
|
||||
{ 662, PT_SC, ucp_Linear_B },
|
||||
{ 671, PT_SC, ucp_Lisu },
|
||||
{ 676, PT_PC, ucp_Ll },
|
||||
{ 679, PT_PC, ucp_Lm },
|
||||
{ 682, PT_PC, ucp_Lo },
|
||||
{ 685, PT_PC, ucp_Lt },
|
||||
{ 688, PT_PC, ucp_Lu },
|
||||
{ 691, PT_SC, ucp_Lycian },
|
||||
{ 698, PT_SC, ucp_Lydian },
|
||||
{ 705, PT_GC, ucp_M },
|
||||
{ 707, PT_SC, ucp_Mahajani },
|
||||
{ 716, PT_SC, ucp_Makasar },
|
||||
{ 724, PT_SC, ucp_Malayalam },
|
||||
{ 734, PT_SC, ucp_Mandaic },
|
||||
{ 742, PT_SC, ucp_Manichaean },
|
||||
{ 753, PT_SC, ucp_Marchen },
|
||||
{ 761, PT_SC, ucp_Masaram_Gondi },
|
||||
{ 775, PT_PC, ucp_Mc },
|
||||
{ 778, PT_PC, ucp_Me },
|
||||
{ 781, PT_SC, ucp_Medefaidrin },
|
||||
{ 793, PT_SC, ucp_Meetei_Mayek },
|
||||
{ 806, PT_SC, ucp_Mende_Kikakui },
|
||||
{ 820, PT_SC, ucp_Meroitic_Cursive },
|
||||
{ 837, PT_SC, ucp_Meroitic_Hieroglyphs },
|
||||
{ 858, PT_SC, ucp_Miao },
|
||||
{ 863, PT_PC, ucp_Mn },
|
||||
{ 866, PT_SC, ucp_Modi },
|
||||
{ 871, PT_SC, ucp_Mongolian },
|
||||
{ 881, PT_SC, ucp_Mro },
|
||||
{ 885, PT_SC, ucp_Multani },
|
||||
{ 893, PT_SC, ucp_Myanmar },
|
||||
{ 901, PT_GC, ucp_N },
|
||||
{ 903, PT_SC, ucp_Nabataean },
|
||||
{ 913, PT_PC, ucp_Nd },
|
||||
{ 916, PT_SC, ucp_New_Tai_Lue },
|
||||
{ 928, PT_SC, ucp_Newa },
|
||||
{ 933, PT_SC, ucp_Nko },
|
||||
{ 937, PT_PC, ucp_Nl },
|
||||
{ 940, PT_PC, ucp_No },
|
||||
{ 943, PT_SC, ucp_Nushu },
|
||||
{ 949, PT_SC, ucp_Ogham },
|
||||
{ 955, PT_SC, ucp_Ol_Chiki },
|
||||
{ 964, PT_SC, ucp_Old_Hungarian },
|
||||
{ 978, PT_SC, ucp_Old_Italic },
|
||||
{ 989, PT_SC, ucp_Old_North_Arabian },
|
||||
{ 1007, PT_SC, ucp_Old_Permic },
|
||||
{ 1018, PT_SC, ucp_Old_Persian },
|
||||
{ 1030, PT_SC, ucp_Old_Sogdian },
|
||||
{ 1042, PT_SC, ucp_Old_South_Arabian },
|
||||
{ 1060, PT_SC, ucp_Old_Turkic },
|
||||
{ 1071, PT_SC, ucp_Oriya },
|
||||
{ 1077, PT_SC, ucp_Osage },
|
||||
{ 1083, PT_SC, ucp_Osmanya },
|
||||
{ 1091, PT_GC, ucp_P },
|
||||
{ 1093, PT_SC, ucp_Pahawh_Hmong },
|
||||
{ 1106, PT_SC, ucp_Palmyrene },
|
||||
{ 1116, PT_SC, ucp_Pau_Cin_Hau },
|
||||
{ 1128, PT_PC, ucp_Pc },
|
||||
{ 1131, PT_PC, ucp_Pd },
|
||||
{ 1134, PT_PC, ucp_Pe },
|
||||
{ 1137, PT_PC, ucp_Pf },
|
||||
{ 1140, PT_SC, ucp_Phags_Pa },
|
||||
{ 1149, PT_SC, ucp_Phoenician },
|
||||
{ 1160, PT_PC, ucp_Pi },
|
||||
{ 1163, PT_PC, ucp_Po },
|
||||
{ 1166, PT_PC, ucp_Ps },
|
||||
{ 1169, PT_SC, ucp_Psalter_Pahlavi },
|
||||
{ 1185, PT_SC, ucp_Rejang },
|
||||
{ 1192, PT_SC, ucp_Runic },
|
||||
{ 1198, PT_GC, ucp_S },
|
||||
{ 1200, PT_SC, ucp_Samaritan },
|
||||
{ 1210, PT_SC, ucp_Saurashtra },
|
||||
{ 1221, PT_PC, ucp_Sc },
|
||||
{ 1224, PT_SC, ucp_Sharada },
|
||||
{ 1232, PT_SC, ucp_Shavian },
|
||||
{ 1240, PT_SC, ucp_Siddham },
|
||||
{ 1248, PT_SC, ucp_SignWriting },
|
||||
{ 1260, PT_SC, ucp_Sinhala },
|
||||
{ 1268, PT_PC, ucp_Sk },
|
||||
{ 1271, PT_PC, ucp_Sm },
|
||||
{ 1274, PT_PC, ucp_So },
|
||||
{ 1277, PT_SC, ucp_Sogdian },
|
||||
{ 1285, PT_SC, ucp_Sora_Sompeng },
|
||||
{ 1298, PT_SC, ucp_Soyombo },
|
||||
{ 1306, PT_SC, ucp_Sundanese },
|
||||
{ 1316, PT_SC, ucp_Syloti_Nagri },
|
||||
{ 1329, PT_SC, ucp_Syriac },
|
||||
{ 1336, PT_SC, ucp_Tagalog },
|
||||
{ 1344, PT_SC, ucp_Tagbanwa },
|
||||
{ 1353, PT_SC, ucp_Tai_Le },
|
||||
{ 1360, PT_SC, ucp_Tai_Tham },
|
||||
{ 1369, PT_SC, ucp_Tai_Viet },
|
||||
{ 1378, PT_SC, ucp_Takri },
|
||||
{ 1384, PT_SC, ucp_Tamil },
|
||||
{ 1390, PT_SC, ucp_Tangut },
|
||||
{ 1397, PT_SC, ucp_Telugu },
|
||||
{ 1404, PT_SC, ucp_Thaana },
|
||||
{ 1411, PT_SC, ucp_Thai },
|
||||
{ 1416, PT_SC, ucp_Tibetan },
|
||||
{ 1424, PT_SC, ucp_Tifinagh },
|
||||
{ 1433, PT_SC, ucp_Tirhuta },
|
||||
{ 1441, PT_SC, ucp_Ugaritic },
|
||||
{ 1450, PT_SC, ucp_Vai },
|
||||
{ 1454, PT_SC, ucp_Warang_Citi },
|
||||
{ 1466, PT_ALNUM, 0 },
|
||||
{ 1470, PT_PXSPACE, 0 },
|
||||
{ 1474, PT_SPACE, 0 },
|
||||
{ 1478, PT_UCNC, 0 },
|
||||
{ 1482, PT_WORD, 0 },
|
||||
{ 1486, PT_SC, ucp_Yi },
|
||||
{ 1489, PT_GC, ucp_Z },
|
||||
{ 1491, PT_SC, ucp_Zanabazar_Square },
|
||||
{ 1508, PT_PC, ucp_Zl },
|
||||
{ 1511, PT_PC, ucp_Zp },
|
||||
{ 1514, PT_PC, ucp_Zs }
|
||||
};
|
||||
|
||||
const size_t PRIV(utt_size) = sizeof(PRIV(utt)) / sizeof(ucp_type_table);
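Note on the renumbered offsets in the hunk above: the first field of each `utt` entry appears to be the byte offset of that property's name within `utt_names`, assuming each name is stored followed by a single NUL byte. Under that assumption, inserting `Dogra` (5 characters plus NUL) moves every later entry up by 6, which matches `Duployan` going from 293 to 299. The sketch below only illustrates that arithmetic; `name_offsets` is a hypothetical helper and is not part of PCRE2.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of PCRE2. Given an array of property names
   in table order, compute the offset each name would have in one buffer
   where every name is followed by exactly one NUL byte (the layout assumed
   here for utt_names). */
void name_offsets(const char *const *names, size_t count, size_t *offsets)
{
  size_t pos = 0;
  for (size_t i = 0; i < count; i++)
    {
    offsets[i] = pos;               /* where this name starts */
    pos += strlen(names[i]) + 1;    /* skip the name and its NUL */
    }
}
```

Calling it on `{"Devanagari", "Dogra", "Duployan"}` and adding the base offset 282 of `Devanagari` gives the offsets for `Dogra` and `Duployan` that appear in the new table. The same reasoning accounts for the total shift of 76 by the final `Zs` entry (1438 to 1514) once all seven new script names are inserted.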
|
||||
|
|
6751
src/pcre2_ucd.c
File diff suppressed because it is too large
|
@ -7,7 +7,7 @@ and semantics are as close as possible to those of the Perl 5 language.
|
|||
|
||||
Written by Philip Hazel
|
||||
Original API code Copyright (c) 1997-2012 University of Cambridge
|
||||
New API code Copyright (c) 2016 University of Cambridge
|
||||
New API code Copyright (c) 2016-2018 University of Cambridge
|
||||
|
||||
-----------------------------------------------------------------------------
|
||||
Redistribution and use in source and binary forms, with or without
|
||||
|
@ -100,27 +100,25 @@ enum {
|
|||
ucp_Zs /* Space separator */
|
||||
};
|
||||
|
||||
/* These are grapheme break properties. */
|
||||
/* These are grapheme break properties. The Extended Pictographic property
|
||||
comes from the emoji-data.txt file. */
|
||||
|
||||
enum {
|
||||
ucp_gbCR, /* 0 */
|
||||
ucp_gbLF, /* 1 */
|
||||
ucp_gbControl, /* 2 */
|
||||
ucp_gbExtend, /* 3 */
|
||||
ucp_gbPrepend, /* 4 */
|
||||
ucp_gbSpacingMark, /* 5 */
|
||||
ucp_gbL, /* 6 Hangul syllable type L */
|
||||
ucp_gbV, /* 7 Hangul syllable type V */
|
||||
ucp_gbT, /* 8 Hangul syllable type T */
|
||||
ucp_gbLV, /* 9 Hangul syllable type LV */
|
||||
ucp_gbLVT, /* 10 Hangul syllable type LVT */
|
||||
ucp_gbRegionalIndicator, /* 11 */
|
||||
ucp_gbOther, /* 12 */
|
||||
ucp_gbE_Base, /* 13 */
|
||||
ucp_gbE_Modifier, /* 14 */
|
||||
ucp_gbE_Base_GAZ, /* 15 */
|
||||
ucp_gbZWJ, /* 16 */
|
||||
ucp_gbGlue_After_Zwj /* 17 */
|
||||
ucp_gbCR, /* 0 */
|
||||
ucp_gbLF, /* 1 */
|
||||
ucp_gbControl, /* 2 */
|
||||
ucp_gbExtend, /* 3 */
|
||||
ucp_gbPrepend, /* 4 */
|
||||
ucp_gbSpacingMark, /* 5 */
|
||||
ucp_gbL, /* 6 Hangul syllable type L */
|
||||
ucp_gbV, /* 7 Hangul syllable type V */
|
||||
ucp_gbT, /* 8 Hangul syllable type T */
|
||||
ucp_gbLV, /* 9 Hangul syllable type LV */
|
||||
ucp_gbLVT, /* 10 Hangul syllable type LVT */
|
||||
ucp_gbRegionalIndicator, /* 11 */
|
||||
ucp_gbOther, /* 12 */
|
||||
ucp_gbZWJ, /* 13 */
|
||||
ucp_gbExtended_Pictographic /* 14 */
|
||||
};
|
||||
|
||||
/* These are the script identifications. */
|
||||
|
@ -274,7 +272,15 @@ enum {
|
|||
ucp_Masaram_Gondi,
|
||||
ucp_Nushu,
|
||||
ucp_Soyombo,
|
||||
ucp_Zanabazar_Square
|
||||
ucp_Zanabazar_Square,
|
||||
/* New for Unicode 11.0.0 */
|
||||
ucp_Dogra,
|
||||
ucp_Gunjala_Gondi,
|
||||
ucp_Hanifi_Rohingya,
|
||||
ucp_Makasar,
|
||||
ucp_Medefaidrin,
|
||||
ucp_Old_Sogdian,
|
||||
ucp_Sogdian
|
||||
};
|
||||
|
||||
#endif /* PCRE2_UCP_H_IDEMPOTENT_GUARD */
|
||||
|
|
|
@ -1394,28 +1394,15 @@
|
|||
\x{6e9}
|
||||
\x{6ef}
|
||||
\x{6fa}
|
||||
\= Expect no match
|
||||
\x{650}
|
||||
\x{651}
|
||||
\x{652}
|
||||
\x{653}
|
||||
\x{654}
|
||||
\x{655}
|
||||
|
||||
/^\p{Cyrillic}/utf
|
||||
\x{1d2b}
|
||||
|
||||
/^\p{Common}/utf
|
||||
\x{589}
|
||||
\x{60c}
|
||||
\x{61f}
|
||||
\x{964}
|
||||
\x{965}
|
||||
\x{2116}
|
||||
\x{1D183}
|
||||
|
||||
/^\p{Inherited}/utf
|
||||
\x{64b}
|
||||
\x{654}
|
||||
\x{655}
|
||||
\x{200c}
|
||||
\= Expect no match
|
||||
\x{64a}
|
||||
|
|
|
@ -2030,8 +2030,8 @@
|
|||
# to test 4.
|
||||
|
||||
/^(\p{Adlam}+)(\p{Bhaiksuki}+)(\p{Marchen}+)(\p{Newa}+)(\p{Osage}+)
|
||||
(\p{Tangut}+)(\p{Masaram_Gondi}+)(\p{Nushu}+)(\p{Soyombo}+)
|
||||
(\p{Zanabazar_Square}+)/x,utf
|
||||
(\p{Tangut}+)(\p{Masaram_Gondi}+)(\p{Nushu}+)(\p{Soyombo}+)
|
||||
(\p{Zanabazar_Square}+)/x,utf
|
||||
\x{1E900}\x{1E924}\x{1E953}\x{11C00}\x{11C2D}\x{11C3E}\x{11C70}\x{11C77}\x{11CAB}\x{11400}\x{1142F}\x{11455}\x{104B0}\x{104D8}\x{104FB}\x{16FE0}\x{18800}\x{18AF2}\x{11D00}\x{11D3A}\x{11D59}\x{16FE1}\x{1B170}\x{1B2FB}\x{11A50}\x{11A58}\x{11AA2}\x{11A00}\x{11A07}\x{11A47}
|
||||
|
||||
/^\x{1E900}\x{104B0}/i,utf
|
||||
|
@ -2041,17 +2041,21 @@
|
|||
/^(?:(\X)(?C))+$/utf
|
||||
\x{1E900}\x{1E924}\x{1E953}\x{11C00}\x{11C2D}\x{11C3E}\x{11C70}\x{11C77}\x{11CAB}\x{11400}\x{1142F}\x{11455}\x{104B0}\x{104D8}\x{104FB}\x{16FE0}\x{18800}\x{18AF2}\x{11D00}\x{11D3A}\x{11D59}\x{16FE1}\x{1B170}\x{1B2FB}\x{11A50}\x{11A58}\x{11AA2}\x{11A00}\x{11A07}\x{11A47}\=callout_capture,callout_no_where
|
||||
|
||||
# These two are here because JIT is not yet updated. Also, the very first data
|
||||
# line is handled differently by Perl.
|
||||
# Similarly for Unicode 11.0.0
|
||||
|
||||
/^(\p{Dogra}+)(\p{Gunjala_Gondi}+)(\p{Hanifi_Rohingya}+)(\p{Makasar}+)
|
||||
(\p{Medefaidrin}+)(\p{Old_Sogdian}+)(\p{Sogdian}+)/x,utf
|
||||
\x{11800}\x{11da9}\x{10d27}\x{11ee0}\x{16e48}\x{10f27}\x{10f30}
|
||||
|
||||
# These two are here because of differences from Perl.
|
||||
|
||||
/^\X/utf
|
||||
A\x{200d}B A ZWJ
|
||||
\x{261D}\x{1F3FB}B E_Base E_Modifier
|
||||
\x{1F466}\x{1F3FF}B E_Base_GAZ E_Modifier
|
||||
\x{200d}\x{1F3A4}B ZWJ Glue_After_ZWJ
|
||||
\x{200d}\x{1F469}B ZWJ E_Base_GAZ
|
||||
\x{261d}\x{261d}B Extended_Pictographic Extended_Pictographic
|
||||
\x{261D}\x{1F3FB}B Extended_Pictographic Extend
|
||||
\x{1F1E6}\x{1F1E7}B RegionalIndicator RegionalIndicator
|
||||
\x{261D}\x{E0100}\x{1F3FB}B E_Base Extend E_Modifier
|
||||
\x{261D}\x{1F3FB}\x{261d}B Extended_Pictographic Extend E-P
|
||||
\x{261D}\x{1F3FB}\x{200d}\x{261d}B Extended_Pictographic Extend ZWJ E-P
|
||||
|
||||
# Regional indicators
|
||||
|
||||
|
@ -2059,5 +2063,28 @@
|
|||
\x{1F1E6}\x{1F1E7}\x{1F1E7}B
|
||||
\x{1F1E6}\x{1F1E7}\x{1F1E7}\x{1F1E6}B
|
||||
|
||||
# More differences from Perl
|
||||
|
||||
/^[\p{Arabic}]/utf
|
||||
\= Expect no match
|
||||
\x{650}
|
||||
\x{651}
|
||||
\x{652}
|
||||
\x{653}
|
||||
\x{654}
|
||||
\x{655}
|
||||
|
||||
/^\p{Common}/utf
|
||||
\x{589}
|
||||
\x{60c}
|
||||
\x{61f}
|
||||
\x{964}
|
||||
\x{965}
|
||||
|
||||
/^\p{Inherited}/utf
|
||||
\x{64b}
|
||||
\x{654}
|
||||
\x{655}
|
||||
\x{1D1AA}
|
||||
|
||||
# End of testinput5
|
||||
|
|
|
@ -2293,43 +2293,18 @@ No match
|
|||
0: \x{6ef}
|
||||
\x{6fa}
|
||||
0: \x{6fa}
|
||||
\= Expect no match
|
||||
\x{650}
|
||||
No match
|
||||
\x{651}
|
||||
No match
|
||||
\x{652}
|
||||
No match
|
||||
\x{653}
|
||||
No match
|
||||
\x{654}
|
||||
No match
|
||||
\x{655}
|
||||
No match
|
||||
|
||||
/^\p{Cyrillic}/utf
|
||||
\x{1d2b}
|
||||
0: \x{1d2b}
|
||||
|
||||
/^\p{Common}/utf
|
||||
\x{589}
|
||||
0: \x{589}
|
||||
\x{60c}
|
||||
0: \x{60c}
|
||||
\x{61f}
|
||||
0: \x{61f}
|
||||
\x{964}
|
||||
0: \x{964}
|
||||
\x{965}
|
||||
0: \x{965}
|
||||
\x{2116}
|
||||
0: \x{2116}
|
||||
\x{1D183}
|
||||
0: \x{1d183}
|
||||
|
||||
/^\p{Inherited}/utf
|
||||
\x{64b}
|
||||
0: \x{64b}
|
||||
\x{654}
|
||||
0: \x{654}
|
||||
\x{655}
|
||||
0: \x{655}
|
||||
\x{200c}
|
||||
0: \x{200c}
|
||||
\= Expect no match
|
||||
|
|
|
@ -4593,8 +4593,8 @@ No match
|
|||
# to test 4.
|
||||
|
||||
/^(\p{Adlam}+)(\p{Bhaiksuki}+)(\p{Marchen}+)(\p{Newa}+)(\p{Osage}+)
|
||||
(\p{Tangut}+)(\p{Masaram_Gondi}+)(\p{Nushu}+)(\p{Soyombo}+)
|
||||
(\p{Zanabazar_Square}+)/x,utf
|
||||
(\p{Tangut}+)(\p{Masaram_Gondi}+)(\p{Nushu}+)(\p{Soyombo}+)
|
||||
(\p{Zanabazar_Square}+)/x,utf
|
||||
\x{1E900}\x{1E924}\x{1E953}\x{11C00}\x{11C2D}\x{11C3E}\x{11C70}\x{11C77}\x{11CAB}\x{11400}\x{1142F}\x{11455}\x{104B0}\x{104D8}\x{104FB}\x{16FE0}\x{18800}\x{18AF2}\x{11D00}\x{11D3A}\x{11D59}\x{16FE1}\x{1B170}\x{1B2FB}\x{11A50}\x{11A58}\x{11AA2}\x{11A00}\x{11A07}\x{11A47}
|
||||
0: \x{1e900}\x{1e924}\x{1e953}\x{11c00}\x{11c2d}\x{11c3e}\x{11c70}\x{11c77}\x{11cab}\x{11400}\x{1142f}\x{11455}\x{104b0}\x{104d8}\x{104fb}\x{16fe0}\x{18800}\x{18af2}\x{11d00}\x{11d3a}\x{11d59}\x{16fe1}\x{1b170}\x{1b2fb}\x{11a50}\x{11a58}\x{11aa2}\x{11a00}\x{11a07}\x{11a47}
|
||||
1: \x{1e900}\x{1e924}\x{1e953}
|
||||
|
@ -4667,24 +4667,35 @@ Callout 0: last capture = 1
|
|||
0: \x{1e900}\x{1e924}\x{1e953}\x{11c00}\x{11c2d}\x{11c3e}\x{11c70}\x{11c77}\x{11cab}\x{11400}\x{1142f}\x{11455}\x{104b0}\x{104d8}\x{104fb}\x{16fe0}\x{18800}\x{18af2}\x{11d00}\x{11d3a}\x{11d59}\x{16fe1}\x{1b170}\x{1b2fb}\x{11a50}\x{11a58}\x{11aa2}\x{11a00}\x{11a07}\x{11a47}
|
||||
1: \x{11a00}\x{11a07}\x{11a47}
|
||||
|
||||
# These two are here because JIT is not yet updated. Also, the very first data
|
||||
# line is handled differently by Perl.
|
||||
# Similarly for Unicode 11.0.0
|
||||
|
||||
/^(\p{Dogra}+)(\p{Gunjala_Gondi}+)(\p{Hanifi_Rohingya}+)(\p{Makasar}+)
|
||||
(\p{Medefaidrin}+)(\p{Old_Sogdian}+)(\p{Sogdian}+)/x,utf
|
||||
\x{11800}\x{11da9}\x{10d27}\x{11ee0}\x{16e48}\x{10f27}\x{10f30}
|
||||
0: \x{11800}\x{11da9}\x{10d27}\x{11ee0}\x{16e48}\x{10f27}\x{10f30}
|
||||
1: \x{11800}
|
||||
2: \x{11da9}
|
||||
3: \x{10d27}
|
||||
4: \x{11ee0}
|
||||
5: \x{16e48}
|
||||
6: \x{10f27}
|
||||
7: \x{10f30}
|
||||
|
||||
# These two are here because of differences from Perl.
|
||||
|
||||
/^\X/utf
|
||||
A\x{200d}B A ZWJ
|
||||
0: A\x{200d}
|
||||
\x{261D}\x{1F3FB}B E_Base E_Modifier
|
||||
\x{261d}\x{261d}B Extended_Pictographic Extended_Pictographic
|
||||
0: \x{261d}\x{261d}
|
||||
\x{261D}\x{1F3FB}B Extended_Pictographic Extend
|
||||
0: \x{261d}\x{1f3fb}
|
||||
\x{1F466}\x{1F3FF}B E_Base_GAZ E_Modifier
|
||||
0: \x{1f466}\x{1f3ff}
|
||||
\x{200d}\x{1F3A4}B ZWJ Glue_After_ZWJ
|
||||
0: \x{200d}\x{1f3a4}
|
||||
\x{200d}\x{1F469}B ZWJ E_Base_GAZ
|
||||
0: \x{200d}\x{1f469}
|
||||
\x{1F1E6}\x{1F1E7}B RegionalIndicator RegionalIndicator
|
||||
0: \x{1f1e6}\x{1f1e7}
|
||||
\x{261D}\x{E0100}\x{1F3FB}B E_Base Extend E_Modifier
|
||||
0: \x{261d}\x{e0100}\x{1f3fb}
|
||||
\x{261D}\x{1F3FB}\x{261d}B Extended_Pictographic Extend E-P
|
||||
0: \x{261d}\x{1f3fb}\x{261d}
|
||||
\x{261D}\x{1F3FB}\x{200d}\x{261d}B Extended_Pictographic Extend ZWJ E-P
|
||||
0: \x{261d}\x{1f3fb}\x{200d}\x{261d}
|
||||
|
||||
# Regional indicators
|
||||
|
||||
|
@ -4700,5 +4711,43 @@ Callout 0: last capture = 1
|
|||
1: \x{1f1e6}\x{1f1e7}
|
||||
2: \x{1f1e7}\x{1f1e6}
|
||||
|
||||
# More differences from Perl
|
||||
|
||||
/^[\p{Arabic}]/utf
|
||||
\= Expect no match
|
||||
\x{650}
|
||||
No match
|
||||
\x{651}
|
||||
No match
|
||||
\x{652}
|
||||
No match
|
||||
\x{653}
|
||||
No match
|
||||
\x{654}
|
||||
No match
|
||||
\x{655}
|
||||
No match
|
||||
|
||||
/^\p{Common}/utf
|
||||
\x{589}
|
||||
0: \x{589}
|
||||
\x{60c}
|
||||
0: \x{60c}
|
||||
\x{61f}
|
||||
0: \x{61f}
|
||||
\x{964}
|
||||
0: \x{964}
|
||||
\x{965}
|
||||
0: \x{965}
|
||||
|
||||
/^\p{Inherited}/utf
|
||||
\x{64b}
|
||||
0: \x{64b}
|
||||
\x{654}
|
||||
0: \x{654}
|
||||
\x{655}
|
||||
0: \x{655}
|
||||
\x{1D1AA}
|
||||
0: \x{1d1aa}
|
||||
|
||||
# End of testinput5
|
||||
|
|