[doc] Talk less about “complex” scripts
Use more neutral terms and don’t make it sound like some scripts are outliers.
parent
bd44840fab
commit
8d36300154
|
@ -419,7 +419,7 @@
|
||||||
<section id="reordering-in-levels-0-and-1">
|
<section id="reordering-in-levels-0-and-1">
|
||||||
<title>Reordering in levels 0 and 1</title>
|
<title>Reordering in levels 0 and 1</title>
|
||||||
<para>
|
<para>
|
||||||
Another common operation in the more complex shapers is glyph
|
Another common operation in some shapers is glyph
|
||||||
reordering. In order to maintain a monotonic cluster sequence
|
reordering. In order to maintain a monotonic cluster sequence
|
||||||
when glyph reordering takes place, HarfBuzz merges the clusters
|
when glyph reordering takes place, HarfBuzz merges the clusters
|
||||||
of everything in the reordering sequence.
|
of everything in the reordering sequence.
|
||||||
|
|
|
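The cluster merging described in the hunk above can be sketched as follows. This is a minimal illustration under stated assumptions, not HarfBuzz's own code: the function names `merge_clusters` and `reorder_with_merge` and the list-of-`(glyph_name, cluster)` representation are hypothetical, and the sketch ignores refinements a real shaper may apply, such as extending the merge to neighbouring glyphs.

```python
def merge_clusters(glyphs, start, end):
    """Give every glyph in glyphs[start:end] the lowest cluster value found in that range.

    `glyphs` is a list of (glyph_name, cluster) tuples (a hypothetical representation).
    """
    lowest = min(cluster for _, cluster in glyphs[start:end])
    merged = [(name, lowest) for name, _ in glyphs[start:end]]
    return glyphs[:start] + merged + glyphs[end:]


def reorder_with_merge(glyphs, start, end, new_order):
    """Reorder glyphs[start:end] according to `new_order` (indices relative to `start`).

    The clusters of the whole reordering sequence are merged first, so the
    cluster values stay monotonic after the glyphs have moved.
    """
    merged = merge_clusters(glyphs, start, end)
    segment = merged[start:end]
    reordered = [segment[i] for i in new_order]
    return merged[:start] + reordered + merged[end:]


# A pre-base vowel sign that follows its consonant in the input is moved in front of it.
glyphs = [("ka", 0), ("i_sign", 1), ("ta", 2)]
result = reorder_with_merge(glyphs, 0, 2, [1, 0])
# result == [("i_sign", 0), ("ka", 0), ("ta", 2)]
```

Without the merge, the reordered run would carry clusters 1, 0, 2, which is no longer monotonic. Merging trades some cluster granularity for that guarantee.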
@ -117,7 +117,7 @@
|
||||||
implements separate shapers for Indic, Arabic, Thai and
|
implements separate shapers for Indic, Arabic, Thai and
|
||||||
Lao, Khmer, Myanmar, Tibetan, Hangul, Hebrew, the
|
Lao, Khmer, Myanmar, Tibetan, Hangul, Hebrew, the
|
||||||
Universal Shaping Engine (USE), and a default shaper for
|
Universal Shaping Engine (USE), and a default shaper for
|
||||||
non-complex scripts.
|
scripts with no script-specific shaping model.
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
</varlistentry>
|
</varlistentry>
|
||||||
|
|
|
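The shaper list in the hunk above amounts to a lookup from script to shaping model, with the default shaper as the fallback. The sketch below illustrates that idea only. The table `SCRIPT_SHAPERS`, the shaper labels, and the function `choose_shaper` are hypothetical names for this example; the keys are ISO 15924 script codes, and the actual selection logic in HarfBuzz may take more than the script into account.

```python
# Hypothetical mapping from ISO 15924 script codes to the shaping models
# named in the documentation. Only a few scripts are listed for illustration.
SCRIPT_SHAPERS = {
    "Deva": "indic",
    "Arab": "arabic",
    "Thai": "thai-lao",
    "Laoo": "thai-lao",
    "Khmr": "khmer",
    "Mymr": "myanmar",
    "Tibt": "tibetan",
    "Hang": "hangul",
    "Hebr": "hebrew",
    "Java": "use",  # Javanese, handled by the Universal Shaping Engine
    "Bali": "use",  # Balinese, handled by the Universal Shaping Engine
}


def choose_shaper(script_code):
    """Return the shaping model for a script, or "default" if the script
    has no script-specific shaping model (or is not recognized)."""
    return SCRIPT_SHAPERS.get(script_code, "default")
```

For example, `choose_shaper("Arab")` returns `"arabic"`, while `choose_shaper("Latn")` falls through to `"default"`, as does any unrecognized code.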
@ -65,7 +65,7 @@
|
||||||
</para>
|
</para>
|
||||||
<para>
|
<para>
|
||||||
The algorithms
|
The algorithms
|
||||||
used for complex scripts can be quite involved; HarfBuzz tries
|
used for shaping can be quite involved; HarfBuzz tries
|
||||||
to be compatible with the OpenType Layout specification
|
to be compatible with the OpenType Layout specification
|
||||||
and, wherever there is any ambiguity, HarfBuzz attempts to replicate the
|
and, wherever there is any ambiguity, HarfBuzz attempts to replicate the
|
||||||
output of Microsoft's Uniscribe engine. See the <ulink
|
output of Microsoft's Uniscribe engine. See the <ulink
|
||||||
|
@ -131,7 +131,7 @@
|
||||||
</para>
|
</para>
|
||||||
<para>
|
<para>
|
||||||
Some OpenType features are defined for the purpose of supporting
|
Some OpenType features are defined for the purpose of supporting
|
||||||
complex-script shaping, and are automatically activated, but
|
script-specific shaping, and are automatically activated, but
|
||||||
only when a buffer's script property is set to a script that the
|
only when a buffer's script property is set to a script that the
|
||||||
feature supports.
|
feature supports.
|
||||||
</para>
|
</para>
|
||||||
|
|
|
@ -22,7 +22,7 @@
|
||||||
correct amount for each successive glyph.
|
correct amount for each successive glyph.
|
||||||
</para>
|
</para>
|
||||||
<para>
|
<para>
|
||||||
But, for <emphasis>complex scripts</emphasis>, any combination of
|
But, for other scripts (often unceremoniously called <emphasis>complex scripts</emphasis>), any combination of
|
||||||
several shaping operations may be required, and the rules for how
|
several shaping operations may be required, and the rules for how
|
||||||
and when they are applied vary from script to script. HarfBuzz and
|
and when they are applied vary from script to script. HarfBuzz and
|
||||||
other shaping engines implement these rules.
|
other shaping engines implement these rules.
|
||||||
|
@ -36,42 +36,35 @@
|
||||||
</para>
|
</para>
|
||||||
</section>
|
</section>
|
||||||
|
|
||||||
<section id="complex-scripts">
|
<section id="script-specific-shaping">
|
||||||
<title>Complex scripts</title>
|
<title>Script-specific shaping</title>
|
||||||
<para>
|
<para>
|
||||||
In text-shaping terminology, scripts are generally classified as
|
In many scripts, transforming the input
|
||||||
either <emphasis>complex</emphasis> or <emphasis>non-complex</emphasis>.
|
sequence into the final layout often requires some combination of
|
||||||
</para>
|
|
||||||
<para>
|
|
||||||
Complex scripts are those for which transforming the input
|
|
||||||
sequence into the final layout requires some combination of
|
|
||||||
operations—such as context-dependent substitutions,
|
operations—such as context-dependent substitutions,
|
||||||
context-dependent mark positioning, glyph-to-glyph joining,
|
context-dependent mark positioning, glyph-to-glyph joining,
|
||||||
glyph reordering, or glyph stacking.
|
glyph reordering, or glyph stacking.
|
||||||
</para>
|
</para>
|
||||||
<para>
|
<para>
|
||||||
In some complex scripts, the shaping rules require that a text
|
In some scripts, the shaping rules require that a text
|
||||||
run be divided into syllables before the operations can be
|
run be divided into syllables before the operations can be
|
||||||
applied. Other complex scripts may apply shaping operations over
|
applied. Other scripts may apply shaping operations over
|
||||||
entire words or over the entire text run, with no subdivision
|
entire words or over the entire text run, with no subdivision
|
||||||
required.
|
required.
|
||||||
</para>
|
</para>
|
||||||
<para>
|
<para>
|
||||||
Non-complex scripts, by definition, do not require these
|
Other scripts do not require these
|
||||||
operations. However, correctly shaping a text run in a
|
operations. However, correctly shaping a text run in
|
||||||
non-complex script may still involve Unicode normalization,
|
any script may still involve Unicode normalization,
|
||||||
ligature substitutions, mark positioning, kerning, and applying
|
ligature substitutions, mark positioning, kerning, and applying
|
||||||
other font features. The key difference is that a text run in a
|
other font features.
|
||||||
non-complex script can be processed sequentially and in the same
|
|
||||||
order as the input sequence of Unicode codepoints, without
|
|
||||||
requiring an analysis stage.
|
|
||||||
</para>
|
</para>
|
||||||
</section>
|
</section>
|
||||||
|
|
||||||
<section id="shaping-operations">
|
<section id="shaping-operations">
|
||||||
<title>Shaping operations</title>
|
<title>Shaping operations</title>
|
||||||
<para>
|
<para>
|
||||||
Shaping a complex-script text run involves transforming the
|
Shaping a text run involves transforming the
|
||||||
input sequence of Unicode codepoints with some combination of
|
input sequence of Unicode codepoints with some combination of
|
||||||
operations that is specified in the shaping model for the
|
operations that is specified in the shaping model for the
|
||||||
script.
|
script.
|
||||||
|
@ -81,7 +74,7 @@
|
||||||
text run varies from script to script, as do the order that the
|
text run varies from script to script, as do the order that the
|
||||||
operations are performed in and which codepoints are
|
operations are performed in and which codepoints are
|
||||||
affected. However, the same general set of shaping operations is
|
affected. However, the same general set of shaping operations is
|
||||||
common to all of the complex-script shaping models.
|
common to all of the script shaping models.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<itemizedlist>
|
<itemizedlist>
|
||||||
|
@ -92,7 +85,7 @@
|
||||||
some other ("visual") position.
|
some other ("visual") position.
|
||||||
</para>
|
</para>
|
||||||
<para>
|
<para>
|
||||||
The shaping model for a given complex script might involve
|
The shaping model for a given script might involve
|
||||||
more than one reordering step.
|
more than one reordering step.
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
|
@ -119,7 +112,7 @@
|
||||||
particular string pattern.
|
particular string pattern.
|
||||||
</para>
|
</para>
|
||||||
<para>
|
<para>
|
||||||
The shaping model for a given complex script might involve
|
The shaping model for a given script might involve
|
||||||
multiple contextual-substitution operations, each applying
|
multiple contextual-substitution operations, each applying
|
||||||
to different target glyphs and patterns, and which are
|
to different target glyphs and patterns, and which are
|
||||||
performed in separate steps.
|
performed in separate steps.
|
||||||
|
@ -138,7 +131,7 @@
|
||||||
Many contextual positioning operations are used to place
|
Many contextual positioning operations are used to place
|
||||||
<emphasis>mark</emphasis> glyphs (such as diacritics, vowel
|
<emphasis>mark</emphasis> glyphs (such as diacritics, vowel
|
||||||
signs, and tone markers) with respect to
|
signs, and tone markers) with respect to
|
||||||
<emphasis>base</emphasis> glyphs. However, some complex
|
<emphasis>base</emphasis> glyphs. However, some
|
||||||
scripts may use contextual positioning operations to
|
scripts may use contextual positioning operations to
|
||||||
correctly place base glyphs as well, such as
|
correctly place base glyphs as well, such as
|
||||||
when the script uses <emphasis>stacking</emphasis> characters.
|
when the script uses <emphasis>stacking</emphasis> characters.
|
||||||
|
@ -194,7 +187,7 @@
|
||||||
multiple positions).
|
multiple positions).
|
||||||
</para>
|
</para>
|
||||||
<para>
|
<para>
|
||||||
Some complex scripts require that the text run be split into
|
Some scripts require that the text run be split into
|
||||||
syllables. What constitutes a valid syllable in these
|
syllables. What constitutes a valid syllable in these
|
||||||
scripts is specified in regular expressions, formed from the
|
scripts is specified in regular expressions, formed from the
|
||||||
Letter and Mark codepoints, that take the UISC and UIPC
|
Letter and Mark codepoints, that take the UISC and UIPC
|
||||||
|
@ -235,7 +228,7 @@
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
The <emphasis>default</emphasis> shaping model handles all
|
The <emphasis>default</emphasis> shaping model handles all
|
||||||
non-complex scripts, and may also be used as a fallback for
|
scripts with no script-specific shaping model, and may also be used as a fallback for
|
||||||
handling unrecognized scripts.
|
handling unrecognized scripts.
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
|
@ -310,7 +303,7 @@
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
The <emphasis>Universal Shaping Engine</emphasis> (USE)
|
The <emphasis>Universal Shaping Engine</emphasis> (USE)
|
||||||
shaping model supports complex scripts not covered by one of
|
shaping model supports scripts not covered by one of
|
||||||
the above, script-specific shaping models, including
|
the above, script-specific shaping models, including
|
||||||
Javanese, Balinese, Buginese, Batak, Chakma, Lepcha, Modi,
|
Javanese, Balinese, Buginese, Batak, Chakma, Lepcha, Modi,
|
||||||
Phags-pa, Tagalog, Siddham, Sundanese, Tai Le, Tai Tham, Tai
|
Phags-pa, Tagalog, Siddham, Sundanese, Tai Le, Tai Tham, Tai
|
||||||
|
|