[doc] Talk less about “complex” scripts

Use more neutral terms and don’t make it sound as if some scripts are outliers.
Khaled Hosny 2022-06-03 21:00:08 +02:00 committed by Behdad Esfahbod
parent bd44840fab
commit 8d36300154
4 changed files with 23 additions and 30 deletions

@ -419,7 +419,7 @@
<section id="reordering-in-levels-0-and-1"> <section id="reordering-in-levels-0-and-1">
<title>Reordering in levels 0 and 1</title> <title>Reordering in levels 0 and 1</title>
<para> <para>
Another common operation in the more complex shapers is glyph Another common operation in some shapers is glyph
reordering. In order to maintain a monotonic cluster sequence reordering. In order to maintain a monotonic cluster sequence
when glyph reordering takes place, HarfBuzz merges the clusters when glyph reordering takes place, HarfBuzz merges the clusters
of everything in the reordering sequence. of everything in the reordering sequence.
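
The paragraph in this hunk describes cluster merging during reordering. The following is a minimal sketch of that idea in Python, not HarfBuzz's actual implementation: `Glyph` and `reorder_with_merge` are hypothetical names introduced for illustration, and the sketch assumes that a merged cluster takes the smallest cluster value in the reordered span.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    name: str      # illustrative glyph name, not a real font glyph
    cluster: int   # cluster value carried alongside the glyph


def reorder_with_merge(glyphs, start, end, new_order):
    """Reorder glyphs[start:end] and merge their clusters.

    new_order lists indices relative to `start`. Every glyph in the
    span receives the smallest cluster value found in the span
    (an assumption of this sketch), so cluster values stay
    monotonic after the move.
    """
    span = glyphs[start:end]
    if sorted(new_order) != list(range(len(span))):
        raise ValueError("new_order must be a permutation of the span")
    merged = min(g.cluster for g in span)
    reordered = [Glyph(span[i].name, merged) for i in new_order]
    return glyphs[:start] + reordered + glyphs[end:]


# Hypothetical run: a vowel sign that must be moved before its consonant.
run = [Glyph("ka", 0), Glyph("vowel_sign", 1), Glyph("ta", 2)]
shaped = reorder_with_merge(run, 0, 2, [1, 0])
print([(g.name, g.cluster) for g in shaped])
```

Without the merge, the reordered run would carry cluster values 1, 0, 2, which is no longer monotonic; with it, the two reordered glyphs share cluster 0 and the untouched glyph keeps cluster 2.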

@ -117,7 +117,7 @@
implements separate shapers for Indic, Arabic, Thai and implements separate shapers for Indic, Arabic, Thai and
Lao, Khmer, Myanmar, Tibetan, Hangul, Hebrew, the Lao, Khmer, Myanmar, Tibetan, Hangul, Hebrew, the
Universal Shaping Engine (USE), and a default shaper for Universal Shaping Engine (USE), and a default shaper for
non-complex scripts. scripts with no script-specific shaping model.
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>

@ -65,7 +65,7 @@
</para> </para>
<para> <para>
The algorithms The algorithms
used for complex scripts can be quite involved; HarfBuzz tries used for shaping can be quite involved; HarfBuzz tries
to be compatible with the OpenType Layout specification to be compatible with the OpenType Layout specification
and, wherever there is any ambiguity, HarfBuzz attempts to replicate the and, wherever there is any ambiguity, HarfBuzz attempts to replicate the
output of Microsoft's Uniscribe engine. See the <ulink output of Microsoft's Uniscribe engine. See the <ulink
@ -131,7 +131,7 @@
</para> </para>
<para> <para>
Some OpenType features are defined for the purpose of supporting Some OpenType features are defined for the purpose of supporting
complex-script shaping, and are automatically activated, but script-specific shaping, and are automatically activated, but
only when a buffer's script property is set to a script that the only when a buffer's script property is set to a script that the
feature supports. feature supports.
</para> </para>

@ -22,7 +22,7 @@
correct amount for each successive glyph. correct amount for each successive glyph.
</para> </para>
<para> <para>
But, for <emphasis>complex scripts</emphasis>, any combination of But, for other scripts (often unceremoniously called <emphasis>complex scripts</emphasis>), any combination of
several shaping operations may be required, and the rules for how several shaping operations may be required, and the rules for how
and when they are applied vary from script to script. HarfBuzz and and when they are applied vary from script to script. HarfBuzz and
other shaping engines implement these rules. other shaping engines implement these rules.
@ -36,42 +36,35 @@
</para> </para>
</section> </section>
<section id="complex-scripts"> <section id="script-specific-shaping">
<title>Complex scripts</title> <title>Script-specific shaping</title>
<para> <para>
In text-shaping terminology, scripts are generally classified as In many scripts, transforming the input
either <emphasis>complex</emphasis> or <emphasis>non-complex</emphasis>. sequence into the final layout often requires some combination of
</para>
<para>
Complex scripts are those for which transforming the input
sequence into the final layout requires some combination of
operations&mdash;such as context-dependent substitutions, operations&mdash;such as context-dependent substitutions,
context-dependent mark positioning, glyph-to-glyph joining, context-dependent mark positioning, glyph-to-glyph joining,
glyph reordering, or glyph stacking. glyph reordering, or glyph stacking.
</para> </para>
<para> <para>
In some complex scripts, the shaping rules require that a text In some scripts, the shaping rules require that a text
run be divided into syllables before the operations can be run be divided into syllables before the operations can be
applied. Other complex scripts may apply shaping operations over applied. Other scripts may apply shaping operations over
entire words or over the entire text run, with no subdivision entire words or over the entire text run, with no subdivision
required. required.
</para> </para>
<para> <para>
Non-complex scripts, by definition, do not require these Other scripts do not require these
operations. However, correctly shaping a text run in a operations. However, correctly shaping a text run in
non-complex script may still involve Unicode normalization, any script may still involve Unicode normalization,
ligature substitutions, mark positioning, kerning, and applying ligature substitutions, mark positioning, kerning, and applying
other font features. The key difference is that a text run in a other font features.
non-complex script can be processed sequentially and in the same
order as the input sequence of Unicode codepoints, without
requiring an analysis stage.
</para> </para>
</section> </section>
<section id="shaping-operations"> <section id="shaping-operations">
<title>Shaping operations</title> <title>Shaping operations</title>
<para> <para>
Shaping a complex-script text run involves transforming the Shaping a text run involves transforming the
input sequence of Unicode codepoints with some combination of input sequence of Unicode codepoints with some combination of
operations that is specified in the shaping model for the operations that is specified in the shaping model for the
script. script.
@ -81,7 +74,7 @@
text run varies from script to script, as do the order that the text run varies from script to script, as do the order that the
operations are performed in and which codepoints are operations are performed in and which codepoints are
affected. However, the same general set of shaping operations is affected. However, the same general set of shaping operations is
common to all of the complex-script shaping models. common to all of the script shaping models.
</para> </para>
<itemizedlist> <itemizedlist>
@ -92,7 +85,7 @@
some other ("visual") position. some other ("visual") position.
</para> </para>
<para> <para>
The shaping model for a given complex script might involve The shaping model for a given script might involve
more than one reordering step. more than one reordering step.
</para> </para>
</listitem> </listitem>
@ -119,7 +112,7 @@
particular string pattern. particular string pattern.
</para> </para>
<para> <para>
The shaping model for a given complex script might involve The shaping model for a given script might involve
multiple contextual-substitution operations, each applying multiple contextual-substitution operations, each applying
to different target glyphs and patterns, and which are to different target glyphs and patterns, and which are
performed in separate steps. performed in separate steps.
@ -138,7 +131,7 @@
Many contextual positioning operations are used to place Many contextual positioning operations are used to place
<emphasis>mark</emphasis> glyphs (such as diacritics, vowel <emphasis>mark</emphasis> glyphs (such as diacritics, vowel
signs, and tone markers) with respect to signs, and tone markers) with respect to
<emphasis>base</emphasis> glyphs. However, some complex <emphasis>base</emphasis> glyphs. However, some
scripts may use contextual positioning operations to scripts may use contextual positioning operations to
correctly place base glyphs as well, such as correctly place base glyphs as well, such as
when the script uses <emphasis>stacking</emphasis> characters. when the script uses <emphasis>stacking</emphasis> characters.
@ -194,7 +187,7 @@
multiple positions). multiple positions).
</para> </para>
<para> <para>
Some complex scripts require that the text run be split into Some scripts require that the text run be split into
syllables. What constitutes a valid syllable in these syllables. What constitutes a valid syllable in these
scripts is specified in regular expressions, formed from the scripts is specified in regular expressions, formed from the
Letter and Mark codepoints, that take the UISC and UIPC Letter and Mark codepoints, that take the UISC and UIPC
@ -235,7 +228,7 @@
<listitem> <listitem>
<para> <para>
The <emphasis>default</emphasis> shaping model handles all The <emphasis>default</emphasis> shaping model handles all
non-complex scripts, and may also be used as a fallback for scripts with no script-specific shaping model, and may also be used as a fallback for
handling unrecognized scripts. handling unrecognized scripts.
</para> </para>
</listitem> </listitem>
@ -310,7 +303,7 @@
<listitem> <listitem>
<para> <para>
The <emphasis>Universal Shaping Engine</emphasis> (USE) The <emphasis>Universal Shaping Engine</emphasis> (USE)
shaping model supports complex scripts not covered by one of shaping model supports scripts not covered by one of
the above, script-specific shaping models, including the above, script-specific shaping models, including
Javanese, Balinese, Buginese, Batak, Chakma, Lepcha, Modi, Javanese, Balinese, Buginese, Batak, Chakma, Lepcha, Modi,
Phags-pa, Tagalog, Siddham, Sundanese, Tai Le, Tai Tham, Tai Phags-pa, Tagalog, Siddham, Sundanese, Tai Le, Tai Tham, Tai