Docs: update Usermanual-What Is HarfBuzz.

2018-09-28 16:07:37 -05:00 · 2018-09-28 16:07:37 -05:00 · d9fd927210
parent 0af3d176a6
commit d9fd927210
1 changed files with 130 additions and 69 deletions
--- a/docs/usermanual-what-is-harfbuzz.xml
+++ b/docs/usermanual-what-is-harfbuzz.xml
@ -1,115 +1,176 @@
 <chapter id="what-is-harfbuzz">
  <title>What is HarfBuzz?</title>
  <para>
-    HarfBuzz is a <emphasis>text shaping engine</emphasis>. It solves
-    the problem of selecting and positioning glyphs from a font given a
-    Unicode string.
+    HarfBuzz is a <emphasis>text shaping engine</emphasis>. If you
+    give HarfBuzz a font and a string containing a sequence of Unicode
+    codepoints, HarfBuzz selects and positions the corresponding
+    glyphs from the font, applying all of the necessary layout rules
+    and font features. HarfBuzz then returns the string to you in the
+    form that is correctly arranged for the language and writing
+    system.
  </para>
-  <section id="why-do-i-need-it">
-    <title>Why do I need it?</title>
+  <para>
+    HarfBuzz can properly shape all of the world's major writing
+    systems. It runs on virtually all operating systems and software
+    platforms, and it supports all of the standard font formats in use
+    today.
+  </para>
+  <section id="why-do-i-need-a-shaping-engine">
+    <title>Why do I need a shaping engine?</title>
    <para>
-      Text shaping is an integral part of preparing text for display. It
-      is a fairly low level operation; HarfBuzz is used directly by
-      graphic rendering libraries such as Pango, and the layout engines
-      in Firefox, LibreOffice and Chromium. Unless you are
-      <emphasis>writing</emphasis> one of these layout engines yourself,
-      you will probably not need to use HarfBuzz - normally higher level
-      libraries will turn text into glyphs for you.
+      Text shaping is an integral part of preparing text for
+      display. Before a Unicode sequence can be rendered, the
+      codepoints in the sequence must be mapped to the glyphs
+      provided in the font, and the glyphs must be positioned
+      correctly relative to each other. For many of the scripts
+      supported in Unicode, these steps involve script-specific layout
+      rules.
+    </para>
+    <para>
+      Text shaping is a fairly low-level operation. HarfBuzz is
+      used directly by graphic rendering libraries such as Pango, as
+      well as by the layout engines in Firefox, LibreOffice, and
+      Chromium. Unless you are <emphasis>writing</emphasis> one of
+      these layout engines yourself, you will probably not need to use
+      HarfBuzz: normally, lower-level libraries will turn text into
+      glyphs for you.
    </para>
    <para>
      However, if you <emphasis>are</emphasis> writing a layout engine
      or graphics library yourself, you will need to perform text
-      shaping, and this is where HarfBuzz can help you. Here are some
-      reasons why you need it:
+      shaping, and this is where HarfBuzz can help you.
+    </para>
+    <para>
+      Here are some specific scenarios where a text-shaping engine
+      like HarfBuzz helps you:
    </para>
    <itemizedlist>
      <listitem>
        <para>
-          OpenType fonts contain a set of glyphs, indexed by glyph ID.
-          The glyph ID within the font does not necessarily relate to a
-          Unicode codepoint. For instance, some fonts have the letter
-          &quot;a&quot; as glyph ID 1. To pull the right glyph out of
-          the font in order to display it, you need to consult a table
-          within the font (the &quot;cmap&quot; table) which maps
-          Unicode codepoints to glyph IDs. Text shaping turns codepoints
-          into glyph IDs.
+          OpenType fonts contain a set of glyphs (that is, shapes
+	  to represent the letters, numbers, punctuation marks, and
+	  all other symbols), which are indexed by a <literal>glyph ID</literal>.
+	</para>
+	<para>
+          The glyph ID within the font does not necessarily correlate
+	  to a predictable Unicode codepoint. For instance, some fonts
+	  have the letter &quot;a&quot; as glyph ID 1, but many others do
+	  not. To pull the right glyph out of the font in order to
+	  display &quot;a&quot;, you need to consult the table inside
+	  the font (the <literal>cmap</literal> table) that maps Unicode
+	  codepoints to glyph IDs. In other words, <emphasis>text shaping turns
+	  codepoints into glyph IDs</emphasis>.
        </para>
      </listitem>
      <listitem>
        <para>
          Many OpenType fonts contain ligatures: combinations of
-          characters which are rendered together. For instance, it's
-          common for the <literal>fi</literal> combination to appear in
-          print as the single ligature &quot;ﬁ&quot;. Whether you should
-          render text as <literal>fi</literal> or &quot;ﬁ&quot; does not
-          depend on the input text, but on the capabilities of the font
-          and the level of ligature application you wish to perform.
-          Text shaping involves querying the font's ligature tables and
-          determining what substitutions should be made.
+          characters that are rendered as a single unit. For instance,
+	  it is common for the <literal>fi</literal> letter
+	  combination to appear in print as the single ligature glyph
+	  &quot;ﬁ&quot;.
+	</para>
+	<para>
+	  Whether you should render an &quot;f, i&quot; sequence
+	  as <literal>fi</literal> or as &quot;ﬁ&quot; does not
+          depend on the input text. Rather, it depends on the whether
+	  or not the font includes an &quot;ﬁ&quot; glyph and on the
+	  level of ligature application you wish to perform. The font
+	  and the amount of ligature application used are under your
+	  control. In other words, <emphasis>text shaping involves
+	  querying the font's ligature tables and determining what
+	  substitutions should be made</emphasis>. 
        </para>
      </listitem>
      <listitem>
        <para>
-          While ligatures like &quot;ﬁ&quot; are typographic
-          refinements, some languages <emphasis>require</emphasis> such
+          While ligatures like &quot;ﬁ&quot; are optional typographic
+          refinements, some languages <emphasis>require</emphasis> certain
          substitutions to be made in order to display text correctly.
-          In Tamil, when the letter &quot;TTA&quot; (ட) letter is
-          followed by &quot;U&quot; (உ), the combination should appear
-          as the single glyph &quot;டு&quot;. The sequence of Unicode
-          characters &quot;டஉ&quot; needs to be rendered as a single
-          glyph from the font - text shaping chooses the correct glyph
-          from the sequence of characters provided.
+        </para>
+	<para>
+	  For example, in Tamil, when the letter &quot;TTA&quot; (ட)
+	  letter is followed by &quot;U&quot; (உ), the pair
+	  must be replaced by the single glyph &quot;டு&quot;. The
+	  sequence of Unicode characters &quot;டஉ&quot; needs to be
+	  substituted with a single &quot;டு&quot; glyph from the
+	  font.
+	</para>
+	<para>
+	  But &quot;டு&quot; does not have a Unicode codepoint. To
+	  find this glyph, you need to consult the table inside 
+	  the font (the <literal>GSUB</literal> table) that contains
+	  substitution information. In other words, <emphasis>text shaping 
+	  chooses the correct glyph for a sequence of characters
+	  provided</emphasis>.
        </para>
      </listitem>
      <listitem>
        <para>
-          Similarly, each Arabic character has four different variants:
-          within a font, there will be glyphs for the initial, medial,
-          final, and isolated forms of each letter. Unicode only encodes
-          one codepoint per character, and so a Unicode string will not
-          tell you which glyph to use. Text shaping chooses the correct
-          form of the letter and returns the correct glyph from the font
-          that you need to render.
+          Similarly, each Arabic character has four different variants
+	  corresponding to the different positions in might appear in
+	  within a sequence. Inside a font, there will be separate
+	  glyphs for the initial, medial, final, and isolated forms of
+	  each letter, each at a different glyph ID.
+	</para>
+	<para>
+	  Unicode only assigns one codepoint per character, so a
+	  Unicode string will not tell you which glyph variant to use
+	  for each character. To decide, you need to analyze the whole
+	  string and determine the appropriate glyph for each character
+	  based on its position. In other words, <emphasis>text
+	  shaping chooses the correct form of the letter by its
+	  position and returns the correct glyph from the font</emphasis>.
        </para>
      </listitem>
      <listitem>
        <para>
-          Other languages have marks and accents which need to be
-          rendered in certain positions around a base character. For
-          instance, the Moldovan language has the Cyrillic letter
-          &quot;zhe&quot; (ж) with a breve accent, like so: ӂ. Some
-          fonts will contain this character as an individual glyph,
-          whereas other fonts will not contain a zhe-with-breve glyph
-          but expect the rendering engine to form the character by
-          overlaying the two glyphs ж and ˘. Where you should draw the
-          combining breve depends on the height of the preceding glyph.
-          Again, for Arabic, the correct positioning of vowel marks
-          depends on the height of the character on which you are
-          placing the mark. Text shaping tells you whether you have a
+          Other languages involve marks and accents that need to be
+          rendered in specific positions relative a base character. For
+          instance, the Moldovan language includes the Cyrillic letter
+          &quot;zhe&quot; (ж) with a breve accent, like so: &quot;ӂ&quot;.
+	</para>
+	<para>
+	  Some fonts will provide this character as a single
+	  zhe-with-breve glyph, but other fonts will not and, instead,
+	  will expect the rendering engine to form the character by 
+          superimposing the separate &quot;ж&quot; and &quot;˘&quot;
+	  glyphs.
+	</para>
+	<para>
+	  But exactly where you should draw the breve depends on the
+	  height and width of the preceding zhe glyph. To find the
+	  right position, you need to consult the table inside
+	  the font (the <literal>GPOS</literal> table) that contains
+	  positioning information.
+          In other words, <emphasis>text shaping tells you whether you have a
          precomposed glyph within your font or if you need to compose a
-          glyph yourself out of combining marks, and if so, where to
-          position those marks.
+          glyph yourself out of combining marks&mdash;and, if so, where to
+          position those marks.</emphasis>
        </para>
      </listitem>
    </itemizedlist>
    <para>
-      If this is something that you need to do, then you need a text
-      shaping engine: you could use Uniscribe if you are using Windows;
-      you could use CoreText on OS X; or you could use HarfBuzz. In the
-      rest of this manual, we are going to assume that you are the
-      implementor of a text layout engine.
+      If tasks like these are something that you need to do, then you need a text
+      shaping engine. You could use Uniscribe if you are writing
+      Windows software; you could use CoreText on macOS; or you could
+      use HarfBuzz.
+    </para>
+    <para>
+      In the rest of this manual, we are going to assume that you are the
+      implementor of a text-layout engine.
    </para>
  </section>
  <section id="why-is-it-called-harfbuzz">
    <title>Why is it called HarfBuzz?</title>
    <para>
-      HarfBuzz began its life as text shaping code within the FreeType
-      project, (and you will see references to the FreeType authors
-      within the source code copyright declarations) but was then
-      abstracted out to its own project. This project is maintained by
+      HarfBuzz began its life as text-shaping code within the FreeType
+      project (and you will see references to the FreeType authors
+      within the source code copyright declarations), but was then
+      extracted out to its own project. This project is maintained by
      Behdad Esfahbod, and named HarfBuzz. Originally, it was a shaping
      engine for OpenType fonts - &quot;HarfBuzz&quot; is the Persian
      for &quot;open type&quot;.
    </para>
  </section>
-</chapter>
+</chapter>