Docs: update usermanual What Is HarfBuzz material.

Nathan Willis 2018-10-10 16:37:29 -05:00 committed by Khaled Hosny
parent 0956ab4185
commit 088755f9e6
1 changed files with 173 additions and 49 deletions


@ -11,8 +11,8 @@
</para> </para>
<para> <para>
HarfBuzz can properly shape all of the world's major writing HarfBuzz can properly shape all of the world's major writing
systems. It runs on virtually all operating systems and software systems. It runs on all major operating systems and software
platforms, and it supports all of the standard font formats in use platforms, and it supports all of the modern font formats in use
today. today.
</para> </para>
<section id="what-is-text-shaping"> <section id="what-is-text-shaping">
@ -41,9 +41,7 @@
<para>The dominant format is <ulink <para>The dominant format is <ulink
url="http://www.microsoft.com/typography/otspec/">OpenType</ulink>. The url="http://www.microsoft.com/typography/otspec/">OpenType</ulink>. The
OpenType specification defines a series of <ulink url="https://github.com/n8willis/opentype-shaping-documents">shaping models</ulink> for OpenType specification defines a series of <ulink url="https://github.com/n8willis/opentype-shaping-documents">shaping models</ulink> for
various scripts (including Indic, Arabic, Hangul, Hebrew, Khmer, various scripts from around the world. These shaping models depend on
Myanmar, Thai and Lao, Tibetan, and a Universal Shaping Engine
designed to cover other scripts). These shaping models depend on
the font including certain features in its <literal>GSUB</literal> the font including certain features in its <literal>GSUB</literal>
and <literal>GPOS</literal> tables. and <literal>GPOS</literal> tables.
</para> </para>
@ -55,11 +53,12 @@
TrueType fonts can also include OpenType shaping TrueType fonts can also include OpenType shaping
features. Alternatively, TrueType fonts can also include <ulink url="https://developer.apple.com/fonts/TrueType-Reference-Manual/RM09/AppendixF.html">Apple features. Alternatively, TrueType fonts can also include <ulink url="https://developer.apple.com/fonts/TrueType-Reference-Manual/RM09/AppendixF.html">Apple
Advanced Typography</ulink> (AAT) tables to implement shaping Advanced Typography</ulink> (AAT) tables to implement shaping
support. AAT fonts are generally only found on macOS systems. support. AAT fonts are generally only found on macOS and iOS systems.
</para> </para>
<para> <para>
Text strings will usually be tagged with a script and language Text strings will usually be tagged with a script and language
tag that provide the context for text shaping. <ulink tag that provide the context needed to perform text shaping
correctly. The necessary <ulink
url="https://docs.microsoft.com/en-us/typography/opentype/spec/scripttags">Script</ulink> url="https://docs.microsoft.com/en-us/typography/opentype/spec/scripttags">Script</ulink>
and <ulink and <ulink
url="https://docs.microsoft.com/en-us/typography/opentype/spec/languagetags">language</ulink> url="https://docs.microsoft.com/en-us/typography/opentype/spec/languagetags">language</ulink>
@ -72,24 +71,25 @@
<para> <para>
Text shaping is an integral part of preparing text for Text shaping is an integral part of preparing text for
display. Before a Unicode sequence can be rendered, the display. Before a Unicode sequence can be rendered, the
codepoints in the sequence must be mapped to the glyphs codepoints in the sequence must be mapped to the corresponding
provided in the font, and the glyphs must be positioned glyphs provided in the font, and those glyphs must be positioned
correctly relative to each other. For many of the scripts correctly relative to each other. For many of the scripts
supported in Unicode, these steps involve script-specific layout supported in Unicode, these steps involve script-specific layout
rules. rules, including complex joining, reordering, and positioning
behavior. Implementing these rules is the job of the shaping engine.
</para> </para>
<para> <para>
Text shaping is a fairly low-level operation. HarfBuzz is Text shaping is a fairly low-level operation. HarfBuzz is
used directly by graphic rendering libraries such as Pango, as used directly by graphical rendering libraries like <ulink
well as by the layout engines in Firefox, LibreOffice, and url="https://www.pango.org/">Pango</ulink>, as well as by the layout
Chromium. Unless you are <emphasis>writing</emphasis> one of engines in Firefox, LibreOffice, and Chromium. Unless you are
these layout engines yourself, you will probably not need to use <emphasis>writing</emphasis> one of these layout engines
HarfBuzz: normally, lower-level libraries will turn text into yourself, you will probably not need to use HarfBuzz: normally,
glyphs for you. lower-level libraries will turn text into glyphs for you.
</para> </para>
<para> <para>
However, if you <emphasis>are</emphasis> writing a layout engine However, if you <emphasis>are</emphasis> writing a layout engine
or graphics library yourself, you will need to perform text or graphics library yourself, then you will need to perform text
shaping, and this is where HarfBuzz can help you. shaping, and this is where HarfBuzz can help you.
</para> </para>
<para> <para>
@ -104,14 +104,15 @@
all other symbols), which are indexed by a <literal>glyph ID</literal>. all other symbols), which are indexed by a <literal>glyph ID</literal>.
</para> </para>
<para> <para>
The glyph ID within the font does not necessarily correlate A particular glyph ID within the font does not necessarily
to a predictable Unicode codepoint. For instance, some fonts correlate to a predictable Unicode codepoint. For instance,
have the letter &quot;a&quot; as glyph ID 1, but many others do some fonts have the letter &quot;a&quot; as glyph ID 1, but
not. To pull the right glyph out of the font in order to many others do not. In order to retrieve the right glyph
display &quot;a&quot;, you need to consult the table inside from the font to display &quot;a&quot;, you need to consult
the font (the <literal>cmap</literal> table) that maps Unicode the table inside the font (the <literal>cmap</literal>
codepoints to glyph IDs. In other words, <emphasis>text shaping turns table) that maps Unicode codepoints to glyph IDs. In other
codepoints into glyph IDs</emphasis>. words, <emphasis>text shaping turns codepoints into glyph
IDs</emphasis>.
</para> </para>
</listitem> </listitem>
<listitem> <listitem>
@ -125,7 +126,7 @@
<para> <para>
Whether you should render an &quot;f, i&quot; sequence Whether you should render an &quot;f, i&quot; sequence
as <literal>fi</literal> or as &quot;ﬁ&quot; does not as <literal>fi</literal> or as &quot;ﬁ&quot; does not
depend on the input text. Rather, it depends on whether depend on the input text. Instead, it depends on whether
or not the font includes an &quot;ﬁ&quot; glyph and on the or not the font includes an &quot;ﬁ&quot; glyph and on the
level of ligature application you wish to perform. The font level of ligature application you wish to perform. The font
and the amount of ligature application used are under your and the amount of ligature application used are under your
@ -195,26 +196,148 @@
right position, you need to consult the table inside right position, you need to consult the table inside
the font (the <literal>GPOS</literal> table) that contains the font (the <literal>GPOS</literal> table) that contains
positioning information. positioning information.
In other words, <emphasis>text shaping tells you whether you have a In other words, <emphasis>text shaping tells you whether you
precomposed glyph within your font or if you need to compose a have a precomposed glyph within your font or if you need to
glyph yourself out of combining marks&mdash;and, if so, where to compose a glyph yourself out of combining marks&mdash;and,
position those marks.</emphasis> if so, where to position those marks.</emphasis>
</para> </para>
</listitem> </listitem>
</itemizedlist> </itemizedlist>
<para> <para>
If tasks like these are something that you need to do, then you need a text If tasks like these are something that you need to do, then you
shaping engine. You could use Uniscribe if you are writing need a text shaping engine. You could use Uniscribe if you are
Windows software; you could use CoreText on macOS; or you could writing Windows software; you could use CoreText on macOS; or
use HarfBuzz. you could use HarfBuzz.
</para> </para>
<note>
<para>
In the rest of this manual, the text will assume that the reader
is the implementor of a text-layout engine.
</para>
</note>
</section>
<section>
<title>What does HarfBuzz do?</title>
<para> <para>
In the rest of this manual, we are going to assume that you are the HarfBuzz provides OpenType text shaping through a cross-platform
implementor of a text-layout engine. C API that accepts sequences of Unicode input text. Currently,
the following OpenType shaping models are supported:
</para>
<itemizedlist>
<listitem>
<para>
Indic (covering Devanagari, Bengali, Gujarati,
Gurmukhi, Kannada, Malayalam, Oriya, Tamil, Telugu, and
Sinhala)
</para>
</listitem>
<listitem>
<para>
Arabic (covering Arabic, N'Ko, Syriac, and Mongolian)
</para>
</listitem>
<listitem>
<para>
Thai and Lao
</para>
</listitem>
<listitem>
<para>
Khmer
</para>
</listitem>
<listitem>
<para>
Myanmar
</para>
</listitem>
<listitem>
<para>
Tibetan
</para>
</listitem>
<listitem>
<para>
Hangul
</para>
</listitem>
<listitem>
<para>
Hebrew
</para>
</listitem>
<listitem>
<para>
The Universal Shaping Engine or <emphasis>USE</emphasis>
(covering complex scripts not covered by the above shaping
models)
</para>
</listitem>
<listitem>
<para>
A default shaping model for non-complex scripts
(covering Latin, Cyrillic, Greek, Armenian, Georgian, Tifinagh,
and many others)
</para>
</listitem>
<listitem>
<para>
Emoji (including emoji modifier sequences, flag sequences,
and ZWJ sequences)
</para>
</listitem>
</itemizedlist>
<para>
In addition to OpenType shaping, HarfBuzz supports the latest
version of Graphite shaping. HarfBuzz currently supports AAT
shaping only on macOS and iOS systems, and in a pass-through
fashion: HarfBuzz hands off AAT support to the system CoreText
library. However, full, built-in AAT support within HarfBuzz is
under development.
</para>
<para>
HarfBuzz can read and understand TrueType fonts (.ttf), TrueType
collections (.ttc), and OpenType fonts (.otf, including those
fonts that contain TrueType-style outlines and those that
contain PostScript CFF or CFF2 outlines).
</para>
<para>
HarfBuzz can run on top of the FreeType, CoreText, DirectWrite,
or Uniscribe font renderers.
</para>
<para>
In addition to its core shaping functionality, HarfBuzz provides
functions for accessing other font features, including optional
GSUB and GPOS OpenType features, as well as
all color-font formats (<literal>CBDT</literal>,
<literal>sbix</literal>, <literal>COLR/CPAL</literal>, and
<literal>SVG-OT</literal>) and OpenType variable fonts. HarfBuzz
also includes a font-subsetting feature.
</para>
<para>
HarfBuzz can perform some low-level math-shaping operations,
although it does not currently perform full shaping for
mathematical typesetting.
</para>
<para>
A suite of command-line utilities is also provided in the
source-code tree, designed to help users test and debug
HarfBuzz's features on real-world fonts and input.
</para> </para>
</section> </section>
<section id="what-harfbuzz-doesnt-do"> <section id="what-harfbuzz-doesnt-do">
<title>What HarfBuzz doesn't do</title> <title>What HarfBuzz doesn't do</title>
<para> <para>
HarfBuzz will take a Unicode string, shape it, and give you the HarfBuzz will take a Unicode string, shape it, and give you the
@ -223,7 +346,7 @@
extent of HarfBuzz's responsibility. extent of HarfBuzz's responsibility.
</para> </para>
<para> <para>
It is important to note that if you are implementing a It is important to note that if you are implementing a complete
text-layout engine you may have other responsibilities that text-layout engine you may have other responsibilities that
HarfBuzz will <emphasis>not</emphasis> help you with. For example: HarfBuzz will <emphasis>not</emphasis> help you with. For example:
</para> </para>
@ -239,13 +362,13 @@
sequence: sequence:
</para> </para>
<programlisting> <programlisting>
A B C [space] ג ב א [space] D E F A B C [space] ג ב א [space] D E F
</programlisting> </programlisting>
<para> <para>
but will expect to see in the output: but will expect to see in the output:
</para> </para>
<programlisting> <programlisting>
ABC אבג DEF ABC אבג DEF
</programlisting> </programlisting>
<para> <para>
This reordering is called <emphasis>bidi processing</emphasis> This reordering is called <emphasis>bidi processing</emphasis>
@ -253,8 +376,9 @@ ABC אבג DEF
algorithm as an annex to the Unicode Standard which tells you how algorithm as an annex to the Unicode Standard which tells you how
to reorder a string from logical order into presentation order. to reorder a string from logical order into presentation order.
Before sending your string to HarfBuzz, you may need to apply the Before sending your string to HarfBuzz, you may need to apply the
bidi algorithm to it. Libraries such as ICU and fribidi can do bidi algorithm to it. Libraries such as <ulink
this for you. url="http://icu-project.org/">ICU</ulink> and <ulink
url="http://fribidi.org/">fribidi</ulink> can do this for you.
</para> </para>
</listitem> </listitem>
<listitem> <listitem>
@ -312,9 +436,9 @@ ABC אבג DEF
project (and you will see references to the FreeType authors project (and you will see references to the FreeType authors
within the source code copyright declarations), but was then within the source code copyright declarations), but was then
extracted out to its own project. This project is maintained by extracted out to its own project. This project is maintained by
Behdad Esfahbod, and named HarfBuzz. Originally, it was a shaping Behdad Esfahbod, who named it HarfBuzz. Originally, it was a
engine for OpenType fonts&mdash;&quot;HarfBuzz&quot; is the Persian shaping engine for OpenType fonts&mdash;&quot;HarfBuzz&quot; is
for &quot;open type&quot;. the Persian for &quot;open type&quot;.
</para> </para>
</section> </section>
</chapter> </chapter>
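
The two ideas the revised text leans on most, that shaping turns codepoints into glyph IDs and that ligature use depends on the font and on a caller-controlled setting rather than on the input text, can be sketched with a toy model. This is only an illustration under stated assumptions: the tables, glyph IDs, and function names below (TOY_CMAP, TOY_LIGATURES, toy_shape) are hypothetical, are not read from any real font, and are not part of the HarfBuzz API.

```python
# Toy illustration only. The tables and glyph IDs are made up for this
# sketch; a real shaper reads cmap and GSUB data from the font file.

# Hypothetical cmap: Unicode codepoint -> glyph ID.
TOY_CMAP = {ord("a"): 66, ord("f"): 71, ord("i"): 74}

# Hypothetical GSUB-style ligature rule: glyph pair -> single ligature glyph.
TOY_LIGATURES = {(71, 74): 301}

# Glyph ID 0 is conventionally the "missing glyph" in OpenType fonts.
NOTDEF = 0


def map_codepoints(text):
    """Map each codepoint to a glyph ID, falling back to the missing glyph."""
    return [TOY_CMAP.get(ord(ch), NOTDEF) for ch in text]


def apply_ligatures(glyphs, enabled=True):
    """Replace known glyph pairs with their ligature glyph, left to right."""
    if not enabled:
        return list(glyphs)
    out = []
    i = 0
    while i < len(glyphs):
        pair = tuple(glyphs[i:i + 2])
        if pair in TOY_LIGATURES:
            out.append(TOY_LIGATURES[pair])
            i += 2  # both input glyphs were consumed by the ligature
        else:
            out.append(glyphs[i])
            i += 1
    return out


def toy_shape(text, ligatures=True):
    """Codepoints in, glyph IDs out; ligature use is the caller's choice."""
    return apply_ligatures(map_codepoints(text), enabled=ligatures)


if __name__ == "__main__":
    print(toy_shape("fi"))                   # ligature applied
    print(toy_shape("fi", ligatures=False))  # same text, no ligature
```

The same input string yields different glyph sequences depending on the toy font tables and on the ligatures flag, which is the point the manual makes about "f, i" sequences. Positioning (the GPOS side of shaping) is deliberately left out of this sketch.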