Merge pull request #1691 from n8willis/usermanual-shaping
Usermanual: Add new chapters.
This commit is contained in:
commit
e7ed85de95
|
@ -79,9 +79,9 @@ content_files= \
|
|||
usermanual-object-model.xml \
|
||||
usermanual-buffers-language-script-and-direction.xml \
|
||||
usermanual-fonts-and-faces.xml \
|
||||
usermanual-clusters.xml \
|
||||
usermanual-opentype-features.xml \
|
||||
usermanual-glyph-information.xml \
|
||||
usermanual-clusters.xml \
|
||||
usermanual-utilities.xml \
|
||||
version.xml
|
||||
|
||||
# SGML files where gtk-doc abbrevations (#GtkWidget) are expanded
|
||||
|
|
|
@ -36,9 +36,9 @@
|
|||
<xi:include href="usermanual-object-model.xml"/>
|
||||
<xi:include href="usermanual-buffers-language-script-and-direction.xml"/>
|
||||
<xi:include href="usermanual-fonts-and-faces.xml"/>
|
||||
<xi:include href="usermanual-clusters.xml"/>
|
||||
<xi:include href="usermanual-opentype-features.xml"/>
|
||||
<xi:include href="usermanual-glyph-information.xml"/>
|
||||
<xi:include href="usermanual-clusters.xml"/>
|
||||
<xi:include href="usermanual-utilities.xml"/>
|
||||
</part>
|
||||
|
||||
<part>
|
||||
|
|
|
@ -7,29 +7,37 @@
|
|||
<chapter id="buffers-language-script-and-direction">
|
||||
<title>Buffers, language, script and direction</title>
|
||||
<para>
|
||||
The input to HarfBuzz is a series of Unicode characters, stored in a
|
||||
The input to the HarfBuzz shaper is a series of Unicode characters, stored in a
|
||||
buffer. In this chapter, we'll look at how to set up a buffer with
|
||||
the text that we want and then customize the properties of the
|
||||
buffer.
|
||||
the text that we want and how to customize the properties of the
|
||||
buffer. We'll also look at a piece of lower-level machinery that
|
||||
you will need to understand before proceeding: the functions that
|
||||
HarfBuzz uses to retrieve Unicode information.
|
||||
</para>
|
||||
<para>
|
||||
After shaping is complete, HarfBuzz puts its output back
|
||||
into the buffer. But getting that output requires setting up a
|
||||
face and a font first, so we will look at that in the next chapter
|
||||
instead of here.
|
||||
</para>
|
||||
<section id="creating-and-destroying-buffers">
|
||||
<title>Creating and destroying buffers</title>
|
||||
<para>
|
||||
As we saw in our <emphasis>Getting Started</emphasis> example, a
|
||||
buffer is created and
|
||||
initialized with <literal>hb_buffer_create()</literal>. This
|
||||
initialized with <function>hb_buffer_create()</function>. This
|
||||
produces a new, empty buffer object, instantiated with some
|
||||
default values and ready to accept your Unicode strings.
|
||||
</para>
|
||||
<para>
|
||||
HarfBuzz manages the memory of objects (such as buffers) that it
|
||||
creates, so you don't have to. When you have finished working on
|
||||
a buffer, you can call <literal>hb_buffer_destroy()</literal>:
|
||||
a buffer, you can call <function>hb_buffer_destroy()</function>:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_buffer_t *buffer = hb_buffer_create();
|
||||
hb_buffer_t *buf = hb_buffer_create();
|
||||
...
|
||||
hb_buffer_destroy(buffer);
|
||||
hb_buffer_destroy(buf);
|
||||
</programlisting>
|
||||
<para>
|
||||
This will destroy the object and free its associated memory -
|
||||
|
@ -39,46 +47,364 @@
|
|||
else destroying it, you should increase its reference count:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
void somefunc(hb_buffer_t *buffer) {
|
||||
buffer = hb_buffer_reference(buffer);
|
||||
void somefunc(hb_buffer_t *buf) {
|
||||
buf = hb_buffer_reference(buf);
|
||||
...
|
||||
</programlisting>
|
||||
<para>
|
||||
And then decrease it once you're done with it:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_buffer_destroy(buffer);
|
||||
hb_buffer_destroy(buf);
|
||||
}
|
||||
</programlisting>
|
||||
<para>
|
||||
While we are on the subject of reference-counting buffers, it is
|
||||
worth noting that an individual buffer can only meaningfully be
|
||||
used by one thread at a time.
|
||||
</para>
|
||||
<para>
|
||||
To throw away all the data in your buffer and start from scratch,
|
||||
call <literal>hb_buffer_reset(buffer)</literal>. If you want to
|
||||
call <function>hb_buffer_reset(buf)</function>. If you want to
|
||||
throw away the string in the buffer but keep the options, you can
|
||||
instead call <literal>hb_buffer_clear_contents(buffer)</literal>.
|
||||
instead call <function>hb_buffer_clear_contents(buf)</function>.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
<section id="adding-text-to-the-buffer">
|
||||
<title>Adding text to the buffer</title>
|
||||
<para>
|
||||
Now we have a brand new HarfBuzz buffer. Let's start filling it
|
||||
with text! From HarfBuzz's perspective, a buffer is just a stream
|
||||
of Unicode code points, but your input string is probably in one of
|
||||
the standard Unicode character encodings (UTF-8, UTF-16, UTF-32)
|
||||
the standard Unicode character encodings (UTF-8, UTF-16, or
|
||||
UTF-32). HarfBuzz provides convenience functions that accept
|
||||
each of these encodings:
|
||||
<function>hb_buffer_add_utf8()</function>,
|
||||
<function>hb_buffer_add_utf16()</function>, and
|
||||
<function>hb_buffer_add_utf32()</function>. Other than the
|
||||
character encoding they accept, they function identically.
|
||||
</para>
|
||||
<para>
|
||||
You can add UTF-8 text to a buffer by passing in the text array,
|
||||
the array's length, an offset into the array for the first
|
||||
character to add, and the length of the segment to add:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_buffer_add_utf8 (hb_buffer_t *buf,
|
||||
const char *text,
|
||||
int text_length,
|
||||
unsigned int item_offset,
|
||||
int item_length)
|
||||
</programlisting>
|
||||
<para>
|
||||
So, in practice, you can say:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_buffer_add_utf8(buf, text, strlen(text), 0, strlen(text));
|
||||
</programlisting>
|
||||
<para>
|
||||
This will append your new characters to
|
||||
<parameter>buf</parameter>, not replace its existing
|
||||
contents. Also, note that you can use <literal>-1</literal> in
|
||||
place of the first instance of <function>strlen(text)</function>
|
||||
if your text array is NULL-terminated. Similarly, you can also use
|
||||
<literal>-1</literal> as the final argument want to add its full
|
||||
contents.
|
||||
</para>
|
||||
<para>
|
||||
Whatever start <parameter>item_offset</parameter> and
|
||||
<parameter>item_length</parameter> you provide, HarfBuzz will also
|
||||
attempt to grab the five characters <emphasis>before</emphasis>
|
||||
the offset point and the five characters
|
||||
<emphasis>after</emphasis> the designated end. These are the
|
||||
before and after "context" segments, which are used internally
|
||||
for HarfBuzz to make shaping decisions. They will not be part of
|
||||
the final output, but they ensure that HarfBuzz's
|
||||
script-specific shaping operations are correct. If there are
|
||||
fewer than five characters available for the before or after
|
||||
contexts, HarfBuzz will just grab what is there.
|
||||
</para>
|
||||
<para>
|
||||
For longer text runs, such as full paragraphs, it might be
|
||||
tempting to only add smaller sub-segments to a buffer and
|
||||
shape them in piecemeal fashion. Generally, this is not a good
|
||||
idea, however, because a lot of shaping decisions are
|
||||
dependent on this context information. For example, in Arabic
|
||||
and other connected scripts, HarfBuzz needs to know the code
|
||||
points before and after each character in order to correctly
|
||||
determine which glyph to return.
|
||||
</para>
|
||||
<para>
|
||||
The safest approach is to add all of the text available, then
|
||||
use <parameter>item_offset</parameter> and
|
||||
<parameter>item_length</parameter> to indicate which characters you
|
||||
want shaped, so that HarfBuzz has access to any context.
|
||||
</para>
|
||||
<para>
|
||||
You can also add Unicode code points directly with
|
||||
<function>hb_buffer_add_codepoints()</function>. The arguments
|
||||
to this function are the same as those for the UTF
|
||||
encodings. But it is particularly important to note that
|
||||
HarfBuzz does not do validity checking on the text that is added
|
||||
to a buffer. Invalid code points will be replaced, but it is up
|
||||
to you to do any deep-sanity checking necessary.
|
||||
</para>
|
||||
|
||||
</section>
|
||||
|
||||
<section id="setting-buffer-properties">
|
||||
<title>Setting buffer properties</title>
|
||||
<para>
|
||||
Buffers containing input characters still need several
|
||||
properties set before HarfBuzz can shape their text correctly.
|
||||
</para>
|
||||
</section>
|
||||
<section id="what-about-the-other-scripts">
|
||||
<title>What about the other scripts?</title>
|
||||
<para>
|
||||
Initially, all buffers are set to the
|
||||
<literal>HB_BUFFER_CONTENT_TYPE_INVALID</literal> content
|
||||
type. After adding text, the buffer should be set to
|
||||
<literal>HB_BUFFER_CONTENT_TYPE_UNICODE</literal> instead, which
|
||||
indicates that it contains un-shaped input
|
||||
characters. After shaping, the buffer will have the
|
||||
<literal>HB_BUFFER_CONTENT_TYPE_GLYPHS</literal> content type.
|
||||
</para>
|
||||
<para>
|
||||
<function>hb_buffer_add_utf8()</function> and the
|
||||
other UTF functions set the content type of their buffer
|
||||
automatically. But if you are reusing a buffer you may want to
|
||||
check its state with
|
||||
<function>hb_buffer_get_content_type(buffer)</function>. If
|
||||
necessary you can set the content type with
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_buffer_set_content_type(buf, HB_BUFFER_CONTENT_TYPE_UNICODE);
|
||||
</programlisting>
|
||||
<para>
|
||||
to prepare for shaping.
|
||||
</para>
|
||||
<para>
|
||||
Buffers also need to carry information about the script,
|
||||
language, and text direction of their contents. You can set
|
||||
these properties individually:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_buffer_set_direction(buf, HB_DIRECTION_LTR);
|
||||
hb_buffer_set_script(buf, HB_SCRIPT_LATIN);
|
||||
hb_buffer_set_language(buf, hb_language_from_string("en", -1));
|
||||
</programlisting>
|
||||
<para>
|
||||
However, since these properties are often the repeated for
|
||||
multiple text runs, you can also save them in a
|
||||
<literal>hb_segment_properties_t</literal> for reuse:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_segment_properties_t *savedprops;
|
||||
hb_buffer_get_segment_properties (buf, savedprops);
|
||||
...
|
||||
hb_buffer_set_segment_properties (buf2, savedprops);
|
||||
</programlisting>
|
||||
<para>
|
||||
HarfBuzz also provides getter functions to retrieve a buffer's
|
||||
direction, script, and language properties individually.
|
||||
</para>
|
||||
<para>
|
||||
HarfBuzz recognizes four text directions in
|
||||
<type>hb_direction_t</type>: left-to-right
|
||||
(<literal>HB_DIRECTION_LTR</literal>), right-to-left (<literal>HB_DIRECTION_RTL</literal>),
|
||||
top-to-bottom (<literal>HB_DIRECTION_TTB</literal>), and
|
||||
bottom-to-top (<literal>HB_DIRECTION_BTT</literal>). For the
|
||||
script property, HarfBuzz uses identifiers based on the
|
||||
<ulink
|
||||
url="https://unicode.org/iso15924/">ISO 15924
|
||||
standard</ulink>. For languages, HarfBuzz uses tags based on the
|
||||
<ulink url="https://tools.ietf.org/html/bcp47">IETF BCP 47</ulink> standard.
|
||||
</para>
|
||||
<para>
|
||||
Helper functions are provided to convert character strings into
|
||||
the necessary script and language tag types.
|
||||
</para>
|
||||
<para>
|
||||
Two additional buffer properties to be aware of are the
|
||||
"invisible glyph" and the replacement code point. The
|
||||
replacement code point is inserted into buffer output in place of
|
||||
any invalid code points encountered in the input. By default, it
|
||||
is the Unicode <literal>REPLACEMENT CHARACTER</literal> code
|
||||
point, <literal>U+FFFD</literal> "�". You can change this with
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_buffer_set_replacement_codepoint(buf, replacement);
|
||||
</programlisting>
|
||||
<para>
|
||||
passing in the replacement Unicode code point as the
|
||||
<parameter>replacement</parameter> parameter.
|
||||
</para>
|
||||
<para>
|
||||
The invisible glyph is used to replace all output glyphs that
|
||||
are invisible. By default, the standard space character
|
||||
<literal>U+0020</literal> is used; you can replace this (for
|
||||
example, when using a font that provides script-specific
|
||||
spaces) with
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_buffer_set_invisible_glyph(buf, replacement_glyph);
|
||||
</programlisting>
|
||||
<para>
|
||||
Do note that in the <parameter>replacement_glyph</parameter>
|
||||
parameter, you must provide the glyph ID of the replacement you
|
||||
wish to use, not the Unicode code point.
|
||||
</para>
|
||||
<para>
|
||||
HarfBuzz supports a few additional flags you might want to set
|
||||
on your buffer under certain circumstances. The
|
||||
<literal>HB_BUFFER_FLAG_BOT</literal> and
|
||||
<literal>HB_BUFFER_FLAG_EOT</literal> flags tell HarfBuzz
|
||||
that the buffer represents the beginning or end (respectively)
|
||||
of a text element (such as a paragraph or other block). Knowing
|
||||
this allows HarfBuzz to apply certain contextual font features
|
||||
when shaping, such as initial or final variants in connected
|
||||
scripts.
|
||||
</para>
|
||||
<para>
|
||||
<literal>HB_BUFFER_FLAG_PRESERVE_DEFAULT_IGNORABLES</literal>
|
||||
tells HarfBuzz not to hide glyphs with the
|
||||
<literal>Default_Ignorable</literal> property in Unicode. This
|
||||
property designates control characters and other non-printing
|
||||
code points, such as joiners and variation selectors. Normally
|
||||
HarfBuzz replaces them in the output buffer with zero-width
|
||||
space glyphs (using the "invisible glyph" property discussed
|
||||
above); setting this flag causes them to be printed, which can
|
||||
be helpful for troubleshooting.
|
||||
</para>
|
||||
<para>
|
||||
Conversely, setting the
|
||||
<literal>HB_BUFFER_FLAG_REMOVE_DEFAULT_IGNORABLES</literal> flag
|
||||
tells HarfBuzz to remove <literal>Default_Ignorable</literal>
|
||||
glyphs from the output buffer entirely. Finally, setting the
|
||||
<literal>HB_BUFFER_FLAG_DO_NOT_INSERT_DOTTED_CIRCLE</literal>
|
||||
flag tells HarfBuzz not to insert the dotted-circle glyph
|
||||
(<literal>U+25CC</literal>, "◌"), which is normally
|
||||
inserted into buffer output when broken character sequences are
|
||||
encountered (such as combining marks that are not attached to a
|
||||
base character).
|
||||
</para>
|
||||
</section>
|
||||
|
||||
<section id="customizing-unicode-functions">
|
||||
<title>Customizing Unicode functions</title>
|
||||
<para>
|
||||
HarfBuzz requires some simple functions for accessing
|
||||
information from the Unicode Character Database (such as the
|
||||
<literal>General_Category</literal> (gc) and
|
||||
<literal>Script</literal> (sc) properties) that is useful
|
||||
for shaping, as well as some useful operations like composing and
|
||||
decomposing code points.
|
||||
</para>
|
||||
<para>
|
||||
HarfBuzz includes its own internal, lightweight set of Unicode
|
||||
functions. At build time, it is also possible to compile support
|
||||
for some other options, such as the Unicode functions provided
|
||||
by GLib or the International Components for Unicode (ICU)
|
||||
library. Generally, this option is only of interest for client
|
||||
programs that have specific integration requirements or that do
|
||||
a significant amount of customization.
|
||||
</para>
|
||||
<para>
|
||||
If your program has access to other Unicode functions, however,
|
||||
such as through a system library or application framework, you
|
||||
might prefer to use those instead of the built-in
|
||||
options. HarfBuzz supports this by implementing its Unicode
|
||||
functions as a set of virtual methods that you can replace —
|
||||
without otherwise affecting HarfBuzz's functionality.
|
||||
</para>
|
||||
<para>
|
||||
The Unicode functions are specified in a structure called
|
||||
<literal>unicode_funcs</literal> which is attached to each
|
||||
buffer. But even though <literal>unicode_funcs</literal> is
|
||||
associated with a <type>hb_buffer_t</type>, the functions
|
||||
themselves are called by other HarfBuzz APIs that access
|
||||
buffers, so it would be unwise for you to hook different
|
||||
functions into different buffers.
|
||||
</para>
|
||||
<para>
|
||||
In addition, you can mark your <literal>unicode_funcs</literal>
|
||||
as immutable by calling
|
||||
<function>hb_unicode_funcs_make_immutable (ufuncs)</function>.
|
||||
This is especially useful if your code is a
|
||||
library or framework that will have its own client programs. By
|
||||
marking your Unicode function choices as immutable, you prevent
|
||||
your own client programs from changing the
|
||||
<literal>unicode_funcs</literal> configuration and introducing
|
||||
inconsistencies and errors downstream.
|
||||
</para>
|
||||
<para>
|
||||
You can retrieve the Unicode-functions configuration for
|
||||
your buffer by calling <function>hb_buffer_get_unicode_funcs()</function>:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_unicode_funcs_t *ufunctions;
|
||||
ufunctions = hb_buffer_get_unicode_funcs(buf);
|
||||
</programlisting>
|
||||
<para>
|
||||
The current version of <literal>unicode_funcs</literal> uses six functions:
|
||||
</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_unicode_combining_class_func_t</function>:
|
||||
returns the Canonical Combining Class of a code point.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_unicode_general_category_func_t</function>:
|
||||
returns the General Category (gc) of a code point.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_unicode_mirroring_func_t</function>: returns
|
||||
the Mirroring Glyph code point (for bi-directional
|
||||
replacement) of a code point.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_unicode_script_func_t</function>: returns the
|
||||
Script (sc) property of a code point.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_unicode_compose_func_t</function>: returns the
|
||||
canonical composition of a sequence of two code points.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_unicode_decompose_func_t</function>: returns
|
||||
the canonical decomposition of a code point.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
<para>
|
||||
Note, however, that future HarfBuzz releases may alter this set.
|
||||
</para>
|
||||
<para>
|
||||
Each Unicode function has a corresponding setter, with which you
|
||||
can assign a callback to your replacement function. For example,
|
||||
to replace
|
||||
<function>hb_unicode_general_category_func_t</function>, you can call
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_unicode_funcs_set_general_category_func (*ufuncs, func, *user_data, destroy)
|
||||
</programlisting>
|
||||
<para>
|
||||
Virtualizing this set of Unicode functions is primarily intended
|
||||
to improve portability. There is no need for every client
|
||||
program to make the effort to replace the default options, so if
|
||||
you are unsure, do not feel any pressure to customize
|
||||
<literal>unicode_funcs</literal>.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
</chapter>
|
||||
|
|
|
@ -5,20 +5,449 @@
|
|||
<!ENTITY version SYSTEM "version.xml">
|
||||
]>
|
||||
<chapter id="fonts-and-faces">
|
||||
<title>Fonts and faces</title>
|
||||
<section id="using-freetype">
|
||||
<title>Fonts, faces, and output</title>
|
||||
<para>
|
||||
In the previous chapter, we saw how to set up a buffer and fill
|
||||
it with text as Unicode code points. In order to shape this
|
||||
buffer text with HarfBuzz, you will need also need a font
|
||||
object.
|
||||
</para>
|
||||
<para>
|
||||
HarfBuzz provides abstractions to help you cache and reuse the
|
||||
heavier parts of working with binary fonts, so we will look at
|
||||
how to do that. We will also look at how to work with the
|
||||
FreeType font-rendering library and at how you can customize
|
||||
HarfBuzz to work with other libraries.
|
||||
</para>
|
||||
<para>
|
||||
Finally, we will look at how to work with OpenType variable
|
||||
fonts, the latest update to the OpenType font format, and at
|
||||
some other recent additions to OpenType.
|
||||
</para>
|
||||
|
||||
<section id="fonts-and-faces-objects">
|
||||
<title>Font and face objects</title>
|
||||
<para>
|
||||
The outcome of shaping a run of text depends on the contents of
|
||||
a specific font file (such as the substitutions and positioning
|
||||
moves in the 'GSUB' and 'GPOS' tables), so HarfBuzz makes
|
||||
accessing those internals fast.
|
||||
</para>
|
||||
<para>
|
||||
An <type>hb_face_t</type> represents a <emphasis>face</emphasis>
|
||||
in HarfBuzz. This data type is a wrapper around an
|
||||
<type>hb_blob_t</type> blob that holds the contents of a binary
|
||||
fotn file. Since HarfBuzz supports TrueType Collections and
|
||||
OpenType Collections (each of which can include multiple
|
||||
typefaces), a HarfBuzz face also requires an index number
|
||||
specifying which typeface in the file you want to use. Most of
|
||||
the font files you will encounter in the wild include just a
|
||||
single face, however, so most of the time you would pass in
|
||||
<literal>0</literal> as the index when you create a face:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_blob_t* blob = hb_blob_create_from_file(file);
|
||||
...
|
||||
hb_face_t* face = hb_face_create(blob, 0);
|
||||
</programlisting>
|
||||
<para>
|
||||
On its own, a face object is not quite ready to use for
|
||||
shaping. The typeface must be set to a specific point size in
|
||||
order for some details (such as hinting) to work. In addition,
|
||||
if the font file in question is an OpenType Variable Font, then
|
||||
you may need to specify one or variation-axis settings (or a
|
||||
named instance) in order to get the output you need.
|
||||
</para>
|
||||
<para>
|
||||
In HarfBuzz, you do this by creating a <emphasis>font</emphasis>
|
||||
object from your face.
|
||||
</para>
|
||||
<para>
|
||||
Font objects also have the advantage of being considerably
|
||||
lighter-weight than face objects (remember that a face contains
|
||||
the contents of a binary font file mapped into memory). As a
|
||||
result, you can cache and reuse a font object, but you could
|
||||
also create a new one for each additional size you needed.
|
||||
Creating new fonts incurs some additional overhead, of course,
|
||||
but whether or not it is excessive is your call in the end. In
|
||||
contrast, face objects are substantially larger, and you really
|
||||
should cache them and reuse them whenever possible.
|
||||
</para>
|
||||
<para>
|
||||
You can create a font object from a face object:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_font_t* hb_font = hb_font_create(hb_face);
|
||||
</programlisting>
|
||||
<para>
|
||||
After creating a font, there are a few properties you should
|
||||
set. Many fonts enable and disable hints based on the size it
|
||||
is used at, so setting this is important for font
|
||||
objects. <function>hb_font_set_ppem(font, x_ppem,
|
||||
y_ppem)</function> sets the pixels-per-EM value of the font. You
|
||||
can also set the point size of the font with
|
||||
<function>hb_font_set_ppem(font, ptem)</function>. HarfBuzz uses the
|
||||
industry standard 72 points per inch.
|
||||
</para>
|
||||
<para>
|
||||
HarfBuzz lets you specify the degree subpixel precision you want
|
||||
through a scaling factor. You can set horizontal and
|
||||
vertical scaling factors on the
|
||||
font by calling <function>hb_font_set_scale(font, x_scale,
|
||||
y_scale)</function>.
|
||||
</para>
|
||||
<para>
|
||||
There may be times when you are handed a font object and need to
|
||||
access the face object that it comes from. For that, you can call
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_face = hb_font_get_face(hb_font);
|
||||
</programlisting>
|
||||
<para>
|
||||
You can also create a font object from an existing font object
|
||||
using the <function>hb_font_create_sub_font()</function>
|
||||
function. This creates a child font object that is initiated
|
||||
with the same attributes as its parent; it can be used to
|
||||
quickly set up a new font for the purpose of overriding a specific
|
||||
font-functions method.
|
||||
</para>
|
||||
<para>
|
||||
All face objects and font objects are lifecycle-managed by
|
||||
HarfBuzz. After creating a face, you increase its reference
|
||||
count with <function>hb_face_reference(face)</function> and
|
||||
decrease it with
|
||||
<function>hb_face_destroy(face)</function>. Likewise, you
|
||||
increase the reference count on a font with
|
||||
<function>hb_font_reference(font)</function> and decrease it
|
||||
with <function>hb_font_destroy(font)</function>.
|
||||
</para>
|
||||
<para>
|
||||
You can also attach user data to face objects and font objects.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
<section id="fonts-and-faces-custom-functions">
|
||||
<title>Customizing font functions</title>
|
||||
<para>
|
||||
During shaping, HarfBuzz frequently needs to query font objects
|
||||
to get at the contents and parameters of the glyphs in a font
|
||||
file. It includes a built-in set of functions that is tailored
|
||||
to working with OpenType fonts. However, as was the case with
|
||||
Unicode functions in the buffers chapter, HarfBuzz also wants to
|
||||
make it easy for you to assign a substitute set of font
|
||||
functions if you are developing a program to work with a library
|
||||
or platform that provides its own font functions.
|
||||
</para>
|
||||
<para>
|
||||
Therefore, the HarfBuzz API defines a set of virtual
|
||||
methods for accessing font-object properties, and you can
|
||||
replace the defaults with your own selections without
|
||||
interfering with the shaping process. Each font object in
|
||||
HarfBuzz includes a structure called
|
||||
<literal>font_funcs</literal> that serves as a vtable for the
|
||||
font object. The virtual methods in
|
||||
<literal>font_funcs</literal> are:
|
||||
</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_font_get_font_h_extents_func_t</function>: returns
|
||||
the extents of the font for horizontal text.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_font_get_font_v_extents_func_t</function>: returns
|
||||
the extents of the font for vertical text.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_font_get_nominal_glyph_func_t</function>: returns
|
||||
the font's nominal glyph for a given code point.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_font_get_variation_glyph_func_t</function>: returns
|
||||
the font's glyph for a given code point when it is followed by a
|
||||
given Variation Selector.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_font_get_nominal_glyphs_func_t</function>: returns
|
||||
the font's nominal glyphs for a series of code points.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_font_get_glyph_advance_func_t</function>: returns
|
||||
the advance for a glyph.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_font_get_glyph_h_advance_func_t</function>: returns
|
||||
the advance for a glyph for horizontal text.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_font_get_glyph_v_advance_func_t</function>:returns
|
||||
the advance for a glyph for vertical text.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_font_get_glyph_advances_func_t</function>: returns
|
||||
the advances for a series of glyphs.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_font_get_glyph_h_advances_func_t</function>: returns
|
||||
the advances for a series of glyphs for horizontal text .
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_font_get_glyph_v_advances_func_t</function>: returns
|
||||
the advances for a series of glyphs for vertical text.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_font_get_glyph_origin_func_t</function>: returns
|
||||
the origin coordinates of a glyph.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_font_get_glyph_h_origin_func_t</function>: returns
|
||||
the origin coordinates of a glyph for horizontal text.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_font_get_glyph_v_origin_func_t</function>: returns
|
||||
the origin coordinates of a glyph for vertical text.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_font_get_glyph_extents_func_t</function>: returns
|
||||
the extents for a glyph.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_font_get_glyph_contour_point_func_t</function>:
|
||||
returns the coordinates of a specific contour point from a glyph.
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_font_get_glyph_name_func_t</function>: returns the
|
||||
name of a glyph (from its glyph index).
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<function>hb_font_get_glyph_from_name_func_t</function>: returns
|
||||
the glyph index that corresponds to a given glyph name.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
<para>
|
||||
You can fetch the font-functions configuration for a font object
|
||||
by calling <function>hb_font_get_font_funcs()</function>:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_font_funcs_t *ffunctions;
|
||||
ffunctions = hb_font_get_font_funcs (font);
|
||||
</programlisting>
|
||||
<para>
|
||||
The individual methods can each be replaced with their own setter
|
||||
function, such as
|
||||
<function>hb_font_funcs_set_nominal_glyph_func(*ffunctions,
|
||||
func, *user_data, destroy)</function>.
|
||||
</para>
|
||||
<para>
|
||||
Font-functions structures can be reused for multiple font
|
||||
objects, and can be reference counted with
|
||||
<function>hb_font_funcs_reference()</function> and
|
||||
<function>hb_font_funcs_destroy()</function>. Just like other
|
||||
objects in HarfBuzz, you can set user-data for each
|
||||
font-functions structure and assign a destroy callback for
|
||||
it.
|
||||
</para>
|
||||
<para>
|
||||
You can also mark a font-functions structure as immutable,
|
||||
with <function>hb_font_funcs_make_immutable()</function>. This
|
||||
is especially useful if your code is a library or framework that
|
||||
will have its own client programs. By marking your
|
||||
font-functions structures as immutable, you prevent your client
|
||||
programs from changing the configuration and introducing
|
||||
inconsistencies and errors downstream.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
<section id="fonts-and-faces-native-opentype">
|
||||
<title>Font objects and HarfBuzz's native OpenType implementation</title>
|
||||
<para>
|
||||
By default, whenever HarfBuzz creates a font object, it will
|
||||
configure the font to use a built-in set of font functions that
|
||||
supports contemporary OpenType font internals. If you want to
|
||||
work with OpenType or TrueType fonts, you should be able to use
|
||||
these functions without difficulty.
|
||||
</para>
|
||||
<para>
|
||||
Many of the methods in the font-functions structure deal with
|
||||
the fundamental properties of glyphs that are required for
|
||||
shaping text: extents (the maximums and minimums on each axis),
|
||||
origins (the <literal>(0,0)</literal> coordinate point which
|
||||
glyphs are drawn in reference to), and advances (the amount that
|
||||
the cursor needs to be moved after drawing each glyph, including
|
||||
any empty space for the glyph's side bearings).
|
||||
</para>
|
||||
<para>
|
||||
As you can see in the list of functions, there are separate "horizontal"
|
||||
and "vertical" variants depending on whether the text is set in
|
||||
the horizontal or vertical direction. For some scripts, fonts
|
||||
that are designed to support text set horizontally or vertically (for
|
||||
example, in Japanese) may include metrics for both text
|
||||
directions. When fonts don't include this information, HarfBuzz
|
||||
does its best to transform what the font provides.
|
||||
</para>
|
||||
<para>
|
||||
In addition to the direction-specific functions, HarfBuzz
|
||||
provides some higher-level functions for fetching information
|
||||
like extents and advances for a glyph. If you call
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_font_get_glyph_advance_for_direction(font, direction, extents);
|
||||
</programlisting>
|
||||
<para>
|
||||
then you can provide any <type>hb_direction_t</type> as the
|
||||
<parameter>direction</parameter> parameter, and HarfBuzz will
|
||||
use the correct function variant for the text direction. There
|
||||
are similar higher-level versions of the functions for fetching
|
||||
extents, origin coordinates, and contour-point
|
||||
coordinates. There are also addition and subtraction functions
|
||||
for moving points with respect to the origin.
|
||||
</para>
|
||||
<para>
|
||||
There are also methods for fetching the glyph ID that
|
||||
corresponds to a Unicode code point (possibly when followed by a
|
||||
variation-selector code point), fetching the glyph name from the
|
||||
font, and fetching the glyph ID that corresponds to a glyph name
|
||||
you already have.
|
||||
</para>
|
||||
<para>
|
||||
HarfBuzz also provides functions for converting between glyph
|
||||
names and string
|
||||
variables. <function>hb_font_glyph_to_string(font, glyph, s,
|
||||
size)</function> retrieves the name for the glyph ID
|
||||
<parameter>glyph</parameter> from the font object. It generates a
|
||||
generic name of the form <literal>gidDDD</literal> (where DDD is
|
||||
the glyph index) if there is no name for the glyph in the
|
||||
font. The <function>hb_font_glyph_from_string(font, s, len,
|
||||
glyph)</function> takes an input string <parameter>s</parameter>
|
||||
and looks for a glyph with that name in the font, returning its
|
||||
glyph ID in the <parameter>glyph</parameter>
|
||||
output parameter. It automatically parses
|
||||
<literal>gidDDD</literal> and <literal>uniUUUU</literal> strings.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
|
||||
<!-- Commenting out FreeType integration section-holder for now. May move
|
||||
to the full-blown Integration Chapter. -->
|
||||
|
||||
<!-- <section id="fonts-and-faces-freetype">
|
||||
<title>Using FreeType</title>
|
||||
<para>
|
||||
|
||||
</para>
|
||||
</section>
|
||||
<section id="using-harfbuzzs-native-opentype-implementation">
|
||||
<title>Using HarfBuzz's native OpenType implementation</title>
|
||||
<para>
|
||||
|
||||
</para>
|
||||
</section>
|
||||
<section id="using-your-own-font-functions">
|
||||
<title>Using your own font functions</title>
|
||||
</section> -->
|
||||
|
||||
<section id="fonts-and-faces-variable">
|
||||
<title>Working with OpenType Variable Fonts</title>
|
||||
<para>
|
||||
If you are working with OpenType Variable Fonts, there are a few
|
||||
additional functions you should use to specify the
|
||||
variation-axis settings of your font object. Without doing so,
|
||||
your variable font's font object can still be used, but only at
|
||||
the default setting for every axis (which, of course, is
|
||||
sometimes what you want, but does not cover general usage).
|
||||
</para>
|
||||
<para>
|
||||
HarfBuzz manages variation settings in the
|
||||
<type>hb_variation_t</type> data type, which holds a <property>tag</property> for the
|
||||
variation-axis identifier tag and a <property>value</property> for its
|
||||
setting. You can retrieve the list of variation axes in a font
|
||||
binary from the face object (not from a font object, notably) by
|
||||
calling <function>hb_ot_var_get_axis_count(face)</function> to
|
||||
find the number of axes, then using
|
||||
<function>hb_ot_var_get_axis_infos()</function> to collect the
|
||||
axis structures:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
axes = hb_ot_var_get_axis_count(face);
|
||||
...
|
||||
hb_ot_var_get_axis_infos(face, 0, axes, axes_array);
|
||||
</programlisting>
|
||||
<para>
|
||||
For each axis returned in the array, you can can access the
|
||||
identifier in its <property>tag</property>. HarfBuzz also has
|
||||
tag definitions predefined for the five standard axes specified
|
||||
in OpenType (<literal>ital</literal> for italic,
|
||||
<literal>opsz</literal> for optical size,
|
||||
<literal>slnt</literal> for slant, <literal>wdth</literal> for
|
||||
width, and <literal>wght</literal> for weight). Each axis also
|
||||
has a <property>min_value</property>, a
|
||||
<property>default_value</property>, and a <property>max_value</property>.
|
||||
</para>
|
||||
<para>
|
||||
To set your font object's variation settings, you call the
|
||||
<function>hb_font_set_variations()</function> function with an
|
||||
array of <type>hb_variation_t</type> variation settings. Let's
|
||||
say our font has weight and width axes. We need to specify each
|
||||
of the axes by tag and assign a value on the axis:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
unsigned int variation_count = 2;
|
||||
hb_variation_t variation_data[variation_count];
|
||||
variation_data[0].tag = HB_OT_TAG_VAR_AXIS_WIDTH;
|
||||
variation_data[1].tag = HB_OT_TAG_VAR_AXIS_WEIGHT;
|
||||
variation_data[0].value = 80;
|
||||
variation_data[1].value = 750;
|
||||
...
|
||||
hb_font_set_variations(font, variation_data, variation_count);
|
||||
</programlisting>
|
||||
<para>
|
||||
That should give us a slightly condensed font ("normal" on the
|
||||
<literal>wdth</literal> axis is 100) at a noticeably bolder
|
||||
weight ("regular" is 400 on the <literal>wght</literal> axis).
|
||||
</para>
|
||||
<para>
|
||||
In practice, though, you should always check that the value you
|
||||
want to set on the axis is within the
|
||||
[<property>min_value</property>,<property>max_value</property>]
|
||||
range actually implemented in the font's variation axis. After
|
||||
all, a font might only provide lighter-than-regular weights, and
|
||||
setting a heavier value on the <literal>wght</literal> axis will
|
||||
not change that.
|
||||
</para>
|
||||
<para>
|
||||
Once your variation settings are specified on your font object,
|
||||
however, shaping with a variable font is just like shaping a
|
||||
static font.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
</chapter>
|
||||
|
|
|
@ -6,14 +6,299 @@
|
|||
]>
|
||||
<chapter id="shaping-and-shape-plans">
|
||||
<title>Shaping and shape plans</title>
|
||||
<section id="opentype-features">
|
||||
<para>
|
||||
Once you have your face and font objects configured as desired and
|
||||
your input buffer is filled with the characters you need to shape,
|
||||
all you need to do is call <function>hb_shape()</function>.
|
||||
</para>
|
||||
<para>
|
||||
HarfBuzz will return the shaped version of the text in the same
|
||||
buffer that you provided, but it will be in output mode. At that
|
||||
point, you can iterate through the glyphs in the buffer, drawing
|
||||
each one at the specified position or handing them off to the
|
||||
appropriate graphics library.
|
||||
</para>
|
||||
<para>
|
||||
For the most part, HarfBuzz's shaping step is straightforward from
|
||||
the outside. But that doesn't mean there will never be cases where
|
||||
you want to look under the hood and see what is happening on the
|
||||
inside. HarfBuzz provides facilities for doing that, too.
|
||||
</para>
|
||||
|
||||
<section id="shaping-buffer-output">
|
||||
<title>Shaping and buffer output</title>
|
||||
<para>
|
||||
The <function>hb_shape()</function> function call takes four arguments: the font
|
||||
object to use, the buffer of characters to shape, an array of
|
||||
user-specified features to apply, and the length of that feature
|
||||
array. The feature array can be NULL, so for the sake of
|
||||
simplicity we will start with that case.
|
||||
</para>
|
||||
<para>
|
||||
Internally, HarfBuzz looks at the tables of the font file to
|
||||
determine where glyph classes, substitutions, and positioning
|
||||
are defined, using that information to decide which
|
||||
<emphasis>shaper</emphasis> to use (<literal>ot</literal> for
|
||||
OpenType fonts, <literal>aat</literal> for Apple Advanced
|
||||
Typography fonts, and so on). It also looks at the direction,
|
||||
script, and language properties of the segment to figure out
|
||||
which script-specific shaping model is needed (at least, in
|
||||
shapers that support multiple options).
|
||||
</para>
|
||||
<para>
|
||||
If a font has a GDEF table, then that is used for
|
||||
glyph classes; if not, HarfBuzz will fall back to Unicode
|
||||
categorization by code point. If a font has an AAT "morx" table,
|
||||
then it is used for substitutions; if not, but there is a GSUB
|
||||
table, then the GSUB table is used. If the font has an AAT
|
||||
"kerx" table, then it is used for positioning; if not, but
|
||||
there is a GPOS table, then the GPOS table is used. If neither
|
||||
table is found, but there is a "kern" table, then HarfBuzz will
|
||||
use the "kern" table. If there is no "kerx", no GPOS, and no
|
||||
"kern", HarfBuzz will fall back to positioning marks itself.
|
||||
</para>
|
||||
<para>
|
||||
With a well-behaved OpenType font, you expect GDEF, GSUB, and
|
||||
GPOS tables to all be applied. HarfBuzz implements the
|
||||
script-specific shaping models in internal functions, rather
|
||||
than in the public API.
|
||||
</para>
|
||||
<para>
|
||||
The algorithms
|
||||
used for complex scripts can be quite involved; HarfBuzz tries
|
||||
to be compatible with the OpenType Layout specification
|
||||
and, wherever there is any ambiguity, HarfBuzz attempts to replicate the
|
||||
output of Microsoft's Uniscribe engine. See the <ulink
|
||||
url="https://docs.microsoft.com/en-us/typography/script-development/standard">Microsoft
|
||||
Typography pages</ulink> for more detail.
|
||||
</para>
|
||||
<para>
|
||||
In general, though, all that you need to know is that
|
||||
<function>hb_shape()</function> returns the results of shaping
|
||||
in the same buffer that you provided. The buffer's content type
|
||||
will now be set to
|
||||
<literal>HB_BUFFER_CONTENT_TYPE_GLYPHS</literal>, indicating
|
||||
that it contains shaped output, rather than input text. You can
|
||||
now extract the glyph information and positioning arrays:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_glyph_info_t *glyph_info = hb_buffer_get_glyph_infos(buf, &glyph_count);
|
||||
hb_glyph_position_t *glyph_pos = hb_buffer_get_glyph_positions(buf, &glyph_count);
|
||||
</programlisting>
|
||||
<para>
|
||||
The glyph information array holds a <type>hb_glyph_info_t</type>
|
||||
for each output glyph, which has two fields:
|
||||
<parameter>codepoint</parameter> and
|
||||
<parameter>cluster</parameter>. Whereas, in the input buffer,
|
||||
the <parameter>codepoint</parameter> field contained the Unicode
|
||||
code point, it now contains the glyph ID of the corresponding
|
||||
glyph in the font. The <parameter>cluster</parameter> field is
|
||||
an integer that you can use to help identify when shaping has
|
||||
reordered, split, or combined code points; we will say more
|
||||
about that in the next chapter.
|
||||
</para>
|
||||
<para>
|
||||
The glyph positions array holds a corresponding
|
||||
<type>hb_glyph_position_t</type> for each output glyph,
|
||||
containing four fields: <parameter>x_advance</parameter>,
|
||||
<parameter>y_advance</parameter>,
|
||||
<parameter>x_offset</parameter>, and
|
||||
<parameter>y_offset</parameter>. The advances tell you how far
|
||||
you need to move the drawing point after drawing this glyph,
|
||||
depending on whether you are setting horizontal text (in which
|
||||
case you will have x advances) or vertical text (for which you
|
||||
will have y advances). The x and y offsets tell you where to
|
||||
move to start drawing the glyph; usually you will have both and
|
||||
x and a y offset, regardless of the text direction.
|
||||
</para>
|
||||
<para>
|
||||
Most of the time, you will rely on a font-rendering library or
|
||||
other graphics library to do the actual drawing of glyphs, so
|
||||
you will need to iterate through the glyphs in the buffer and
|
||||
pass the corresponding values off.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
<section id="shaping-opentype-features">
|
||||
<title>OpenType features</title>
|
||||
<para>
|
||||
OpenType features enable fonts to include smart behavior,
|
||||
implemented as "lookup" rules stored in the GSUB and GPOS
|
||||
tables. The OpenType specification defines a long list of
|
||||
standard features that fonts can use for these behaviors; each
|
||||
feature has a four-character reserved name and a well-defined
|
||||
semantic meaning.
|
||||
</para>
|
||||
<para>
|
||||
Some OpenType features are defined for the purpose of supporting
|
||||
complex-script shaping, and are automatically activated, but
|
||||
only when a buffer's script property is set to a script that the
|
||||
feature supports.
|
||||
</para>
|
||||
<para>
|
||||
Other features are more generic and can apply to several (or
|
||||
any) script, and shaping engines are expected to implement
|
||||
them. By default, HarfBuzz activates several of these features
|
||||
on every text run. They include <literal>ccmp</literal>,
|
||||
<literal>locl</literal>, <literal>mark</literal>,
|
||||
<literal>mkmk</literal>, and <literal>rlig</literal>.
|
||||
</para>
|
||||
<para>
|
||||
In addition, if the text direction is horizontal, HarfBuzz
|
||||
also applies the <literal>calt</literal>,
|
||||
<literal>clig</literal>, <literal>curs</literal>,
|
||||
<literal>kern</literal>, <literal>liga</literal>,
|
||||
<literal>rclt</literal>, and <literal>frac</literal> features.
|
||||
</para>
|
||||
<para>
|
||||
If the text direction is vertical, HarfBuzz applies
|
||||
the <literal>vert</literal> feature by default.
|
||||
</para>
|
||||
<para>
|
||||
Still other features are designed to be purely optional and left
|
||||
up to the application or the end user to enable or disable as desired.
|
||||
</para>
|
||||
<para>
|
||||
You can adjust the set of features that HarfBuzz applies to a
|
||||
buffer by supplying an array of <type>hb_feature_t</type>
|
||||
features as the third argument to
|
||||
<function>hb_shape()</function>. For a simple case, let's just
|
||||
enable the <literal>dlig</literal> feature, which turns on any
|
||||
"discretionary" ligatures in the font:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_feature_t userfeatures[1];
|
||||
userfeatures[0].tag = HB_TAG('d','l','i','g');
|
||||
userfeatures[0].value = 1;
|
||||
userfeatures[0].start = HB_FEATURE_GLOBAL_START;
|
||||
userfeatures[0].end = HB_FEATURE_GLOBAL_END;
|
||||
</programlisting>
|
||||
<para>
|
||||
<literal>HB_FEATURE_GLOBAL_END</literal> and
|
||||
<literal>HB_FEATURE_GLOBAL_END</literal> are macros we can use
|
||||
to indicate that the features will be applied to the entire
|
||||
buffer. We could also have used a literal <literal>0</literal>
|
||||
for the start and a <literal>-1</literal> to indicate the end of
|
||||
the buffer (or have selected other start and end positions, if needed).
|
||||
</para>
|
||||
<para>
|
||||
When we pass the <varname>userfeatures</varname> array to
|
||||
<function>hb_shape()</function>, any discretionary ligature
|
||||
substitutions from our font that match the text in our buffer
|
||||
will get performed:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_shape(font, buf, userfeatures, num_features);
|
||||
</programlisting>
|
||||
<para>
|
||||
Just like we enabled the <literal>dlig</literal> feature by
|
||||
setting its <parameter>value</parameter> to
|
||||
<literal>1</literal>, you would disable a feature by setting its
|
||||
<parameter>value</parameter> to <literal>0</literal>. Some
|
||||
features can take other <parameter>value</parameter> settings;
|
||||
be sure you read the full specification of each feature tag to
|
||||
understand what it does and how to control it.
|
||||
</para>
|
||||
</section>
|
||||
<section id="plans-and-caching">
|
||||
|
||||
<section id="shaping-shaper-selection">
|
||||
<title>Shaper selection</title>
|
||||
<para>
|
||||
The basic version of <function>hb_shape()</function> determines
|
||||
its shaping strategy based on examining the capabilities of the
|
||||
font file. OpenType font tables cause HarfBuzz to try the
|
||||
<literal>ot</literal> shaper, while AAT font tables cause HarfBuzz to try the
|
||||
<literal>aat</literal> shaper.
|
||||
</para>
|
||||
<para>
|
||||
In the real world, however, a font might include some unusual
|
||||
mix of tables, or one of the tables might simply be broken for
|
||||
the script you need to shape. So, sometimes, you might not
|
||||
want to rely on HarfBuzz's process for deciding what to do, and
|
||||
just tell <function>hb_shape()</function> what you want it to try.
|
||||
</para>
|
||||
<para>
|
||||
<function>hb_shape_full()</function> is an alternate shaping
|
||||
function that lets you supply a list of shapers for HarfBuzz to
|
||||
try, in order, when shaping your buffer. For example, if you
|
||||
have determined that HarfBuzz's attempts to work around broken
|
||||
tables gives you better results than the AAT shaper itself does,
|
||||
you might move the AAT shaper to the end of your list of
|
||||
preferences and call <function>hb_shape_full()</function>
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
char *shaperprefs[3] = {"ot", "default", "aat"};
|
||||
...
|
||||
hb_shape_full(font, buf, userfeatures, num_features, shaperprefs);
|
||||
</programlisting>
|
||||
<para>
|
||||
to get results you are happier with.
|
||||
</para>
|
||||
<para>
|
||||
You may also want to call
|
||||
<function>hb_shape_list_shapers()</function> to get a list of
|
||||
the shapers that were built at compile time in your copy of HarfBuzz.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
<section id="shaping-plans-and-caching">
|
||||
<title>Plans and caching</title>
|
||||
<para>
|
||||
Internally, HarfBuzz uses a structure called a shape plan to
|
||||
track its decisions about how to shape the contents of a
|
||||
buffer. The <function>hb_shape()</function> function builds up the shape plan by
|
||||
examining segment properties and by inspecting the contents of
|
||||
the font.
|
||||
</para>
|
||||
<para>
|
||||
This process can involve some decision-making and
|
||||
trade-offs — for example, HarfBuzz inspects the GSUB and GPOS
|
||||
lookups for the script and language tags set on the segment
|
||||
properties, but it falls back on the lookups under the
|
||||
<literal>DFLT</literal> tag (and sometimes other common tags)
|
||||
if there are actually no lookups for the tag requested.
|
||||
</para>
|
||||
<para>
|
||||
HarfBuzz also includes some work-arounds for
|
||||
handling well-known older font conventions that do not follow
|
||||
OpenType or Unicode specifications, for buggy system fonts, and for
|
||||
peculiarities of Microsoft Uniscribe. All of that means that a
|
||||
shape plan, while not something that you should edit directly in
|
||||
client code, still might be an object that you want to
|
||||
inspect. Furthermore, if resources are tight, you might want to
|
||||
cache the shape plan that HarfBuzz builds for your buffer and
|
||||
font, so that you do not have to rebuild it for every shaping call.
|
||||
</para>
|
||||
<para>
|
||||
You can create a cacheable shape plan with
|
||||
<function>hb_shape_plan_create_cached(face, props,
|
||||
user_features, num_user_features, shaper_list)</function>, where
|
||||
<parameter>face</parameter> is a face object (not a font object,
|
||||
notably), <parameter>props</parameter> is an
|
||||
<type>hb_segment_properties_t</type>,
|
||||
<parameter>user_features</parameter> is an array of
|
||||
<type>hb_feature_t</type>s (with length
|
||||
<parameter>num_user_features</parameter>), and
|
||||
<parameter>shaper_list</parameter> is a list of shapers to try.
|
||||
</para>
|
||||
<para>
|
||||
Shape plans are objects in HarfBuzz, so there are
|
||||
reference-counting functions and user-data attachment functions
|
||||
you can
|
||||
use. <function>hb_shape_plan_reference(shape_plan)</function>
|
||||
increases the reference count on a shape plan, while
|
||||
<function>hb_shape_plan_destroy(shape_plan)</function> decreases
|
||||
the reference count, destroying the shape plan when the last
|
||||
reference is dropped.
|
||||
</para>
|
||||
<para>
|
||||
You can attach user data to a shaper (with a key) using the
|
||||
<function>hb_shape_plan_set_user_data(shape_plan,key,data,destroy,replace)</function>
|
||||
function, optionally supplying a <function>destroy</function>
|
||||
callback to use. You can then fetch the user data attached to a
|
||||
shape plan with
|
||||
<function>hb_shape_plan_get_user_data(shape_plan, key)</function>.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
</chapter>
|
||||
|
|
|
@ -0,0 +1,244 @@
|
|||
<?xml version="1.0"?>
|
||||
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
|
||||
"http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
|
||||
<!ENTITY % local.common.attrib "xmlns:xi CDATA #FIXED 'http://www.w3.org/2003/XInclude'">
|
||||
<!ENTITY version SYSTEM "version.xml">
|
||||
]>
|
||||
<chapter id="utilities">
|
||||
<title>Utilities</title>
|
||||
<para>
|
||||
HarfBuzz includes several auxiliary components in addition to the
|
||||
main APIs. These include a set of command-line tools, a set of
|
||||
lower-level APIs for common data types that may be of interest to
|
||||
client programs, and an embedded library for working with
|
||||
Unicode Character Database (UCD) data.
|
||||
</para>
|
||||
|
||||
<section id="utilities-command-line-tools">
|
||||
<title>Command-line tools</title>
|
||||
<para>
|
||||
HarfBuzz include three command-line tools:
|
||||
<program>hb-shape</program>, <program>hb-view</program>, and
|
||||
<program>hb-subset</program>. They can be used to examine
|
||||
HarfBuzz's functionality, debug font binaries, or explore the
|
||||
various shaping models and features from a terminal.
|
||||
</para>
|
||||
|
||||
<section id="utilities-command-line-hbshape">
|
||||
<title>hb-shape</title>
|
||||
<para>
|
||||
<emphasis><program>hb-shape</program></emphasis> allows you to run HarfBuzz's
|
||||
<function>hb_shape()</function> function on an input string and
|
||||
to examine the outcome, in human-readable form, as terminal
|
||||
output. <program>hb-shape</program> does
|
||||
<emphasis>not</emphasis> render the results of the shaping call
|
||||
into rendered text (you can use <program>hb-view</program>, below, for
|
||||
that). Instead, it prints out the final glyph indices and
|
||||
positions, taking all shaping operations into account, as if the
|
||||
input string were a HarfBuzz input buffer.
|
||||
</para>
|
||||
<para>
|
||||
You can specify the font to be used for shaping and, with
|
||||
command-line options, you can add various aspects of the
|
||||
internal state to the output that is sent to the terminal. The
|
||||
general format is
|
||||
</para>
|
||||
<programlisting>
|
||||
<command>hb-shape</command> <optional>[OPTIONS]</optional>
|
||||
<parameter>path/to/font/file.ttf</parameter>
|
||||
<parameter>yourinputtext</parameter>
|
||||
</programlisting>
|
||||
<para>
|
||||
The default output format is plain text (although JSON output
|
||||
can be selected instead by specifying the option
|
||||
<optional>--output-format=json</optional>). The default output
|
||||
syntax reports each glyph name (or glyph index if there is no
|
||||
name) followed by its cluster value, its horizontal and vertical
|
||||
position displacement, and its horizontal and vertical advances.
|
||||
</para>
|
||||
<para>
|
||||
Output options exist to skip any of these elements in the
|
||||
output, and to include additional data, such as Unicode
|
||||
code-point values, glyph extents, glyph flags, or interim
|
||||
shaping results.
|
||||
</para>
|
||||
<para>
|
||||
Output can also be redirected to a file, or input read from a
|
||||
file. Additional options enable you to enable or disable
|
||||
specific font features, to set variation-font axis values, to
|
||||
alter the language, script, direction, and clustering settings
|
||||
used, to enable sanity checks, or to change which shaping engine is used.
|
||||
</para>
|
||||
<para>
|
||||
For a complete explanation of the options available, run
|
||||
</para>
|
||||
<programlisting>
|
||||
<command>hb-shape</command> <parameter>--help</parameter>
|
||||
</programlisting>
|
||||
</section>
|
||||
|
||||
<section id="utilities-command-line-hbview">
|
||||
<title>hb-view</title>
|
||||
<para>
|
||||
<emphasis><program>hb-view</program></emphasis> allows you to
|
||||
see the shaped output of an input string in rendered
|
||||
form. Like <program>hb-shape</program>,
|
||||
<program>hb-view</program> takes a font file and a text string
|
||||
as its arguments:
|
||||
</para>
|
||||
<programlisting>
|
||||
<command>hb-view</command> <optional>[OPTIONS]</optional>
|
||||
<parameter>path/to/font/file.ttf</parameter>
|
||||
<parameter>yourinputtext</parameter>
|
||||
</programlisting>
|
||||
<para>
|
||||
By default, <program>hb-view</program> renders the shaped
|
||||
text in ASCII block-character images as terminal output. By
|
||||
appending the
|
||||
<command>--output-file=<optional>filename</optional></command>
|
||||
switch, you can write the output to a PNG, SVG, or PDF file
|
||||
(among other formats).
|
||||
</para>
|
||||
<para>
|
||||
As with <program>hb-shape</program>, a lengthy set of options
|
||||
is available, with which you can enable or disable
|
||||
specific font features, set variation-font axis values,
|
||||
alter the language, script, direction, and clustering settings
|
||||
used, enable sanity checks, or change which shaping engine is
|
||||
used.
|
||||
</para>
|
||||
<para>
|
||||
You can also set the foreground and background colors used for
|
||||
the output, independently control the width of all four
|
||||
margins, alter the line spacing, and annotate the output image
|
||||
with
|
||||
</para>
|
||||
<para>
|
||||
In general, <program>hb-view</program> is a quick way to
|
||||
verify that the output of HarfBuzz's shaping operation looks
|
||||
correct for a given text-and-font combination, but you may
|
||||
want to use <program>hb-shape</program> to figure out exactly
|
||||
why something does not appear as expected.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
<section id="utilities-command-line-hbsubset">
|
||||
<title>hb-subset</title>
|
||||
<para>
|
||||
<emphasis><program>hb-subset</program></emphasis> allows you
|
||||
to generate a subset of a given font, with a limited set of
|
||||
supported characters, features, and variation settings.
|
||||
</para>
|
||||
<para>
|
||||
By default, you provide an input font and an input text string
|
||||
as the arguments to <program>hb-subset</program>, and it will
|
||||
generate a font that covers the input text exactly like the
|
||||
input font does, but includes no other characters or features.
|
||||
</para>
|
||||
<programlisting>
|
||||
<command>hb-subset</command> <optional>[OPTIONS]</optional>
|
||||
<parameter>path/to/font/file.ttf</parameter>
|
||||
<parameter>yourinputtext</parameter>
|
||||
</programlisting>
|
||||
<para>
|
||||
For example, to create a subset of Noto Serif that just includes the
|
||||
numerals and the lowercase Latin alphabet, you could run
|
||||
</para>
|
||||
<programlisting>
|
||||
<command>hb-subset</command> <optional>[OPTIONS]</optional>
|
||||
<parameter>NotoSerif-Regular.ttf</parameter>
|
||||
<parameter>0123456789abcdefghijklmnopqrstuvwxyz</parameter>
|
||||
</programlisting>
|
||||
<para>
|
||||
There are options available to remove hinting from the
|
||||
subsetted font and to specify a list of variation-axis settings.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
</section>
|
||||
|
||||
<section id="utilities-common-types-apis">
|
||||
<title>Common data types and APIs</title>
|
||||
<para>
|
||||
HarfBuzz includes several APIs for working with general-purpose
|
||||
data that you may find convenient to leverage in your own
|
||||
software. They include set operations and integer-to-integer
|
||||
mapping operations.
|
||||
</para>
|
||||
<para>
|
||||
HarfBuzz uses set operations for internal bookkeeping, such as
|
||||
when it collects all of the glyph IDs covered by a particular
|
||||
font feature. You can also use the set API to build sets, add
|
||||
and remove elements, test whether or not sets contain particular
|
||||
elements, or compute the unions, intersections, or differences
|
||||
between sets.
|
||||
</para>
|
||||
<para>
|
||||
All set elements are integers (specifically,
|
||||
<type>hb_codepoint_t</type> 32-bit unsigned ints), and there are
|
||||
functions for fetching the minimum and maximum element from a
|
||||
set. The set API also includes some functions that might not
|
||||
be part of a generic set facility, such as the ability to add a
|
||||
contiguous range of integer elements to a set in bulk, and the
|
||||
ability to fetch the next-smallest or next-largest element.
|
||||
</para>
|
||||
<para>
|
||||
The HarfBuzz set API includes some conveniences as well. All
|
||||
sets are lifecycle-managed, just like other HarfBuzz
|
||||
objects. You increase the reference count on a set with
|
||||
<function>hb_set_reference()</function> and decrease it with
|
||||
<function>hb_set_destroy()</function>. You can also attach
|
||||
user data to a set, just like you can to blobs, buffers, faces,
|
||||
fonts, and other objects, and set destroy callbacks.
|
||||
</para>
|
||||
<para>
|
||||
HarfBuzz also provides an API for keeping track of
|
||||
integer-to-integer mappings. As with the set API, each integer is
|
||||
stored as an unsigned 32-bit <type>hb_codepoint_t</type>
|
||||
element. Maps, like other objects, are reference counted with
|
||||
reference and destroy functions, and you can attach user data to
|
||||
them. The mapping operations include adding and deleting
|
||||
integer-to-integer key:value pairs to the map, testing for the
|
||||
presence of a key, fetching the population of the map, and so on.
|
||||
</para>
|
||||
<para>
|
||||
There are several other internal HarfBuzz facilities that are
|
||||
exposed publicly and which you may want to take advantage of
|
||||
while processing text. HarfBuzz uses a common
|
||||
<type>hb_tag_t</type> for a variety of OpenType tag identifiers (for
|
||||
scripts, languages, font features, table names, variation-axis
|
||||
names, and more), and provides functions for converting strings
|
||||
to tags and vice-versa.
|
||||
</para>
|
||||
<para>
|
||||
Finally, HarfBuzz also includes data type for Booleans, bit
|
||||
masks, and other simple types.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
<section id="utilities-ucdn">
|
||||
<title>UCDN</title>
|
||||
<para>
|
||||
HarfBuzz includes a copy of the <ulink
|
||||
url="https://github.com/grigorig/ucdn">UCDN</ulink> (Unicode
|
||||
Database and Normalization) library, which provides functions
|
||||
for accessing basic Unicode character properties, performing
|
||||
canonical composition, and performing both canonical and
|
||||
compatibility decomposition.
|
||||
</para>
|
||||
<para>
|
||||
Currently, UCDN supports direct queries for several more character
|
||||
properties than HarfBuzz's built-in set of Unicode functions
|
||||
does, such as the BiDirectional Class, East Asian Width, Paired
|
||||
Bracket and Resolved Linebreak properties. If you need to access
|
||||
more properties than HarfBuzz's internal implementation
|
||||
provides, using the built-in UCDN functions may be a useful solution.
|
||||
</para>
|
||||
<para>
|
||||
The built-in UCDN functions are compiled by default when
|
||||
building HarfBuzz from source, but this can be disabled with a
|
||||
compile-time switch.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
</chapter>
|
Loading…
Reference in New Issue