[Docs] Usermanual; add Shaping, Features, and Plans.
This commit is contained in:
parent
8354c99fbe
commit
d0f5a05aef
|
@ -6,14 +6,289 @@
|
|||
]>
|
||||
<chapter id="shaping-and-shape-plans">
|
||||
<title>Shaping and shape plans</title>
|
||||
<section id="opentype-features">
|
||||
<para>
|
||||
Once you have your face and font objects configured as desired and
|
||||
your input buffer is filled with the characters you need to shape,
|
||||
all you need to do is call <function>hb_shape()</function>.
|
||||
</para>
|
||||
<para>
|
||||
HarfBuzz will return the shaped version of the text in the same
|
||||
buffer that you provided, but it will be in output mode. At that
|
||||
point, you can iterate through the glyphs in the buffer, drawing
|
||||
each one at the specified position or handing them off to the
|
||||
appropriate graphics library.
|
||||
</para>
|
||||
<para>
|
||||
For the most part, HarfBuzz's shaping step is straightforward from
|
||||
the outside. But that doesn't mean there will never be cases where
|
||||
you want to look under the hood and see what is happening on the
|
||||
inside. HarfBuzz provides facilities for doing that, too.
|
||||
</para>
|
||||
|
||||
<section id="shaping-buffer-output">
|
||||
<title>Shaping and buffer output</title>
|
||||
<para>
|
||||
The <function>hb_shape()</function> function call takes four arguments: the font
|
||||
object to use, the buffer of characters to shape, an array of
|
||||
user-specified features to apply, and the length of that feature
|
||||
array. The feature array can be NULL, so for the sake of
|
||||
simplicity we will start with that case.
|
||||
</para>
|
||||
<para>
|
||||
Internally, HarfBuzz looks at the tables of the font file to
|
||||
determine where glyph classes, substitutions, and positioning
|
||||
are defined, using that information to decide which
|
||||
<emphasis>shaper</emphasis> to use (<literal>ot</literal> for
|
||||
OpenType fonts, <literal>aat</literal> for Apple Advanced
|
||||
Typography fonts, and so on). It also looks at the direction,
|
||||
script, and language properties of the segment to figure out
|
||||
which script-specific shaping model is needed (at least, in
|
||||
shapers that support multiple options).
|
||||
</para>
|
||||
<para>
|
||||
If a font has a GDEF table, then that is used for
|
||||
glyph classes; if not, HarfBuzz will fall back to Unicode
|
||||
categorization by code point. If a font has an AAT "morx" table,
|
||||
then it is used for substitutions; if not, but there is a GSUB
|
||||
table, then the GSUB table is used. If the font has an AAT
|
||||
"kerx" table, then it is used for positioning; if not, but
|
||||
there is a GPOS table, then the GPOS table is used. If neither
|
||||
table is found, but there is a "kern" table, then HarfBuzz will
|
||||
use the "kern" table. If there is no "kerx", no GPOS, and no
|
||||
"kern", HarfBuzz will fall back to positioning marks itself.
|
||||
</para>
|
||||
<para>
|
||||
With a well-behaved OpenType font, you expect GDEF, GSUB, and
|
||||
GPOS tables to all be applied. HarfBuzz implements the
|
||||
script-specific shaping models in internal functions, rather
|
||||
than in the public API.
|
||||
</para>
|
||||
<para>
|
||||
The algorithms
|
||||
used for complex scripts can be quite involved; HarfBuzz tries
|
||||
to be 100% compatible with the OpenType Layout specification
|
||||
and, wherever there is any ambiguity, HarfBuzz attempts to replicate the
|
||||
output of Microsoft's Uniscribe engine. See the <ulink
|
||||
url="https://docs.microsoft.com/en-us/typography/script-development/standard">Microsoft
|
||||
Typeography pages</ulink> for more detail.
|
||||
</para>
|
||||
<para>
|
||||
In general, though, all that you need to know if that
|
||||
<function>hb-shape()</function> returns the results of shaping
|
||||
in the same buffer that you provided. The buffer's content type
|
||||
will now be set to
|
||||
<literal>HB_BUFFER_CONTENT_TYPE_GLYPHS</literal>, indicating
|
||||
that it contains shaped output, rather than input text. You can
|
||||
now extract the glyph information and positioning arrays:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_glyph_info_t *glyph_info = hb_buffer_get_glyph_infos(buf, &glyph_count);
|
||||
hb_glyph_position_t *glyph_pos = hb_buffer_get_glyph_positions(buf, &glyph_count);
|
||||
</programlisting>
|
||||
<para>
|
||||
The glyph information array holds a <type>hb_glyph_info_t</type>
|
||||
for each output glyph, which has two fields:
|
||||
<parameter>codepoint</parameter> and
|
||||
<parameter>cluster</parameter>. Whereas, in the input buffer,
|
||||
the <parameter>codepoint</parameter> field contained the Unicode
|
||||
code point, it now contains the glyph ID of the corresponding
|
||||
glyph in the font. The <parameter>cluster</parameter> field is
|
||||
an integer that you can use to help identify when shaping has
|
||||
reordered, split, or combined code points; we will say more
|
||||
about that in the next chapter.
|
||||
</para>
|
||||
<para>
|
||||
The glyph positions array holds a corresponding
|
||||
<type>hb_glyph_position_t</type> for each output glyph,
|
||||
containing four fields: <parameter>x_advance</parameter>,
|
||||
<parameter>y_advance</parameter>,
|
||||
<parameter>x_offset</parameter>, and
|
||||
<parameter>y_offset</parameter>. The advances tell you how far
|
||||
you need to move the drawing point after drawing this glyph,
|
||||
depending on whether you are setting horizontal text (in which
|
||||
case you will have x advances) or vertical text (for which you
|
||||
will have y advances). The x and y offsets tell you where to
|
||||
move to start drawing the glyph; usually you will have both and
|
||||
x and a y offset, regardless of the text direction.
|
||||
</para>
|
||||
<para>
|
||||
Most of the time, you will rely on a font-rendering library or
|
||||
other graphics library to do the actual drawing of glyphs, so
|
||||
you will need to iterate through the glyphs in the buffer and
|
||||
pass the corresponding values off.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
<section id="shaping-opentype-features">
|
||||
<title>OpenType features</title>
|
||||
<para>
|
||||
OpenType features enable fonts to include smart behavior,
|
||||
implemented as "lookup" rules stored in the GSUB and GPOS
|
||||
tables. The OpenType specification defines a long list of
|
||||
standard features that fonts can use for these behaviors; each
|
||||
feature has a four-character reserved name and a well-defined
|
||||
semantic meaning.
|
||||
</para>
|
||||
<para>
|
||||
Some OpenType features are defined for the purpose of supporting
|
||||
complex-script shaping, and are automatically activated, but
|
||||
only when a buffer's script property is set to a script that the
|
||||
feature supports.
|
||||
</para>
|
||||
<para>
|
||||
Other features are more generic and can apply to several (or
|
||||
any) script, and shaping engines are expected to implement
|
||||
them. By default, HarfBuzz activates several of these features
|
||||
on every text run. They include <literal>ccmp</literal>,
|
||||
<literal>locl</literal>, <literal>mark</literal>,
|
||||
<literal>mkmk</literal>, and <literal>rlig</literal>.
|
||||
</para>
|
||||
<para>
|
||||
In addition, if the text direction is horizontal, HarfBuzz
|
||||
also applies the <literal>calt</literal>,
|
||||
<literal>clig</literal>, <literal>curs</literal>,
|
||||
<literal>kern</literal>, <literal>liga</literal>,
|
||||
<literal>rclt</literal>, and <literal>frac</literal> features.
|
||||
</para>
|
||||
<para>
|
||||
If the text direction is vertical, HarfBuzz applies
|
||||
the <literal>vert</literal> feature by default.
|
||||
</para>
|
||||
<para>
|
||||
Still other features are designed to be purely optional and left
|
||||
up to the application or the end user to enable or disable as desired.
|
||||
</para>
|
||||
<para>
|
||||
You can adjust the set of features that HarfBuzz applies to a
|
||||
buffer by supplying an array of <type>hb_feature_t</type>
|
||||
features as the third argument to
|
||||
<function>hb_shape()</function>. For a simple case, let's just
|
||||
enable the <literal>dlig</literal> feature, which turns on any
|
||||
"discretionary" ligatures in the font:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
unsigned int num_features = 1;
|
||||
hb_feature_t userfeatures[num_features];
|
||||
userfeatures[0].tag = HB_TAG('d','l','i','g');
|
||||
userfeatures[0].value = 1;
|
||||
userfeatures[0].start = 0;
|
||||
userfeatures[0].end = (unsigned int) -1;
|
||||
</programlisting>
|
||||
<para>
|
||||
When we pass the <varname>userfeatures</varname> array to
|
||||
<function>hb_shape()</function>, any discretionary ligature
|
||||
substitutions from our font that match the text in our buffer
|
||||
will get performed:
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
hb_shape(font, buf, userfeatures, num_features);
|
||||
</programlisting>
|
||||
<para>
|
||||
Just like we enabled the <literal>dlig</literal> feature by
|
||||
setting its <parameter>value</parameter> to
|
||||
<literal>1</literal>, you would disable a feature by setting its
|
||||
<parameter>value</parameter> to <literal>0</literal>.
|
||||
</para>
|
||||
</section>
|
||||
<section id="plans-and-caching">
|
||||
|
||||
<section id="shaping-shaper-selection">
|
||||
<title>Shaper selection</title>
|
||||
<para>
|
||||
The basic version of <function>hb_shape()</function> determines
|
||||
its shaping strategy based on examining the capabilities of the
|
||||
font file. OpenType font tables cause HarfBuzz to try the
|
||||
<literal>ot</literal> shaper, while AAT font tables cause HarfBuzz to try the
|
||||
<literal>aat</literal> shaper.
|
||||
</para>
|
||||
<para>
|
||||
In the real world, however, a font might include some unusual
|
||||
mix of tables, or one of the tables might simply be broken for
|
||||
the script you need to shape. So, sometimes, you might not
|
||||
want to rely on HarfBuzz's process for deciding what to do, and
|
||||
just tell hb-shape what you want it to try.
|
||||
</para>
|
||||
<para>
|
||||
<function>hb_shape_full()</function> is an alternate shaping
|
||||
function that lets you supply a list of shapers for HarfBuzz to
|
||||
try, in order, when shaping your buffer. For example, if you
|
||||
have determined that HarfBuzz's attempts to work around broken
|
||||
tables gives you better results than the AAT shaper itself does,
|
||||
you might move the AAT shaper to the end of your list of
|
||||
preferences and call <function>hb_shape_full()</function>
|
||||
</para>
|
||||
<programlisting language="C">
|
||||
char *shaperprefs[3] = {"ot", "default", "aat"};
|
||||
...
|
||||
hb_shape_full(font, buf, userfeatures, num_features, shaperprefs);
|
||||
</programlisting>
|
||||
<para>
|
||||
to get results you are happier with.
|
||||
</para>
|
||||
<para>
|
||||
You may also want to call
|
||||
<function>hb_shape_list_shapers()</function> to get a list of
|
||||
the shapers that were built at compile time in your copy of HarfBuzz.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
<section id="shaping-plans-and-caching">
|
||||
<title>Plans and caching</title>
|
||||
<para>
|
||||
Internally, HarfBuzz uses a structure called a shape plan to
|
||||
track its decisions about how to shape the contents of a
|
||||
buffer. The hb-shape function builds up the shape plan by
|
||||
examining segment properties and by inspecting the contents of
|
||||
the font.
|
||||
</para>
|
||||
<para>
|
||||
This process can involve some decision-making and
|
||||
trade-offs — for example, HarfBuzz inspects the GSUB and GPOS
|
||||
lookups for the script and language tags set on the segment
|
||||
properties, but it falls back on the lookups under the
|
||||
<literal>DFLT</literal> tag (and sometimes other common tags)
|
||||
if there are actually no lookups for the tag requested.
|
||||
</para>
|
||||
<para>
|
||||
HarfBuzz also includes some work-arounds for
|
||||
handling well-known older font conventions that do not follow
|
||||
OpenType or Unicode specifications, for buggy system fonts, and for
|
||||
peculiarities of Microsoft Uniscribe. All of that means that a
|
||||
shape plan, while not something that you should edit directly in
|
||||
client code, still might be an object that you want to
|
||||
inspect. Furthermore, if resources are tight, you might want to
|
||||
cache the shape plan that HarfBuzz builds for your buffer and
|
||||
font, so that you do not have to rebuild it for every shaping call.
|
||||
</para>
|
||||
<para>
|
||||
You can create a cacheable shape plan with
|
||||
<function>hb_shape_plan_create_cached(face, props,
|
||||
user_features, num_user_features, shaper_list)</function>, where
|
||||
<parameter>face</parameter> is a face object (not a font object,
|
||||
notably), <parameter>props</parameter> is an
|
||||
<type>hb_segment_properties_t</type>,
|
||||
<parameter>user_features</parameter> is an array of
|
||||
<type>hb_feature_t</type>s (with length
|
||||
<parameter>num_user_features</parameter>), and
|
||||
<parameter>shaper_list</parameter> is a list of shapers to try.
|
||||
</para>
|
||||
<para>
|
||||
Shape plans are objects in HarfBuzz, so there are
|
||||
reference-counting functions and user-data attachment functions
|
||||
you can
|
||||
use. <function>hb_shape_plan_reference(shape_plan)</function>
|
||||
increases the reference count on a shape plan, while
|
||||
<function>hb_shape_plan_destroy(shape_plan)</function> decreases
|
||||
the reference count, destroying the shape plan when the last
|
||||
reference is dropped.
|
||||
</para>
|
||||
<para>
|
||||
You can attach user data to a shaper (with a key) using the
|
||||
<function>hb_shape_plan_set_user_data(shape_plan,key,data,destroy,replace)</function>
|
||||
function, optionally supplying a <function>destroy</function>
|
||||
callback to use. You can then fetch the user data attached to a
|
||||
shape plan with
|
||||
<function>hb_shape_plan_get_user_data(shape_plan, key)</function>.
|
||||
</para>
|
||||
</section>
|
||||
|
||||
</chapter>
|
||||
|
|
Loading…
Reference in New Issue