diff --git a/docs/usermanual-opentype-features.xml b/docs/usermanual-opentype-features.xml index 51ff55a77..4405328bf 100644 --- a/docs/usermanual-opentype-features.xml +++ b/docs/usermanual-opentype-features.xml @@ -6,14 +6,289 @@ ]> Shaping and shape plans -
+ + Once you have your face and font objects configured as desired and + your input buffer is filled with the characters you need to shape, + all you need to do is call hb_shape(). + + + HarfBuzz will return the shaped version of the text in the same + buffer that you provided, but it will be in output mode. At that + point, you can iterate through the glyphs in the buffer, drawing + each one at the specified position or handing them off to the + appropriate graphics library. + + + For the most part, HarfBuzz's shaping step is straightforward from + the outside. But that doesn't mean there will never be cases where + you want to look under the hood and see what is happening on the + inside. HarfBuzz provides facilities for doing that, too. + + +
+ Shaping and buffer output + + The hb_shape() function call takes four arguments: the font + object to use, the buffer of characters to shape, an array of + user-specified features to apply, and the length of that feature + array. The feature array can be NULL, so for the sake of + simplicity we will start with that case. + + + Internally, HarfBuzz looks at the tables of the font file to + determine where glyph classes, substitutions, and positioning + are defined, using that information to decide which + shaper to use (ot for + OpenType fonts, aat for Apple Advanced + Typography fonts, and so on). It also looks at the direction, + script, and language properties of the segment to figure out + which script-specific shaping model is needed (at least, in + shapers that support multiple options). + + + If a font has a GDEF table, then that is used for + glyph classes; if not, HarfBuzz will fall back to Unicode + categorization by code point. If a font has an AAT "morx" table, + then it is used for substitutions; if not, but there is a GSUB + table, then the GSUB table is used. If the font has an AAT + "kerx" table, then it is used for positioning; if not, but + there is a GPOS table, then the GPOS table is used. If neither + table is found, but there is a "kern" table, then HarfBuzz will + use the "kern" table. If there is no "kerx", no GPOS, and no + "kern", HarfBuzz will fall back to positioning marks itself. + + + With a well-behaved OpenType font, you expect GDEF, GSUB, and + GPOS tables to all be applied. HarfBuzz implements the + script-specific shaping models in internal functions, rather + than in the public API. + + + The algorithms + used for complex scripts can be quite involved; HarfBuzz tries + to be 100% compatible with the OpenType Layout specification + and, wherever there is any ambiguity, HarfBuzz attempts to replicate the + output of Microsoft's Uniscribe engine. See the Microsoft + Typeography pages for more detail. + + + In general, though, all that you need to know if that + hb-shape() returns the results of shaping + in the same buffer that you provided. The buffer's content type + will now be set to + HB_BUFFER_CONTENT_TYPE_GLYPHS, indicating + that it contains shaped output, rather than input text. You can + now extract the glyph information and positioning arrays: + + + hb_glyph_info_t *glyph_info = hb_buffer_get_glyph_infos(buf, &glyph_count); + hb_glyph_position_t *glyph_pos = hb_buffer_get_glyph_positions(buf, &glyph_count); + + + The glyph information array holds a hb_glyph_info_t + for each output glyph, which has two fields: + codepoint and + cluster. Whereas, in the input buffer, + the codepoint field contained the Unicode + code point, it now contains the glyph ID of the corresponding + glyph in the font. The cluster field is + an integer that you can use to help identify when shaping has + reordered, split, or combined code points; we will say more + about that in the next chapter. + + + The glyph positions array holds a corresponding + hb_glyph_position_t for each output glyph, + containing four fields: x_advance, + y_advance, + x_offset, and + y_offset. The advances tell you how far + you need to move the drawing point after drawing this glyph, + depending on whether you are setting horizontal text (in which + case you will have x advances) or vertical text (for which you + will have y advances). The x and y offsets tell you where to + move to start drawing the glyph; usually you will have both and + x and a y offset, regardless of the text direction. + + + Most of the time, you will rely on a font-rendering library or + other graphics library to do the actual drawing of glyphs, so + you will need to iterate through the glyphs in the buffer and + pass the corresponding values off. + +
+ +
OpenType features + OpenType features enable fonts to include smart behavior, + implemented as "lookup" rules stored in the GSUB and GPOS + tables. The OpenType specification defines a long list of + standard features that fonts can use for these behaviors; each + feature has a four-character reserved name and a well-defined + semantic meaning. + + + Some OpenType features are defined for the purpose of supporting + complex-script shaping, and are automatically activated, but + only when a buffer's script property is set to a script that the + feature supports. + + + Other features are more generic and can apply to several (or + any) script, and shaping engines are expected to implement + them. By default, HarfBuzz activates several of these features + on every text run. They include ccmp, + locl, mark, + mkmk, and rlig. + + + In addition, if the text direction is horizontal, HarfBuzz + also applies the calt, + clig, curs, + kern, liga, + rclt, and frac features. + + + If the text direction is vertical, HarfBuzz applies + the vert feature by default. + + + Still other features are designed to be purely optional and left + up to the application or the end user to enable or disable as desired. + + + You can adjust the set of features that HarfBuzz applies to a + buffer by supplying an array of hb_feature_t + features as the third argument to + hb_shape(). For a simple case, let's just + enable the dlig feature, which turns on any + "discretionary" ligatures in the font: + + + unsigned int num_features = 1; + hb_feature_t userfeatures[num_features]; + userfeatures[0].tag = HB_TAG('d','l','i','g'); + userfeatures[0].value = 1; + userfeatures[0].start = 0; + userfeatures[0].end = (unsigned int) -1; + + + When we pass the userfeatures array to + hb_shape(), any discretionary ligature + substitutions from our font that match the text in our buffer + will get performed: + + + hb_shape(font, buf, userfeatures, num_features); + + + Just like we enabled the dlig feature by + setting its value to + 1, you would disable a feature by setting its + value to 0.
-
+ +
+ Shaper selection + + The basic version of hb_shape() determines + its shaping strategy based on examining the capabilities of the + font file. OpenType font tables cause HarfBuzz to try the + ot shaper, while AAT font tables cause HarfBuzz to try the + aat shaper. + + + In the real world, however, a font might include some unusual + mix of tables, or one of the tables might simply be broken for + the script you need to shape. So, sometimes, you might not + want to rely on HarfBuzz's process for deciding what to do, and + just tell hb-shape what you want it to try. + + + hb_shape_full() is an alternate shaping + function that lets you supply a list of shapers for HarfBuzz to + try, in order, when shaping your buffer. For example, if you + have determined that HarfBuzz's attempts to work around broken + tables gives you better results than the AAT shaper itself does, + you might move the AAT shaper to the end of your list of + preferences and call hb_shape_full() + + + char *shaperprefs[3] = {"ot", "default", "aat"}; + ... + hb_shape_full(font, buf, userfeatures, num_features, shaperprefs); + + + to get results you are happier with. + + + You may also want to call + hb_shape_list_shapers() to get a list of + the shapers that were built at compile time in your copy of HarfBuzz. + +
+ +
Plans and caching + Internally, HarfBuzz uses a structure called a shape plan to + track its decisions about how to shape the contents of a + buffer. The hb-shape function builds up the shape plan by + examining segment properties and by inspecting the contents of + the font. + + + This process can involve some decision-making and + trade-offs — for example, HarfBuzz inspects the GSUB and GPOS + lookups for the script and language tags set on the segment + properties, but it falls back on the lookups under the + DFLT tag (and sometimes other common tags) + if there are actually no lookups for the tag requested. + + + HarfBuzz also includes some work-arounds for + handling well-known older font conventions that do not follow + OpenType or Unicode specifications, for buggy system fonts, and for + peculiarities of Microsoft Uniscribe. All of that means that a + shape plan, while not something that you should edit directly in + client code, still might be an object that you want to + inspect. Furthermore, if resources are tight, you might want to + cache the shape plan that HarfBuzz builds for your buffer and + font, so that you do not have to rebuild it for every shaping call. + + + You can create a cacheable shape plan with + hb_shape_plan_create_cached(face, props, + user_features, num_user_features, shaper_list), where + face is a face object (not a font object, + notably), props is an + hb_segment_properties_t, + user_features is an array of + hb_feature_ts (with length + num_user_features), and + shaper_list is a list of shapers to try. + + + Shape plans are objects in HarfBuzz, so there are + reference-counting functions and user-data attachment functions + you can + use. hb_shape_plan_reference(shape_plan) + increases the reference count on a shape plan, while + hb_shape_plan_destroy(shape_plan) decreases + the reference count, destroying the shape plan when the last + reference is dropped. + + + You can attach user data to a shaper (with a key) using the + hb_shape_plan_set_user_data(shape_plan,key,data,destroy,replace) + function, optionally supplying a destroy + callback to use. You can then fetch the user data attached to a + shape plan with + hb_shape_plan_get_user_data(shape_plan, key).
+