Getting started with HarfBuzz
An overview of the HarfBuzz shaping API

The core of the HarfBuzz shaping API is the function hb_shape(). This function takes a font, a buffer containing a string of Unicode codepoints and (optionally) a list of font features as its input. It replaces the codepoints in the buffer with the corresponding glyphs from the font, correctly ordered and positioned, and with any of the optional font features applied.

In addition to holding the pre-shaping input (the Unicode codepoints that comprise the input string) and the post-shaping output (the glyphs and positions), a HarfBuzz buffer has several properties that affect shaping. The most important are the text-flow direction (e.g., left-to-right, right-to-left, top-to-bottom, or bottom-to-top), the script tag, and the language tag. If you do not set these properties explicitly, HarfBuzz can attempt to guess the correct values from the buffer's contents.

For input string buffers, flags are available to denote when the buffer represents the beginning or end of a paragraph, to indicate whether or not to visibly render Unicode Default Ignorable codepoints, and to modify the cluster-merging behavior for the buffer. For shaped output buffers, the individual X and Y offsets and advances of each glyph are accessible. HarfBuzz also flags glyphs as UNSAFE_TO_BREAK if breaking the string at that glyph (e.g., in a line-breaking or hyphenation process) would alter the shaping output for the buffer.

HarfBuzz also provides methods to compare the contents of buffers, join buffers, normalize buffer contents, and handle invalid codepoints, as well as to determine the state of a buffer (e.g., input codepoints or output glyphs). Buffer lifecycles are managed and all buffers are reference-counted.

Although the default hb_shape() function is sufficient for most use cases, a variant is also provided that lets you specify which of HarfBuzz's shapers to use on a buffer.
HarfBuzz can read TrueType fonts, TrueType collections, OpenType fonts, and OpenType collections. Functions are provided to query font objects about metrics, Unicode coverage, available tables and features, and variation selectors. Individual glyphs can also be queried for metrics, variations, and glyph names. OpenType variable fonts are supported, and HarfBuzz allows you to set variation-axis coordinates on font objects. HarfBuzz provides glue code to integrate with FreeType, GObject, Uniscribe, and CoreText. Support for integrating with DirectWrite is experimental at present.
Terminology

shaper
    In HarfBuzz, a shaper is a handler for a specific script shaping model. HarfBuzz implements separate shapers for Indic, Arabic, Thai and Lao, Khmer, Myanmar, Tibetan, Hangul, Hebrew, the Universal Shaping Engine (USE), and a default shaper for non-complex scripts.

cluster
    In text shaping, a cluster is a sequence of codepoints that must be handled as an indivisible unit. Clusters can include codepoint sequences that form a ligature or base-and-mark sequences. Tracking and preserving clusters is important when shaping operations might separate or reorder codepoints. HarfBuzz provides three cluster levels that implement different approaches to the problem of preserving clusters during shaping operations.
A simple shaping example

Below is the simplest HarfBuzz shaping example possible.

Create a buffer and put your text in it.

    #include <string.h>
    #include <hb.h>

    hb_buffer_t *buf;
    buf = hb_buffer_create();
    hb_buffer_add_utf8(buf, text, strlen(text), 0, strlen(text));

Guess the script, language and direction of the buffer.

    hb_buffer_guess_segment_properties(buf);

Create a face and a font, using FreeType for now.

    #include <hb-ft.h>

    FT_Face face;
    FT_New_Face(ft_library, font_path, index, &face);
    /* Give the face a size so that positions come out in 26.6 fixed
       point; 16 pt is an arbitrary example value. */
    FT_Set_Char_Size(face, 0, 16 * 64, 0, 0);
    /* The second argument is an optional destroy callback for the face. */
    hb_font_t *font = hb_ft_font_create(face, NULL);

Shape!

    hb_shape(font, buf, NULL, 0);

Get the glyph and position information.

    unsigned int glyph_count;
    hb_glyph_info_t *glyph_info = hb_buffer_get_glyph_infos(buf, &glyph_count);
    hb_glyph_position_t *glyph_pos = hb_buffer_get_glyph_positions(buf, &glyph_count);

Iterate over each glyph.

    for (unsigned int i = 0; i < glyph_count; ++i) {
        /* After shaping, the codepoint field holds a glyph ID. */
        hb_codepoint_t glyphid = glyph_info[i].codepoint;
        /* Positions are 26.6 fixed point; divide by 64 to get pixels. */
        double x_offset = glyph_pos[i].x_offset / 64.0;
        double y_offset = glyph_pos[i].y_offset / 64.0;
        double x_advance = glyph_pos[i].x_advance / 64.0;
        double y_advance = glyph_pos[i].y_advance / 64.0;
        draw_glyph(glyphid, cursor_x + x_offset, cursor_y + y_offset);
        cursor_x += x_advance;
        cursor_y += y_advance;
    }

Tidy up.

    hb_buffer_destroy(buf);
    hb_font_destroy(font);

This example shows enough to get us started using HarfBuzz. In the sections that follow, we will use the remainder of HarfBuzz's API to refine and extend the example and improve its text-shaping capabilities.