Previously we made CGJ unskippable. Now, if CGJ did NOT prevent
any reordering, allow skipping over it. To make this work we
had to make changes to the Arabic mark reordering algorithm
implementation to renumber moved MCM marks. See comments.
Fixes https://github.com/harfbuzz/harfbuzz/issues/554
We are going to implement Unicode Arabic Mark Ordering Algorithm:
http://www.unicode.org/reports/tr53/tr53-1.pdf
which will reorder marks out of their sorted ccc order. Adjust
normalizer to stop combining as soon as dangerous ordering is
detected.
New API:
- hb_font_get_nominal_glyph_func_t
- hb_font_get_variation_glyph_func_t
- hb_font_funcs_set_nominal_glyph_func()
- hb_font_funcs_set_variation_glyph_func()
- hb_font_get_nominal_glyph()
- hb_font_get_variation_glyph()
Deprecated API:
- hb_font_get_glyph_func_t
- hb_font_funcs_set_glyph_func()
Clients that implement their own font-funcs are encouraged to replace
their get_glyph() implementation with a get_nominal_glyph() and
get_variation_glyph() pair. The variation version can assume that
variation_selector argument is not zero.
This resurrects the space fallback feature, after I disabled
the compatibility decomposition. Now I can release HarfBuzz
again without breaking Pango!
It also remembers which space character it was, such that later
on we can approximate the width of this particular space
character. That part is not implemented yet.
We normalize all GC=Zs chars except for U+1680 OGHA SPACE MARK,
which is better left alone.
We now handle U+FFFD replacement in hb_buffer_add_utf*(). Any other
manipulation can happen in user callbacks. No need for this.
efe74214bb (commitcomment-7039404)
This reverts commit efe74214bb.
Conflicts:
src/hb-ot-shape-normalize.cc
Only if the font doesn't support it. Ie, this gives the user to
use non-Unicode codepoints as private values and return a meaningful
glyph for them. But if it's invalid and font callback doesn't
like it, and if font has U+FFFD, show that instead.
Font functions that do not want this automatic replacement to
happen should return true from get_glyph() if unicode > 0x10FFFF.
Replaces https://github.com/behdad/harfbuzz/pull/27
This changes the semantics of get_glyph() callback and expect that
callbacks return false if the requested variant is not available, and
then we will call them back with variation_selector=0 and will retain
the glyph for the selector in the glyph stream.
Apparently most Mongolian fonts implement the Mongolian Variation
Selectors using GSUB, not cmap.
https://bugs.freedesktop.org/show_bug.cgi?id=65258
Note that this doesn't fix the Mongolian shaping yet, because the way
that's implemented is that the, say, 'init' feature ligates the letter
and the variation-selector. However, since currently the variation
selector doesn't have the 'init' mask on, it will not be matched...
See thread "an issue regarding discrepancy between Korean and Unicode
standards" on the mailing list for the rationale. In short: Uniscribe
doesn't, so fonts are designed to work without it.
Before, for most scripts, we were not trying to recompose two characters
if the second one had ccc=0. That fails for Myanmar where U+1026
decomposes to U+1025,U+102E, both of which have ccc=0. However, we do
want to try to recompose those. We now check whether the second is a
mark, using general category instead.
At the same time, remove optimization that was conflicting with this.
[Let the Ngapi hackfest begin!]