I like to have a mode where CONTAINS_NOTDEF and CONTAINS_DOTTEDCIRCLE are not
returned. Abused a value of -1 for that. hb-shape now uses it. Fixes two
of the six tests failing with --verify in test/shaping/run-tests.sh.
Change hb_buffer_diff to explicitly cast result of abs to unsigned when
comparing with position_fuzz to avoid unsafe signed/unsigned comparions
warnings on windows.
We break and shape fragments and reconstruct shape result from them.
Remains to compare to original buffer. Going to add some buffer
comparison API and use here, instead of open-coding.
We were relying on cluster merges not requiring unsafe flagging because
they get merged. If cluster level requests no merging, then we flag
unsafe when merge would have happened.
Fixes https://github.com/behdad/harfbuzz/issues/294
Also fixes a bunch of other Indic issues. Test results after:
BENGALI: 353725 out of 354188 tests passed. 463 failed (0.130722%)
DEVANAGARI: 707307 out of 707394 tests passed. 87 failed (0.0122987%)
GUJARATI: 366355 out of 366457 tests passed. 102 failed (0.0278341%)
GURMUKHI: 60732 out of 60747 tests passed. 15 failed (0.0246926%)
KANNADA: 951201 out of 951913 tests passed. 712 failed (0.0747968%)
KHMER: 299071 out of 299124 tests passed. 53 failed (0.0177184%)
MALAYALAM: 1048136 out of 1048334 tests passed. 198 failed (0.0188871%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271662 out of 271847 tests passed. 185 failed (0.068053%)
TAMIL: 1091754 out of 1091754 tests passed. 0 failed (0%)
TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)
Before:
BENGALI: 353725 out of 354188 tests passed. 463 failed (0.130722%)
DEVANAGARI: 707307 out of 707394 tests passed. 87 failed (0.0122987%)
GUJARATI: 366349 out of 366457 tests passed. 108 failed (0.0294714%)
GURMUKHI: 60732 out of 60747 tests passed. 15 failed (0.0246926%)
KANNADA: 951190 out of 951913 tests passed. 723 failed (0.0759523%)
KHMER: 299070 out of 299124 tests passed. 54 failed (0.0180527%)
MALAYALAM: 1048136 out of 1048334 tests passed. 198 failed (0.0188871%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271662 out of 271847 tests passed. 185 failed (0.068053%)
TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%)
TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)
* Undone change for Tibetan vowel u
* removed comment on reordering that became invalid with roll-back
* Support for Dzongkha contractions with multiple vowel-signs
* Removed non-functional and unnecessary defines for HB_MODIFIED_COMBINING_CLASS_CCC138,140
If an application provides a malloc replacement through
hb_malloc_impl() it is important that it is used to allocate
everything, but the use of strdup() circumvents this and
causes system malloc() to be called instead. This pairs
badly with the custom hb_free_impl() being called later.
clang's new -Wcomma compiler option warns about possible misuse of the
comma operator such as between two statements.
hb-common.cc:190:9 [-Wcomma] possible misuse of comma operator here
hb-ot-layout-gsubgpos-private.hh:345:30 [-Wcomma] possible misuse of
comma operator here
hb-shape-plan.cc:438:26 [-Wcomma] possible misuse of comma operator here
There are more broken versions of Tahoma out there on various Windows releases,
so we need to add them to our blacklist to avoid broken rendering.
See https://bugzilla.mozilla.org/show_bug.cgi?id=1279925 for details.
Visual Studio only supported strtof() from Visual Studio 2013 onwards, so
use strtod() instead to do the operation, which should do the same thing,
sans going to a double, not a float.
In https://crbug.com/681813 another instance of Padauk was identified
triggering collapsed glyphs. Blacklist this version by patching
hb-ot-layout.cc to print out gdef, gsub, and gpos table length, then
adding those to the list of blacklisted versions.
* Guard against underflow when adjusting length
With the fuzz-testcase in mozilla bug 1295299, we end up with a recursed lookup that removes 3 items, when `match_positions[idx]` is 0, which results in (unsigned) `end` wrapping to a huge value.
Making `end` a signed int is probably the simplest route to a fix.
Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1295299.
* Add testcase for #421.
* [indic] Add support for Grantha marks that may be used in Tamil to the Indic table.
See https://bugzilla.mozilla.org/show_bug.cgi?id=1331339.
Testcase: U+0BA4,U+0BC6,U+1133c,U+0BAA,U+1133c,U+0BC6,U+1133c
* [indic] Add test for Grantha nukta that is allowed in Tamil by ScriptExtensions.txt
While it’s fine to call memcmp(x, 0, 0) in practice, the C99 standard
explicitly says that this is not allowed: even if the length is zero,
the pointer arguments must be valid.
http://stackoverflow.com/a/16363034
Coverity ID: 141178
Signed-off-by: Philip Withnall <withnall@endlessm.com>
The 'avar' table does not allow random access to axis maps,
so change API to avoid quadratic-time implementation.
Removed -hb_ot_var_normalize_axis_value(), added
+hb_ot_var_normalize_variations() and
+hb_ot_var_normalize_coords() instead.
The numbers for right-to-left scripts are processed also from right to
left, so the order of applying “numr” and “dnom” features should be
reversed in such case.
Fixes https://github.com/behdad/harfbuzz/issues/395
We have had added this in Indic shaper to assist shaping these scripts.
In Universal Shaping Engine however, it is up to font designer to
decompose them. Hence moving them from Indic shaper to USE was
wrong.
Fixup for f6ba63b2e8
Part of fixing https://github.com/behdad/harfbuzz/issues/387
The decomposition is very obscure and unlikely to help
any fonts. Just remove it since Uniscribe probably doesn't
do this either.
Fixes https://github.com/behdad/harfbuzz/issues/382
* this pointer in type definitions is not interpreted as a constant.
This rule is not enforced strictly by all compilers, but the Green Hills Software compiler will regard this as an error.
* Merging branches for the DEFINE_SIZE_UNION macro
Adding check for the existence of static_size field in the tested member.
New approach to fix this:
69f9fbc420
Previous approach was reverted as it was too broad. See context:
https://github.com/behdad/harfbuzz/issues/347#issuecomment-267838368
With U+05E9,U+05B8,U+05C1,U+05DC and Arial Unicode, we now (correctly) disable
GDEF and GPOS, so we get results very close to Uniscribe, but slightly different
since our fallback position logic is not exactly the same:
Before: [gid1166=3+991|gid1142=0+737|gid5798=0+1434]
After: [gid1166=3+991|gid1142=0@402,-26+0|gid5798=0+1434]
Uniscribe: [gid1166=3+991|gid1142=0@348,0+0|gid5798=0+1434]
It is desirable to be able to build against older versions of glib.
fd7a245 changed the configure check to require glib > 2.38 for
G_TEST_DIST. Before that, version 2.16 was required, but in fact,
since aafe395, G_PASTE is being used, which was introduced in 2.19.1.
And since 0ef179e2, hb-glib uses GBytes, which were introduced in
2.31.10.
2.19.1 is rather old, but 2.38 is rather new. For Firefox, building
against 2.22 is still supported, although we could probably get away
with bumping that to 2.28. Either way, GBytes is not available.
Arguably, if you build against a glib that doesn't support GBytes,
you're not going to use the hb_glib_blob_create function, so we hide
the function when building against such a glib.
As for G_TEST_DIST, when building against versions of glib that don't
support it, we can fallback to the previous behavior, which, AIUI, was
just making the test not work when building in a separate directory.
Note, the function now returns "half of horizontal advance width"
if top accent attachment for glyph is not explicitly defined.
This is what the spec requires. Updated tests.
This one:
map->mask = (1 << (next_bit + bits_needed)) - (1 << next_bit);
before the fix, the shift was done as an int, causing overflow
if it ever got to 1 << 31. Sprinkle 'u's around.
Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=634805
This is based on bug reports that have been filed against Firefox since it
updated to a version of harfbuzz that uses zeroing by GDEF rather than by
Unicode. I'm sure there are a bunch more font versions that should also be
included; these are just the ones I have on hand and have confirmed as having
bad GDEF data.
Given how the list here is growing, I think we should reconsider the approach,
and perhaps revert to zeroing by Unicode instead.
Fixes https://github.com/behdad/harfbuzz/issues/264
Fixes https://github.com/behdad/harfbuzz/pull/266
Prior to this change the function `FT_Error FT_Done_Face(FT_Face *)` was
called through a pointer with the signature `void (void *)` resulting in
undefined behaviour.
Fixes https://github.com/behdad/harfbuzz/issues/243
With javatext.ttf, the reodering medial Ra gets its advance width
zero'ed in Uniscribe implementation, and the font adds the advance
back. Our Indic shaper does not do that, but USE does. So, route
Javanese through USE. That's what Microsoft does anyway. Test:
U+A9A5,U+A9BA
This also seems to fix the following sequence, and variations thereof:
U+A99F,U+A9C0,U+A9A2,U+A9BF
Apparently some clients have reference-table callbacks that copy the table.
As such, avoid loading 'glyf' table which is only needed if fallback positioning
happens.
Fixes https://github.com/behdad/harfbuzz/issues/237
Note that Uniscribe enforces a certain syllable order. We don't.
But with this change, I get all of the tibetan contractions pass
with Microsoft Himalaya font.
Previously we only synthesized GDEF glyph classes if the glyphClassDef
array in GDEF was null. This worked well enough, and is indeed what
OpenType requires: "If the font does not include a GlyphClassDef table,
the client must define and maintain this information when using the
GSUB and GPOS tables." That sentence does not quite make sense since
one needs Unicode properties as well, but is close enough.
However, looks like Arial Unicode as shipped on WinXP, does have GDEF
glyph class array, but defines no classes for Hebrew. This results
in Hebrew marks not getting their widths zeroed. So, with this change,
we synthesize glyph class for any glyph that is not specified in the
GDEF glyph class table. Since, from our point of view, a glyph not
being listed in that table is a font bug, any unwanted consequence of
this change is a font bug :).
Note that we still don't get the same rendering as Uniscribe, since
Uniscribe seems to do fallback positioning as well, even though the
font does have a GPOS table (which does NOT cover Hebrew!). We are
not going to try to match that though.
Test string for Arial Unicode:
U+05E9,U+05B8,U+05C1,U+05DC
Before: [gid1166=3+991|gid1142=0+737|gid5798=0+1434]
After: [gid1166=3+991|gid1142=0+0|gid5798=0+1434]
Uniscribe: [gid1166=3+991|gid1142=0@348,0+0|gid5798=0+1434]
Note that our new output matches what we were generating until July
2014, because the Hebrew shaper used to zero mark advances based on
Unicode, NOT GDEF. That's 9e834e29e0.
Reported by Greg Douglas.
API changes:
- If NDEBUG is defined, define HB_NDEBUG
- Disable costlier sanity checks if HB_NDEBUG is defined.
In 1.2.3 introduced some code to disable costly sanity checks if
NDEBUG is defined. NDEBUG, however, disables all assert()s as
well. With HB_NDEBUG, one can disable costlier checks but keep
assert()s.
I'll probably add a way to define HB_NDEBUG automatically in
release tarballs. But for now, production systems that do NOT
define NDEBUG, are encouraged to define HB_NDEBUG for our build.
New API:
- hb_font_get_nominal_glyph_func_t
- hb_font_get_variation_glyph_func_t
- hb_font_funcs_set_nominal_glyph_func()
- hb_font_funcs_set_variation_glyph_func()
- hb_font_get_nominal_glyph()
- hb_font_get_variation_glyph()
Deprecated API:
- hb_font_get_glyph_func_t
- hb_font_funcs_set_glyph_func()
Clients that implement their own font-funcs are encouraged to replace
their get_glyph() implementation with a get_nominal_glyph() and
get_variation_glyph() pair. The variation version can assume that
variation_selector argument is not zero.
That commit moved the advance adjustment for mark positioning to
be applied immediately, instead of doing late before. This breaks
if mark advances are zeroed late, like in Arabic. Also, easier to
hit it in RTL scripts since a single mark with non-zero advance is
enough to hit the bug, whereas in LTR, at least two marks are needed.
This reopens https://github.com/behdad/harfbuzz/issues/211
The cursive+mark interaction is broken again. To be fixed in a
different way.
This better emulates Unicode grapheme clusters.
Note that Uniscribe does NOT do this, but should be harmless with most clients,
and improve fallback with clients that use HarfBuzz cluster as unit of fallback.
Fixes https://github.com/behdad/harfbuzz/issues/217
Fixes https://github.com/behdad/harfbuzz/issues/223
Right now we cannot test this because it has to be tested using hb-fuzzer.
We should move all fuzzing tests from test/shaping/tests/fuzzed.tests to
test/fuzzing/ and have its own test runner. At that point, should add
test from this issue as well.
This is what Microsoft's implementation does. Marks that need advance
need to add it back using 'dist' or other feature in GPOS. Update tests to
match.
Fixes https://github.com/behdad/harfbuzz/issues/211
What happens in that bug is that a mark is attached to base first,
then a second mark is cursive-chained to the first mark. This only
"works" because it's in the Indic shaper where mark advances are
not zeroed.
Before, we didn't allow cursive to run on marks at all. Fix that.
We also where updating mark major offsets at the end of GPOS, such
that changes in advance of base will not change the mark attachment
position. That was superior to the alternative (which is what Uniscribe
does BTW), but made it hard to apply cursive to the mark after it
was positioned. We could track major-direction offset changes and
apply that to cursive in the post process, but that's a much trickier
thing to do than the fix here, which is to immediately apply the
major-direction advance-width offsets... Ie.:
https://github.com/behdad/harfbuzz/issues/211#issuecomment-183194739
If this breaks any fonts, the font should be fixed to do mark attachment
after all the advances are set up first (kerning, etc).
Finally, this, still doesn't make us match Uniscribe, for I explained
in that bug. Looks like Uniscribe applies minor-direction cursive
adjustment immediate as well. We don't, and we like it our way, at
least for now. Eg. the sequence in the test case does this:
- The first subscript attaches with mark-to-base, moving in x only,
- The second subscript attaches with cursive attachment to first subscript
moving in x only,
- A final context rule moves the first subscript up by 104 units.
The way we do, the final shift-up, also shifts up the second subscript
mark because it's cursively-attached. Uniscribe doesn't. We get:
[ttaorya=0+1307|casubscriptorya=0@-242,104+-231|casubscriptnarroworya=0@20,104+507]
while Uniscribe gets:
[ttaorya=0+1307|casubscriptorya=0@-242,104+-211|casubscriptnarroworya=0+487]
note the different y-offset of the last glyph. In our view, after cursive,
things move together, period.
This moves all the source listings in src/Makefile.am,
src/hb-ucdn/Makefile.am and util/Makefile.am into separate Makefile
snippets, so that they may be shared between different Makefile-based
build systems, such as NMake for Visual Studio.
This speeds up shaping the Amiri font by over 15%.
This was primarily needed for my work on OpenType GX, since
we will be collecting only sublookups that are "active" for
current font instance; but it's a nice boost in general as
well.
We might, in the future, collect subtables in the lookup_accel.
That would also allow us to do a per-subtbale set-digest, which
should speed things up some more, specially for ContextChainFormat3
lookups... Amiri, for example, contains one lookup with 53
subtables!
The font Garamond Premier Pro Caption (and possibly many other
Adobe fonts), have many FeatureParamsSize tables with the old
wrong offset. We handle fixing those up, but they were still
contributing to edit_count, and when I reduced HB_SANITIZE_MAX_EDIT
from 100 to 8 in 14c2de3218, these
fonts were now getting GPOS dropped and hence kerning disabled.
Fix, by not counting edits made towareds offset fix-up. I'll
also increase edit count again, in the next commit.
The coretext_aat shaper delegates to the regular coretext_..._ensure() functions, so coretext_aat_..._ensure() functions defined by these macros are unused. The compiler warns about them, which in turn can confuse people to think that the coretext_aat_..._ensure() functions weren't called by accident.
Currently just announces lookup applications. Message-API *will* change.
hb-shape / hb-view are updated to print-out messages to stder if --debug
is specified.
We regeressed Malayalam in 508cc3d3cf
This brings down the failures to 198 (from 750).
BENGALI: 353725 out of 354188 tests passed. 463 failed (0.130722%)
DEVANAGARI: 707307 out of 707394 tests passed. 87 failed (0.0122987%)
GUJARATI: 366349 out of 366457 tests passed. 108 failed (0.0294714%)
GURMUKHI: 60732 out of 60747 tests passed. 15 failed (0.0246926%)
KANNADA: 951190 out of 951913 tests passed. 723 failed (0.0759523%)
KHMER: 299070 out of 299124 tests passed. 54 failed (0.0180527%)
MALAYALAM: 1048136 out of 1048334 tests passed. 198 failed (0.0188871%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271662 out of 271847 tests passed. 185 failed (0.068053%)
TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%)
TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)
MYANMAR: 1123865 out of 1123883 tests passed. 18 failed (0.00160159%)
Test stats remain unchanged, except for Malayalam, which we investigate:
BENGALI: 353725 out of 354188 tests passed. 463 failed (0.130722%)
DEVANAGARI: 707307 out of 707394 tests passed. 87 failed (0.0122987%)
GUJARATI: 366349 out of 366457 tests passed. 108 failed (0.0294714%)
GURMUKHI: 60732 out of 60747 tests passed. 15 failed (0.0246926%)
KANNADA: 951190 out of 951913 tests passed. 723 failed (0.0759523%)
KHMER: 299070 out of 299124 tests passed. 54 failed (0.0180527%)
MALAYALAM: 1047584 out of 1048334 tests passed. 750 failed (0.0715421%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271662 out of 271847 tests passed. 185 failed (0.068053%)
TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%)
TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)
Myanmar, compared to Windows 10 mmrtext.ttf:
MYANMAR: 1123865 out of 1123883 tests passed. 18 failed (0.00160159%)
On Windows 10 we are seeing that other error message...
Test sequence: U+0995,U+-9CD,U+09B0
With Nirmala shipped on Windows 10, this failed to form the below form.
Works now.
Reported by Sairus.
Say, if we are ligating "A B_C m D", then previously 'm' was being
attached to 'B' in the combined A_B_C_D ligature. Now we attach it
to 'C'. No test for this though :(.
We use three bits for lig_id these days, so we finally got a report of
two separate ligatures with the same lig_id happening adjacent to each
other, and then the component-handling code was breaking things.
Protect against that by ignoring same-lig-id but lig-comp=0 glyphs after
a new ligature.
Fixes https://github.com/behdad/harfbuzz/issues/198
Before, we were just checking the use_category(). This detects as
halant a ligature that had the halant as first glyph (as seen in
NotoSansBalinese.) Change that to use the is_ligated() glyph prop
bit. The font is forming this ligature in ccmp, which is before
the rphf / pref tests. So we need to make sure the "ligated" bit
survives those tests. Since those only check the "substituted" bit,
we now only clear that bit for them and "ligated" survives.
Fixes https://github.com/behdad/harfbuzz/issues/180
- use '-w' instead of '\<...\>' for check-header-guards
grep manpage says these are the same
- put '-q' first in the grep options
- move VAR into hb-private.hh
- hb-font-private.hh - use [VAR] instead of [] for variable array
Back in the old days, we used to apply 'calt' and 'cswh' in Arabic shaper,
with a pause in between. Then we disabled the 'cswh' because Microsoft
disabled it, but forgot to remove the unnecessary pause. Do that now.
This has the benefit that it fixes shaping with monbaiti from Windows 10.
In that version of that font, the lookups from 'calt' are duplicated in
'rclt', and Mongolian was changed to go through Universal Shaping Engine.
We still use the Arabic shaper for Mongolian. With a pause after 'calt',
we were applying the duplicate lookups from 'calt' and 'rclt' twice. It
happened to be the case that these lookups were NOT idempotent. So we
were getting wrong shaping. See thread "Windows 10 monbaiti.ttf upgrade
(5.01 -> 5.51) caused loss of diacritical marks when shaped with harfbuz"
on the mailing list. This fixes that.
This aims to make Syriac Abbr Mark sizing more accurate when repeating segments are used, by adding an extra repeat and tightening up the spacing slightly rather than leaving a shortfall corresponding to a partial repeat-width.
This was brorken earlier, though, it's really hard to notice it.
Unlike the glyph_h_origin(), an unset glyph_v_origin() does NOT
mean that the vertical origin is at 0,0.
Related to https://github.com/behdad/harfbuzz/issues/187