After the Ngapi hackfest work, we were assuming that fonts
won't use presentation features to choose specific forms
(eg. conjuncts). As such, we were using auto-joiner behavior
for such features. It proved to be troublesome as many fonts
used presentation forms ('pres') for example to form conjuncts,
which need to be disabled when a ZWJ is inserted.
Two examples:
U+0D2F,U+200D,U+0D4D,U+0D2F with kartika.ttf
U+0995,U+09CD,U+200D,U+09B7 with vrinda.ttf
What we do now is to never do magic to ZWJ during GSUB's main input
match for Indic-style shapers. Note that backtrack/lookahead are still
matched liberally, as is GPOS. This seems to be an acceptable
compromise.
As to the bug that initially started this work, that one needs to
be fixed differently:
Bug 58714 - Kannada u+0cb0 u+200d u+0ccd u+0c95 u+0cbe does not
provide same results as Windows8
https://bugs.freedesktop.org/show_bug.cgi?id=58714
New numbers:
BENGALI: 353689 out of 354188 tests passed. 499 failed (0.140886%)
DEVANAGARI: 707305 out of 707394 tests passed. 89 failed (0.0125814%)
GUJARATI: 366349 out of 366457 tests passed. 108 failed (0.0294714%)
GURMUKHI: 60706 out of 60747 tests passed. 41 failed (0.067493%)
KANNADA: 951030 out of 951913 tests passed. 883 failed (0.0927606%)
KHMER: 299070 out of 299124 tests passed. 54 failed (0.0180527%)
LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
MALAYALAM: 1048102 out of 1048334 tests passed. 232 failed (0.0221304%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271666 out of 271847 tests passed. 181 failed (0.0665816%)
TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%)
TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)
TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
Ubuntu doesn't (or didn't until recently?) ship icu pkg-config
files. That's quite unfortunate. Work around it.
Bug 57608 - ICU Detection fallback for non-pkgconfig systems
That flag is redundant, deprecated, and ignored since April 2011.
From FreeType git log:
commit 8c82ec5b17d0cfc9b0876a2d848acc207a62a25a
Author: Behdad Esfahbod <behdad@behdad.org>
Date: Thu Apr 21 08:21:37 2011 +0200
Always ignore global advance.
This makes FT_LOAD_IGNORE_GLOBAL_ADVANCE_WIDTH redundant,
deprecated, and ignored. The new behavior is what every major user
of FreeType has been requesting. Global advance is broken in many
CJK fonts. Just ignoring it by default makes most sense.
* src/truetype/ttdriver.c (tt_get_advances),
src/truetype/ttgload.c (TT_Get_HMetrics, TT_Get_VMetrics,
tt_get_metrics, compute_glyph_metrics, TT_Load_Glyph),
src/truetype/ttgload.h: Implement it.
* docs/CHANGES: Updated.
I added these because the older mingw32 toolchain didn't have
MemoryBarrier(). The newer mingw-w64 toolchain however has.
As reported by John Emmas this was causing build failure with
MSVC (on glib) because of inline issues. But that reminded me
that we may be taking this path even if the system implements
MemoryBarrier as a function, which is a waste. So, just remove
it.
Before, we were marking them as below-form for initial reordering.
However, there is a rule that says "post consonants should follow
below consonsnts" for base determination purposes. Malayalam has
port-form YA/VA, and RA is pre-base. As such, for a sequence like
YA,Virama,YA,Virama,RA, the correct base is at index 0. But
because the code was seeing RA as a below-base, it was stopping at
the second YA as base, instead of jumping it as a post-base.
By treating prebase-reordering consonants like post-forms, this
is fixed.
MALAYALAM went down from 351 to 265. Other numbers didn't change:
BENGALI: 353686 out of 354188 tests passed. 502 failed (0.141733%)
DEVANAGARI: 707305 out of 707394 tests passed. 89 failed (0.0125814%)
GUJARATI: 366262 out of 366457 tests passed. 195 failed (0.0532122%)
GURMUKHI: 60706 out of 60747 tests passed. 41 failed (0.067493%)
KANNADA: 950680 out of 951913 tests passed. 1233 failed (0.129529%)
KHMER: 299074 out of 299124 tests passed. 50 failed (0.0167155%)
LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
MALAYALAM: 1048069 out of 1048334 tests passed. 265 failed (0.0252782%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271539 out of 271847 tests passed. 308 failed (0.113299%)
TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%)
TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)
TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
Such fonts are *definitely* really broken. Give up.
Limits time spent in sanitize for extremely / deliberately broken
fonts. For example, two fonts with these md5sum / names:
9343f0a1b8c84b8123e7d201cae62ffd.ttf
eb8c978547f09d368fc204194fb34688.ttf
were spending over a second in sanitize! Not anymore.
This fixes a design bug with sanitize and sub-blobs that can
cause crashes. Jonathan and I found and debugged this issue
when we tested a corrupt font with the md5sum / filename:
ea395483d37af0cb933f40689ff7b60a. Two hours of intense
debugging we found out that the font has overlapping GSUB/GPOS
tables, and as such, sanitizing the second table can modify
the first one, which can cause all kinds of undefined behavior.
The correct way to fix this is to make sure sub-blobs are
always created readonly, since we consider the parent blob
to be a shared resource and can't modify it, even if it *is*
writable.
This essentially makes the READONLY_MAY_MAKE_WRITABLE mode
unused... Maybe we should simply remove / deprecate it.