After the Ngapi hackfest work, we were assuming that fonts
won't use presentation features to choose specific forms
(eg. conjuncts). As such, we were using auto-joiner behavior
for such features. It proved to be troublesome as many fonts
used presentation forms ('pres') for example to form conjuncts,
which need to be disabled when a ZWJ is inserted.
Two examples:
U+0D2F,U+200D,U+0D4D,U+0D2F with kartika.ttf
U+0995,U+09CD,U+200D,U+09B7 with vrinda.ttf
What we do now is to never do magic to ZWJ during GSUB's main input
match for Indic-style shapers. Note that backtrack/lookahead are still
matched liberally, as is GPOS. This seems to be an acceptable
compromise.
As to the bug that initially started this work, that one needs to
be fixed differently:
Bug 58714 - Kannada u+0cb0 u+200d u+0ccd u+0c95 u+0cbe does not
provide same results as Windows8
https://bugs.freedesktop.org/show_bug.cgi?id=58714
New numbers:
BENGALI: 353689 out of 354188 tests passed. 499 failed (0.140886%)
DEVANAGARI: 707305 out of 707394 tests passed. 89 failed (0.0125814%)
GUJARATI: 366349 out of 366457 tests passed. 108 failed (0.0294714%)
GURMUKHI: 60706 out of 60747 tests passed. 41 failed (0.067493%)
KANNADA: 951030 out of 951913 tests passed. 883 failed (0.0927606%)
KHMER: 299070 out of 299124 tests passed. 54 failed (0.0180527%)
LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
MALAYALAM: 1048102 out of 1048334 tests passed. 232 failed (0.0221304%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271666 out of 271847 tests passed. 181 failed (0.0665816%)
TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%)
TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)
TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
Ouch, how did things ever work without this?! The added test that has a
dot-reph as well as a pre-base reordering Ra perfectly demonstrates the
bug (tested with Nirmala font from Win8 for example). Testing suggests
that Win8 shaper has the *exact* same bug / behavior that we used to
have. Odd.
Uniscribe is buggy and sometimes /eats/ a mark next to a non-joiner.
Most of Malayalam failures where actually hitting this bug.
Ignore test output with two zero-width space glyphs. This is a hack
until we build up the test suite infrastructure better.
Bengali went down by 9, Devanagari by 2, Kannada by 130, Malayalm down
from 1197 to 307, Sinhala down by 16, Telugu down by 26. New stats:
BENGALI: 353996 out of 354285 tests passed. 289 failed (0.0815727%)
DEVANAGARI: 693573 out of 693628 tests passed. 55 failed (0.00792932%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
KANNADA: 951086 out of 951913 tests passed. 827 failed (0.0868777%)
KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%)
MALAYALAM: 1048109 out of 1048416 tests passed. 307 failed (0.0292823%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271715 out of 271847 tests passed. 132 failed (0.0485567%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970550 out of 970573 tests passed. 23 failed (0.00236973%)
Just put it before base, which is what's expected.
Malayalam failures down from 1559 to 1197 (0.114172%).
BENGALI: 353988 out of 354285 tests passed. 297 failed (0.0838308%)
DEVANAGARI: 693571 out of 693628 tests passed. 57 failed (0.00821766%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
KANNADA: 950956 out of 951913 tests passed. 957 failed (0.100534%)
KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%)
MALAYALAM: 1047219 out of 1048416 tests passed. 1197 failed (0.114172%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271699 out of 271847 tests passed. 148 failed (0.0544424%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970524 out of 970573 tests passed. 49 failed (0.00504856%)