This reverts commit 24dd4e5674.
Oops. My bad. The change _regressed_ Malayalam test suite, not
improved it. I'll redo it, differentiating between old-spec and
new-spec cases.
The MS Indic specs say "...all classifications are determined ... using
context-free substitutions." However, testing shows that MS's Malayalam
shapers (both old and new), "match" even if there is no zero-context rule.
We follow.
Fixes below-base La (eg. Pa,H,La) with AnjaliNewLipi.ttf (old spec).
Moreover, test suite Malayalam failures are down to 312 from 875! No
change in other scripts.
Current numbers:
BENGALI: 353996 out of 354285 tests passed. 289 failed (0.0815727%)
DEVANAGARI: 707339 out of 707394 tests passed. 55 failed (0.00777502%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60769 out of 60809 tests passed. 40 failed (0.0657797%)
KANNADA: 951086 out of 951913 tests passed. 827 failed (0.0868777%)
KHMER: 299106 out of 299124 tests passed. 18 failed (0.00601757%)
LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
MALAYALAM: 1047541 out of 1048416 tests passed. 875 failed (0.0834592%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271726 out of 271847 tests passed. 121 failed (0.0445103%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970558 out of 970573 tests passed. 15 failed (0.00154548%)
TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
Free up syllables and let features work across syllables for the
presentation forms features and GPOS.
Fixed:
- 1 GURMUKHI test (remains 40)
- 12 KHMER tests (remains 18)
- 11 SINHALA tests (remains 121)
Regresses:
- 5 MALAYALAM tests (up to 312)
Current numbers:
BENGALI: 353996 out of 354285 tests passed. 289 failed (0.0815727%)
DEVANAGARI: 707339 out of 707394 tests passed. 55 failed (0.00777502%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60769 out of 60809 tests passed. 40 failed (0.0657797%)
KANNADA: 951086 out of 951913 tests passed. 827 failed (0.0868777%)
KHMER: 299106 out of 299124 tests passed. 18 failed (0.00601757%)
LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
MALAYALAM: 1048104 out of 1048416 tests passed. 312 failed (0.0297592%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271726 out of 271847 tests passed. 121 failed (0.0445103%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970558 out of 970573 tests passed. 15 failed (0.00154548%)
TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
The merger of normalizer and glyph-mapping broke shapers that
modified text stream. Unbreak them by adding a new preprocess_text
shaping stage that happens before normalizing/cmap and disallow
setup_mask modification of actual text.
The change is very subtle. If we have a single-char cluster that
decomposes to three or more characters, then try recomposition, in
case the farther mark may compose with the base.
Essentially move the glyph mapping to normalization process.
The effect on Devanagari is small (but observable). Should be more
observable in simple text, like ASCII.
This reverts commit 0981068b75.
I was confused. Even if we access coverage[0] unconditionally, we don't
need bound checks since the array machinary already handles that.
Apparently even that doesn't make check-internal-symbols.sh happy with
mingw32. Going to disable that for DLLs again, but hopefully the
export-file is doing *something*.