Behdad Esfahbod
4a7f4f3e56
[Thai] Adjust SARA AM reordering to match Uniscribe
...
Adjust the list of marks before SARA AM that get the reordering
treatment. Also adjust cluster formation to match Uniscribe.
With Wikipedia test data, now I see:
- For Thai, with the Angsana New font from Win7, I see 54 failures out
of over 4M tests (0.00129107%). Of the 54, two are legitimate
reordering issues (fix coming soon), and the other 52 are simply
Uniscribe using a zero-width space char instead of an unknown
character for missing glyphs. No idea why. The missing-glyph
sequences include one that is a Thai character followed by an Arabic
Sokun. Someone confused it with Nikhahit I assume!
- For Lao, with the Dokchampa font from Win7, 33 tests fail out of
54k (0.0615167%). All seem to be insignificant mark positioning
with two marks on a base. Have to investigate.
2012-07-23 13:15:33 -04:00
Behdad Esfahbod
2cc933aff9
[Indic] Fix cluster formation with left-matras and conjunct forms
...
Test case was: <U+0D15,U+0D4D,U+0D15,U+0D4A>.
2012-07-23 08:23:44 -04:00
Behdad Esfahbod
e6b01a878c
[Indic] Further streamline cluster formation
...
This should address all possible cluster misformations that I had in
mind.
2012-07-23 00:11:26 -04:00
Behdad Esfahbod
7b2a7dadd6
[Indic] Merge clusters before sorting
...
This should fix any instabilities in cluster formation that we were
speculating may happen with surrounding syllables. Or most of it
perhaps.
2012-07-22 23:58:55 -04:00
Behdad Esfahbod
abb3239ef9
[Indic] Update clusters for left-matra even if matra didn't move
...
Fixes crashes reported with left matra under
non-uniscribe-bug-compatibilty mode.
2012-07-22 23:55:19 -04:00
Behdad Esfahbod
60554f14d8
[Indic] Merge in Malayalam tests
...
From:
http://silpa.org.in/pub/tests/hb/ml/ml-harfbuzz-testdata.txt
2012-07-22 23:23:56 -04:00
Behdad Esfahbod
5c7081770c
[Indic] Add extensive Sinhala tests
...
Generated by:
http://git.savannah.gnu.org/cgit/sinhala.git/plain/utils/gen-unicode-sinhala.py
2012-07-22 23:20:27 -04:00
Behdad Esfahbod
2efe4707b1
[Indic] Add Sinhala tests
...
Merge tests from:
http://git.savannah.gnu.org/cgit/sinhala.git/plain/patches/icu-sinhala-rendering.txt
2012-07-22 23:17:59 -04:00
Behdad Esfahbod
3d4c111b7a
Add a test case
2012-07-20 19:34:39 -04:00
Behdad Esfahbod
92a1ad7bef
[Indic] Stop searching for base if a post form is found before below form
...
Improves Bengali and Gurmukhi. Malayalam regressed a bit. We will deal
with that later.
2012-07-20 18:55:15 -04:00
Behdad Esfahbod
4c450c703f
[Indic] Recompose Bengali Ya,Nukta
...
This is a bunch of hacks for now.
Improves Bengali a bit.
2012-07-20 18:13:04 -04:00
Behdad Esfahbod
e9c0f152a3
[Uniscribe] Fix script fallback
...
Gurmukhi failures half now. Others changed slightly.
2012-07-20 17:37:48 -04:00
Behdad Esfahbod
5791f32915
[Indic] Allow a ZWNJ after SM's
...
Malayalam failures go way down. Other scripts benefitted slightly too.
Sinhala had one or two test regressions, but...
2012-07-20 16:26:55 -04:00
Behdad Esfahbod
34ae336f3f
[Indic] Improve Reph AfterMain positioning
...
Fixes 20 out of 48 failing Oriya tests. Failure rate down to 0.066% now.
2012-07-20 16:17:28 -04:00
Behdad Esfahbod
bdd080431a
[Indic] Reposition Oriya Candrabindu
...
Oriya failures down from 0.65% to 0.20%.
2012-07-20 16:03:09 -04:00
Behdad Esfahbod
5f0eaaad12
[Indic] Fix base search in final_reordering
...
Fixes most Malayalam failures. Down from 1.6% to 0.38% now. Fixes a
few more in other scripts too.
2012-07-20 15:47:24 -04:00
Behdad Esfahbod
81202bd860
[Indic] Don't attach SM/VD to other characters
2012-07-20 15:14:51 -04:00
Behdad Esfahbod
efb4ad7356
Fix compiler warnings
...
If x is not constant, we cannot ASSERT_STATIC on it.
2012-07-20 14:27:38 -04:00
Behdad Esfahbod
f31d97e44e
[Indic] Form Telugu Reph out of Ra,Virama,ZWJ
...
Apparently this was approved in Feb 2012. No font yet.
2012-07-20 14:13:35 -04:00
Behdad Esfahbod
2e193b240e
[Indic] Don't split U+0AC9
...
Althought IndicMatraCategory.txt classifies it as Top_And_Right matra,
it does not have Unicode decomposition, and Uniscribe does not do
anything special about it either.
Gujarati failures down from 0.672% to 0.0130966%.
2012-07-20 14:02:35 -04:00
Behdad Esfahbod
30c3d5e9fc
[Indic] Simplify Uniscribe cluster emulation
...
Now that we break syllables on Halant,ZWNJ, this code can be simplified.
2012-07-20 13:56:32 -04:00
Behdad Esfahbod
decf6ffca4
[Indic] Minor!
2012-07-20 13:51:31 -04:00
Behdad Esfahbod
9e4f94a72c
[Indic] Break syllables at Halant,ZWNJ
...
That's really what Uniscribe does, and explains a lot of pecularities of
Halant,ZWNJ before the base.
Sent Telugu from 1% failures to 0.03%. Improved Kannada and Malayalam
slightly. Fixed half of Bengali, and did NOT break anything!
2012-07-20 13:48:03 -04:00
Behdad Esfahbod
2c372b80f6
[Indic] Better check for applying 'init'
...
Specifically, don't apply 'init' if previous char is a joiner.
Fixes some more of Bengali.
2012-07-20 13:37:48 -04:00
Behdad Esfahbod
34a7440b7c
[GPOS] Don't zero mark advances
...
Fixes more of Telugu, Kannada, and Oriya.
May break things (outside Indic...), but we cannot think of any font relying
on this immediately.
2012-07-20 12:40:39 -04:00
Behdad Esfahbod
8ed248de77
[Indic] Minor
2012-07-20 11:42:24 -04:00
Behdad Esfahbod
d0e68dbd0b
[Indic] Implement reph positioning step 5
...
Not tuned, just copied from step 2. Fixes another 0.5% of Kannada
failures. 1% to go.
2012-07-20 11:25:41 -04:00
Behdad Esfahbod
a9e45c32e4
[Indic] Don't let ZWNJ at the end of syllable affect base search
...
Fixes a few Devanagari, half of remaining Kannada failures, quarter for
Telugu, and others slightly improved or unchanged.
2012-07-20 11:04:15 -04:00
Behdad Esfahbod
20b68e699f
[Indic] Apply 'cjct' globally
...
Fixes 5 Devanagari failures, and no regressions.
2012-07-20 10:47:46 -04:00
Behdad Esfahbod
51e764de44
[Indic] Unbreak old scriptures
...
Brings down failures with Lohit-Telugu from 57% to 1.40%.
2012-07-20 10:30:24 -04:00
Behdad Esfahbod
900cf3d449
Minor
2012-07-20 10:18:23 -04:00
Behdad Esfahbod
87cd63266e
[Indic] Recategorize some Kannada right matras
...
Kannada failures down from 3.5% to 2.93%.
2012-07-19 21:25:46 -04:00
Behdad Esfahbod
3604d64ced
[Indic] Recategorize GURMUKHI ADDAK
...
It's not in IndicSyllabicCategory.txt. Fixes most of Gurmukhi failures.
Failures down from 7.7% to 0.222%!
2012-07-19 21:13:04 -04:00
Behdad Esfahbod
8932858123
Minor
2012-07-19 21:02:38 -04:00
Behdad Esfahbod
47ef931f13
[buffer] Make sure out_info = info during GPOS
2012-07-19 20:52:44 -04:00
Behdad Esfahbod
ae63cf2062
Print line number during return when tracing
2012-07-19 20:45:41 -04:00
Behdad Esfahbod
5249f3aee1
[Indic] Unbreak Khmer
...
For Khmer, all consonants are subjoining. No need to look in the font.
We were looking in the wrong order anyway.
2012-07-19 20:30:22 -04:00
Behdad Esfahbod
e0475345d5
[Indic] Apply 'akhn' globally
...
Fixes 1.5% more failures for Telugu, 2% for Kannada.
Breaks one test in Devanagari.
2012-07-19 20:24:14 -04:00
Behdad Esfahbod
c87bcddb10
[Indic] Add failing test for Kannada
2012-07-19 20:03:25 -04:00
Behdad Esfahbod
fa247ebe52
[Indic] Better position U+0CD5
...
Fixes another 5% of Kannada failures.
2012-07-19 19:52:19 -04:00
Behdad Esfahbod
f055442716
[Indic] Lookup consonant position in the font
...
Fixes most failures of Oriya, and improves others a bit.
2012-07-19 16:20:21 -04:00
Behdad Esfahbod
74d1d88781
[GSUB] Fix would_apply() for LigatureSubst
2012-07-19 16:14:23 -04:00
Behdad Esfahbod
787f7d1e9b
[TODO] Minor
2012-07-19 15:29:13 -04:00
Behdad Esfahbod
be73a5f936
Add src/test-would-substitute tool
2012-07-19 15:12:18 -04:00
Behdad Esfahbod
e72b360ac6
Refactor / finish would_apply() operation
...
Untested.
2012-07-19 14:44:46 -04:00
Behdad Esfahbod
8c973ebf0f
[Indic] Implement per-script matra positioning
...
Following what the spec says.
Brings down Telugu failures from 40% to 3.75%, and Kannada failures from
44% to 10%. Does NOT affect other scripts' test results.
2012-07-19 13:25:08 -04:00
Behdad Esfahbod
8bb32458f9
[Indic] More refactoring
2012-07-19 13:04:44 -04:00
Behdad Esfahbod
9ccc6382ba
[Indic] Minor refactoring
2012-07-19 12:45:31 -04:00
Behdad Esfahbod
f83aaa3133
[Indic] Minor
2012-07-19 12:23:23 -04:00
Behdad Esfahbod
be8b9f5f71
[Indic] Start refactoring different matra positions per script
2012-07-19 12:11:12 -04:00