Commit Graph

5704 Commits

Author SHA1 Message Date
Behdad Esfahbod 21d2803133 [Indic] Do clustering like Uniscribe does
Hindi Wikipedia failures down to 6639 (0.938381%)!
2012-05-10 18:34:34 +02:00
Behdad Esfahbod b20c9ebaf5 [Indic] Add test for matra group
The spec says: "[{M}+[N]+[H]]", and that's what Uniscribe implements.
We instead do: "{M+[N]+[H]}", which means we allow Nukta and Halant
after all Matras, not just the last one.  It makes more sense.
2012-05-10 18:31:17 +02:00
Behdad Esfahbod 8df5636968 [Indic] Reorder Reph to before the Halant after Matras
Uniscribe doesn't do it, but we want to do as it gives the Reph the
opportunity to interact with the Matras.  Test with mangal for example.
Sequence: <0930,094d,0915,094b,094d>
In test suite already.
2012-05-10 15:41:04 +02:00
Behdad Esfahbod daf3234bdc [Indic] Don't clear the mask for Reph
This was removing the mandatory global 1 bit in the mask and hence
disabling GPOS for Reph!
2012-05-10 15:28:27 +02:00
Behdad Esfahbod 7708ee23cb [Indic] Improve Left Matra repositioning
Move its dependents too.
2012-05-10 14:48:25 +02:00
Behdad Esfahbod 61a58e26a5 [Indic] Add tricky reordering test cases
In the case of Consonant,LeftMatra,Halant, Uniscribe leaves the Halant
where it is, but we want to move it with the Matra as that makes more
logical sense.
2012-05-10 14:43:53 +02:00
Behdad Esfahbod dbb105883c [Indic] Do Reph repositioning in final reordering like the spec says
This introduced a failure, which we tracked down to a test case like this:

  U+092E,U+094B,U+094D,U+0930

The final character is a Ra that should be put in a syllable of it's
own.  And we do.  But it will interact with the Halant before it.  So
now we finally are convinced that we have to limit features to syllable
boundaries.  That's coming after lunch!
2012-05-10 13:45:52 +02:00
Behdad Esfahbod 4705a70269 Minor 2012-05-10 13:09:08 +02:00
Behdad Esfahbod 4ac9e98d9d [Indic] Reorder left matras to be closer to base 2012-05-10 12:53:53 +02:00
Behdad Esfahbod 1a1fa8c655 [Indic] Treat the standalone cluster case reusing the consonant logic 2012-05-10 12:21:30 +02:00
Behdad Esfahbod 190eb31a16 [Indic] Minor 2012-05-10 12:21:30 +02:00
Behdad Esfahbod c5306b6861 [Indic] Handle Vowel syllables
Reusing the consonant logic!
2012-05-10 12:21:30 +02:00
Behdad Esfahbod 6d8e0cb74c [Indic] Simplify Reph logic 2012-05-10 11:41:51 +02:00
Behdad Esfahbod 3d25079f8d [Indic] Don't form Reph is Ra is the only consonant in the syllable 2012-05-10 11:37:42 +02:00
Behdad Esfahbod b99d63ae11 [Indic] Increase max syllable length
20 was way too low, one could hit a syllable with 7ish consonants with it.
2012-05-10 11:32:52 +02:00
Behdad Esfahbod a391ff50b9 [Indic] Adjust base after sorting 2012-05-10 11:31:20 +02:00
Behdad Esfahbod d3637edb24 [Indic] Don't return for long syllables. Just not sort. 2012-05-10 10:51:38 +02:00
Behdad Esfahbod dfa0cade7f Fix Uniscribe clusters with multiple items 2012-05-09 19:10:07 +02:00
Behdad Esfahbod 86e5dd386a [Indic] Don't give up syllable parsing upon junk 2012-05-09 18:57:37 +02:00
Behdad Esfahbod ef24cc8c8e [Indic] Towards multi-cluster syllables and final reordering 2012-05-09 18:10:20 +02:00
Behdad Esfahbod a9844d41c6 Combine lig_id and lig_comp into one byte, to free up one for Indic 2012-05-09 17:53:13 +02:00
Behdad Esfahbod 92332e5116 Minor 2012-05-09 17:40:00 +02:00
Behdad Esfahbod dbccf87eef [Indic] Make room for more reordering positions 2012-05-09 17:24:39 +02:00
Behdad Esfahbod d4480ace7f [Indic] Improve matra vs consonant ordering
Another 1.5% down.
2012-05-09 15:59:47 +02:00
Behdad Esfahbod 33c92e7695 [Indic] Categorize Anudatta 2012-05-09 15:41:51 +02:00
Behdad Esfahbod 3943293a99 [Indic] Add joiner test cases for Devanagari 2012-05-09 15:27:56 +02:00
Behdad Esfahbod 19d984edaa [Indic] Make sure Reph jumps over all matras to the right
Another 12 thousand failures gone! (78 to go)
2012-05-09 15:21:13 +02:00
Behdad Esfahbod 9034641333 [Indic] Keep Vedic signs at the right too 2012-05-09 15:04:58 +02:00
Behdad Esfahbod d1deaa2f5b Replace zerowidth invisible chars with a zero-advance space glyph
Like Uniscribe does.
2012-05-09 15:04:13 +02:00
Behdad Esfahbod 49e5da1591 [indic] Keep the syllable modifier marks to the right
Shaping failures on Hindi Wikipedia go down from 25% to 14%!
2012-05-09 13:23:27 +02:00
Behdad Esfahbod 5b12609093 Minor 2012-05-09 12:37:27 +02:00
Behdad Esfahbod 9ce939232b Minor 2012-05-09 12:03:09 +02:00
Behdad Esfahbod 76b3409de6 [indic] Better Reph matching 2012-05-09 11:52:32 +02:00
Behdad Esfahbod df6d45c693 Minor 2012-05-09 11:38:31 +02:00
Behdad Esfahbod 412b91889d [indic] Apply Indic features in order 2012-05-09 11:07:18 +02:00
Behdad Esfahbod 1ac075b227 [indic] Apply rakaar forms
Fixes 10% of the failures against all of Hindi Wikipedia!
2012-05-09 11:06:47 +02:00
Behdad Esfahbod 2214a03900 Add hb-diff-ngrams 2012-05-09 09:54:54 +02:00
Behdad Esfahbod 178e6dce01 Add N-gram generator 2012-05-09 08:57:29 +02:00
Behdad Esfahbod 98669ceb77 Use groupby() 2012-05-09 08:16:15 +02:00
Behdad Esfahbod c438a14b62 Add hb-diff-stat 2012-05-09 07:45:17 +02:00
Behdad Esfahbod 1058d031e2 Make hb-diff-filter-failtures retain all test info for failed tests 2012-05-09 07:35:28 +02:00
Behdad Esfahbod f1eb008cc7 Add hb-diff-colorize
Accepts --format=html now.
2012-05-09 00:01:50 +02:00
Behdad Esfahbod 9155e4ffe0 Cleanup diff
Doesn't do --color anymore.  That will go into a new hb-diff-colorize
tool.
2012-05-08 22:44:21 +02:00
Behdad Esfahbod 7d22135b4c Make hb-diff faster 2012-05-08 19:38:49 +02:00
Behdad Esfahbod a93e238e05 More tests 2012-05-08 18:55:29 +02:00
Behdad Esfahbod 1a2a4a0078 Fix warning and build issues
As reported by Jonathan Kew on the list.
2012-05-05 22:38:20 +02:00
Behdad Esfahbod a5e39fed85 Minor 2012-04-25 00:14:46 -04:00
Behdad Esfahbod 1827dc208c Add hb_ot_shape_glyphs_closure()
Experimental API for now.
2012-04-24 16:56:37 -04:00
Behdad Esfahbod bb09f0ec10 Minor 2012-04-24 16:02:12 -04:00
Behdad Esfahbod 29a7e306e3 Minor 2012-04-24 16:01:30 -04:00