harfbuzz

Commit Graph

Author	SHA1	Message	Date
Behdad Esfahbod	73d71cc527	[Indic] End Vowel-based syllable at ZWJ One Devanagari test regressed, plus 10 Malayalam (at 1545 now). Fixed 120 Sinhala failures. Now at 208 (0.0765136%).	2012-07-24 00:09:12 -04:00
Behdad Esfahbod	9fa052733e	[Indic] Limit syllables to at most five consonants Seems to be about what Uniscribe does. Not exactly. But close enough. More consonants will start a new cluster. A few scripts went way down in failures. In particular: - Devanagari failures went down from 490 to 56. - Telugu went down from 113 to 49. Other scripts went down slightly or didn't change. New numbers: BENGALI: 353908 out of 354285 tests passed. 377 failed (0.106412%) DEVANAGARI: 693572 out of 693628 tests passed. 56 failed (0.00807349%) GUJARATI: 366485 out of 366506 tests passed. 21 failed (0.00572978%) GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%) KANNADA: 950730 out of 951913 tests passed. 1183 failed (0.124276%) KHMER: 298613 out of 299124 tests passed. 511 failed (0.170832%) MALAYALAM: 1046881 out of 1048416 tests passed. 1535 failed (0.146411%) ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%) SINHALA: 271333 out of 271847 tests passed. 514 failed (0.189077%) TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%) TELUGU: 970524 out of 970573 tests passed. 49 failed (0.00504856%) Some of the remaining Telugu and Devanagari issues seem to be Uniscribe eating Anusvara when placed before a non-joiner. Ouch!	2012-07-23 18:19:17 -04:00
Behdad Esfahbod	5791f32915	[Indic] Allow a ZWNJ after SM's Malayalam failures go way down. Other scripts benefitted slightly too. Sinhala had one or two test regressions, but...	2012-07-20 16:26:55 -04:00
Behdad Esfahbod	9e4f94a72c	[Indic] Break syllables at Halant,ZWNJ That's really what Uniscribe does, and explains a lot of peculiarities of Halant,ZWNJ before the base. Sent Telugu from 1% failures to 0.03%. Improved Kannada and Malayalam slightly. Fixed half of Bengali, and did NOT break anything!	2012-07-20 13:48:03 -04:00
Behdad Esfahbod	422ecd2d3c	[Indic] Accept a forced Rakar sequence at the end of syllable In Sinhala, Rakar is formed by Al-Lakuna,ZWJ,Ra. If you put that at the end of a Consonant,Matra syllable, you get a dotted-circle from Uniscribe. Apparently adding a ZWJ before the Al-Lakuna "fixes" that. And people have been encoding that sequence... So, allow a forced "ZWJ,Virama,ZWJ,Ra" sequence at the end of syllables. Fixes some 100 or more of Sinhala failures. Now at 622 only (0.23%).	2012-07-18 23:25:58 -04:00
Behdad Esfahbod	6fc1732003	[Indic] Allow joiners on both sides of Halant at the same time The sequence <ZWJ,Al-Lakuna,ZWJ> is used in Sinhala to explicitly ask for Rakar. Fixes two-thousand Sinhala tests. Not many left.	2012-07-18 17:49:19 -04:00
Behdad Esfahbod	552d19b7a1	[Indic] Treat Register Shifters like Nukta Really this time. Fixes another 18 Khmer tests.	2012-07-18 16:02:33 -04:00
Behdad Esfahbod	dcb527242b	[Indic] Allow joiners before matras Fixes 1 more Devanagari test!	2012-07-18 15:32:26 -04:00
Behdad Esfahbod	391cc03317	[Indic] Allow halant group in Vowel and placeholder syllables Fixes 2 out of 560 Devanagari failures. AND: Fixes 1 out of 2 Tamil failures.	2012-07-18 15:12:49 -04:00
Behdad Esfahbod	ca4e3d3eab	[Indic] Streamline halant/joiner in grammar	2012-07-18 15:05:40 -04:00
Behdad Esfahbod	418d00dffd	[Indic] Minor	2012-07-18 14:57:28 -04:00
Behdad Esfahbod	4c3691d2a3	[Indic] Hopefully minor! Refactoring Indic machine. No semantic change.	2012-07-18 14:23:55 -04:00
Behdad Esfahbod	db8981f1e0	[Indic] Position Khmer Robat It's a visual Repha. Still not positioning logical Repha as occurs in Malayalam. Another 200 Khmer failures fixed. 547 to go. That's better than Devanagari!	2012-07-17 23:42:04 -04:00
Behdad Esfahbod	25bc489498	[Indic] Better categorize Register Shifters and Khmer Various signs Down another 500 or so Khmer failures!	2012-07-17 17:53:03 -04:00
Behdad Esfahbod	34b5714906	[Indic] Treat Khmer Register Shifters more like Nuktas Except that there may be a ZWNJ before a Register Shifter.	2012-07-17 14:09:32 -04:00
Behdad Esfahbod	11e2a601b1	[Indic] Minor	2012-07-17 14:02:28 -04:00
Behdad Esfahbod	c50ed71e9a	[Indic] Recategorize Khmer coeng sign as a separate category OT_Coeng Amend the syllable structure to allow a final subscripted consonant (Coeng+C) and a final subscripted independent vowel (Coeng+V). Fixes another 2k of Khmer failures.	2012-07-17 11:54:28 -04:00
Behdad Esfahbod	deb521dee4	[Indic] Add a separate Coeng class No characters recategorized yet. No semantic change.	2012-07-17 11:37:32 -04:00
Behdad Esfahbod	7d09c98a1f	[Indic] Recognize Register Shifter marks Fixes another 6% of the Khmer failures.	2012-07-16 16:45:22 -04:00
Behdad Esfahbod	a98d0ab186	Make sure HB_BEGIN_DECLS / HB_END_DECLS is only used in public headers So we can use them to switch default visibility to internal if desired, and use these to make only declared symbols public.	2012-07-13 10:19:10 -04:00
Behdad Esfahbod	27aba594c9	Minor	2012-05-24 15:00:01 -04:00
Behdad Esfahbod	18c06e189b	[Indic] Add Uniscribe bug feature for dotted circle For dotted-circle independent clusters, Uniscribe does no Reph shaping for the exact sequence Ra+Halant+25CC. Which also is the only possible sequence with 25CC at the end.	2012-05-11 20:02:14 +02:00
Behdad Esfahbod	9c09928989	[Indic] Allow multiple Consonants in Vowel/NBSP syllables Uniscribe allows multiple Halant+Consonant after a Vowel. Tests: * U+0905,U+094D,U+092B,U+094D,930,94d,930	2012-05-11 18:46:35 +02:00
Behdad Esfahbod	8c0aa486f3	[Indic] Allow two Nuktas per consonant Uniscribe allows up to two nuktas per consonant and one per matra. It does so independent of whether the consonant already has a nukta in it. Tests: * U+0916,U+093C,U+0941 * U+0959,U+093C,U+0941 * U+0916,U+093C,U+093C,U+0941 * U+0959,U+093C,U+093C,U+0941 * U+0916,U+093C,U+093C,U+093C,U+0941 * U+0959,U+093C,U+093C,U+093C,U+0941 * 915,93c,93c,,94d,U+0916,U+093C,U+093C,U+093e,93c,93c	2012-05-11 18:13:42 +02:00
Behdad Esfahbod	3399a06e70	[Indic] Fix U+0952 and similar classification to match Uniscribe See comments.	2012-05-11 17:54:26 +02:00
Behdad Esfahbod	ff24d1081a	[Indic] Don't use syllable serial value 0	2012-05-11 17:07:08 +02:00
Behdad Esfahbod	4be46bade2	[Indic] Fix state machine to backtrack	2012-05-11 14:39:01 +02:00
Behdad Esfahbod	cee7187447	[Indic] Move syllable tracking from Indic to generic layer This is to incorporate it into GSUB/GPOS processing.	2012-05-11 11:41:39 +02:00
Behdad Esfahbod	86e5dd386a	[Indic] Don't give up syllable parsing upon junk	2012-05-09 18:57:37 +02:00
Behdad Esfahbod	ef24cc8c8e	[Indic] Towards multi-cluster syllables and final reordering	2012-05-09 18:10:20 +02:00
Behdad Esfahbod	9ceca3aeb1	Fix ragel regexp in vowel-based syllable As reported by datao zhang on the mailing list.	2012-04-16 21:05:51 -04:00
Behdad Esfahbod	b870afcd1b	Rewrite ragel expression to better match the one on MS spec https://www.microsoft.com/typography/otfntdev/devanot/shaping.aspx	2012-04-16 21:05:11 -04:00
Behdad Esfahbod	d4cc44716c	Move code around, in prep for Thai/Lao shaper	2012-04-07 21:52:28 -04:00
Behdad Esfahbod	461b9b6347	Fix cluster formation in Indic Makes number of failures against Uniscribe with hi_IN dictionary from OO.o to go down from 6334 to 4290. Not bad for a one-line change! Mozilla Bug 729626 - ASAN: heap-buffer-overflow HTML	2012-03-01 18:11:19 -08:00
Behdad Esfahbod	743807a3ce	[Indic] Apply Indic features Find the base consonant and apply basic Indic features accordingly. Nothing complete, but does something for now. Specifically: no Ra handling right now, and no ZWJ/ZWNJ. Number of failing shape-complex tests goes from 174 down to 125. Next: reorder matras.	2011-07-29 16:46:09 -04:00
Behdad Esfahbod	76f76812ac	Shuffle code around, remove shape_plan from complex shapers	2011-07-07 22:25:25 -04:00
Behdad Esfahbod	d69d5ceaa0	[Indic] Well, at least finding syllables works now :) Still not much there.	2011-07-04 12:56:38 -04:00
Behdad Esfahbod	c7fe56a1d5	[Indic] Some of the basic features are global; Mark them so	2011-06-24 19:05:34 -04:00
Behdad Esfahbod	867361c3ad	[indic] Add syllable recognition state machine Using an incredible tool called Ragel.	2011-06-17 18:35:46 -04:00

89 Commits