harfbuzz

Commit Graph

Author	SHA1	Message	Date
Behdad Esfahbod	34c215036f	[Indic] Improve Sinhala base algorithm and reph positioning Sinhala does not have half forms. And most (all?) consonants can be base, except when preceded by ZWJ, which would request a subjoined form. Hence switch the base algorithm to categorize with Khmer, start search at start, and stop at a ZWJ. Also, mark all pos=base consonants after base to be subjoined. Mark base itself to have pos=base. Finally, adjust Sinhala's reph position to after-main. Brings down Sinhala failures from 455 to 328 (0.120656%).	2012-07-23 23:51:29 -04:00
Behdad Esfahbod	49c5ec5144	Minor refactoring	2012-07-23 20:14:13 -04:00
Behdad Esfahbod	c3e6fdc379	[Indic] Improve check on ligatures Only skip actual ligatures, not marks in-between ligature components.	2012-07-23 20:11:42 -04:00
Behdad Esfahbod	771a8f5028	[Indic] exclude ligatures when matching on Indic category If, say, a H,ZWJ,C ligature was formed, we don't want the code to detect that as a Halant. So, ignore ligatures when matching category in final_reordering. Sinhala failures down from 514 to 455 (0.167374%).	2012-07-23 20:09:30 -04:00
Behdad Esfahbod	baacd090df	[Indic] Minor refactoring	2012-07-23 19:51:48 -04:00
Behdad Esfahbod	c7c4de2fb9	[Indic] Remove syllable length check before sorting We now limit syllable lengths in the machine. No need to match here.	2012-07-23 18:25:02 -04:00
Behdad Esfahbod	2cc933aff9	[Indic] Fix cluster formation with left-matras and conjunct forms Test case was: <U+0D15,U+0D4D,U+0D15,U+0D4A>.	2012-07-23 08:23:44 -04:00
Behdad Esfahbod	e6b01a878c	[Indic] Further streamline cluster formation This should address all possible cluster misformations that I had in mind.	2012-07-23 00:11:26 -04:00
Behdad Esfahbod	7b2a7dadd6	[Indic] Merge clusters before sorting This should fix any instabilities in cluster formation that we were speculating may happen with surrounding syllables. Or most of it perhaps.	2012-07-22 23:58:55 -04:00
Behdad Esfahbod	abb3239ef9	[Indic] Update clusters for left-matra even if matra didn't move Fixes crashes reported with left matra under non-uniscribe-bug-compatibility mode.	2012-07-22 23:55:19 -04:00
Behdad Esfahbod	92a1ad7bef	[Indic] Stop searching for base if a post form is found before below form Improves Bengali and Gurmukhi. Malayalam regressed a bit. We will deal with that later.	2012-07-20 18:55:15 -04:00
Behdad Esfahbod	4c450c703f	[Indic] Recompose Bengali Ya,Nukta This is a bunch of hacks for now. Improves Bengali a bit.	2012-07-20 18:13:04 -04:00
Behdad Esfahbod	34ae336f3f	[Indic] Improve Reph AfterMain positioning Fixes 20 out of 48 failing Oriya tests. Failure rate down to 0.066% now.	2012-07-20 16:17:28 -04:00
Behdad Esfahbod	bdd080431a	[Indic] Reposition Oriya Candrabindu Oriya failures down from 0.65% to 0.20%.	2012-07-20 16:03:09 -04:00
Behdad Esfahbod	5f0eaaad12	[Indic] Fix base search in final_reordering Fixes most Malayalam failures. Down from 1.6% to 0.38% now. Fixes a few more in other scripts too.	2012-07-20 15:47:24 -04:00
Behdad Esfahbod	81202bd860	[Indic] Don't attach SM/VD to other characters	2012-07-20 15:14:51 -04:00
Behdad Esfahbod	f31d97e44e	[Indic] Form Telugu Reph out of Ra,Virama,ZWJ Apparently this was approved in Feb 2012. No font yet.	2012-07-20 14:13:35 -04:00
Behdad Esfahbod	30c3d5e9fc	[Indic] Simplify Uniscribe cluster emulation Now that we break syllables on Halant,ZWNJ, this code can be simplified.	2012-07-20 13:56:32 -04:00
Behdad Esfahbod	decf6ffca4	[Indic] Minor!	2012-07-20 13:51:31 -04:00
Behdad Esfahbod	9e4f94a72c	[Indic] Break syllables at Halant,ZWNJ That's really what Uniscribe does, and explains a lot of peculiarities of Halant,ZWNJ before the base. Sent Telugu from 1% failures to 0.03%. Improved Kannada and Malayalam slightly. Fixed half of Bengali, and did NOT break anything!	2012-07-20 13:48:03 -04:00
Behdad Esfahbod	2c372b80f6	[Indic] Better check for applying 'init' Specifically, don't apply 'init' if previous char is a joiner. Fixes some more of Bengali.	2012-07-20 13:37:48 -04:00
Behdad Esfahbod	8ed248de77	[Indic] Minor	2012-07-20 11:42:24 -04:00
Behdad Esfahbod	d0e68dbd0b	[Indic] Implement reph positioning step 5 Not tuned, just copied from step 2. Fixes another 0.5% of Kannada failures. 1% to go.	2012-07-20 11:25:41 -04:00
Behdad Esfahbod	a9e45c32e4	[Indic] Don't let ZWNJ at the end of syllable affect base search Fixes a few Devanagari, half of remaining Kannada failures, quarter for Telugu, and others slightly improved or unchanged.	2012-07-20 11:04:15 -04:00
Behdad Esfahbod	20b68e699f	[Indic] Apply 'cjct' globally Fixes 5 Devanagari failures, and no regressions.	2012-07-20 10:47:46 -04:00
Behdad Esfahbod	51e764de44	[Indic] Unbreak old scriptures Brings down failures with Lohit-Telugu from 57% to 1.40%.	2012-07-20 10:30:24 -04:00
Behdad Esfahbod	900cf3d449	Minor	2012-07-20 10:18:23 -04:00
Behdad Esfahbod	87cd63266e	[Indic] Recategorize some Kannada right matras Kannada failures down from 3.5% to 2.93%.	2012-07-19 21:25:46 -04:00
Behdad Esfahbod	3604d64ced	[Indic] Recategorize GURMUKHI ADDAK It's not in IndicSyllabicCategory.txt. Fixes most of Gurmukhi failures. Failures down from 7.7% to 0.222%!	2012-07-19 21:13:04 -04:00
Behdad Esfahbod	5249f3aee1	[Indic] Unbreak Khmer For Khmer, all consonants are subjoining. No need to look in the font. We were looking in the wrong order anyway.	2012-07-19 20:30:22 -04:00
Behdad Esfahbod	e0475345d5	[Indic] Apply 'akhn' globally Fixes 1.5% more failures for Telugu, 2% for Kannada. Breaks one test in Devanagari.	2012-07-19 20:24:14 -04:00
Behdad Esfahbod	fa247ebe52	[Indic] Better position U+0CD5 Fixes another 5% of Kannada failures.	2012-07-19 19:52:19 -04:00
Behdad Esfahbod	f055442716	[Indic] Lookup consonant position in the font Fixes most failures of Oriya, and improves others a bit.	2012-07-19 16:20:21 -04:00
Behdad Esfahbod	8c973ebf0f	[Indic] Implement per-script matra positioning Following what the spec says. Brings down Telugu failures from 40% to 3.75%, and Kannada failures from 44% to 10%. Does NOT affect other scripts' test results.	2012-07-19 13:25:08 -04:00
Behdad Esfahbod	8bb32458f9	[Indic] More refactoring	2012-07-19 13:04:44 -04:00
Behdad Esfahbod	9ccc6382ba	[Indic] Minor refactoring	2012-07-19 12:45:31 -04:00
Behdad Esfahbod	be8b9f5f71	[Indic] Start refactoring different matra positions per script	2012-07-19 12:11:12 -04:00
Behdad Esfahbod	10cdc94eee	[Indic] In final reordering, find base, even if it disappeared POS_BASE can disappear if base ligated backward. Define base as last with position not after base. Fixes a few hundred of Sinhala failures with Iskoola Pota.	2012-07-18 17:43:23 -04:00
Behdad Esfahbod	9c4d24a3a6	[Indic] Minor	2012-07-18 17:29:10 -04:00
Behdad Esfahbod	3285e107c9	[Indic] Implement Sinhala "Al Lakuna" Reph behavior In Sinhala, Reph is formed only explicitly, by the presence of a ZWJ.	2012-07-18 17:22:14 -04:00
Behdad Esfahbod	552d19b7a1	[Indic] Treat Register Shifters like Nukta Really this time. Fixes another 18 Khmer tests.	2012-07-18 16:02:33 -04:00
Behdad Esfahbod	e8cd81f76d	[Indic] Minor	2012-07-18 16:00:20 -04:00
Behdad Esfahbod	69f26bf39c	[Indic] Fix Matra reordering when base is at end of syllable For example: U+915,U+200c,U+93f Fixes last Tamil failure!	2012-07-18 15:47:51 -04:00
Behdad Esfahbod	075d671f10	[Indic] Fix out-of-bounds array access	2012-07-18 15:41:53 -04:00
Behdad Esfahbod	14dbdd9e39	[Indic] Unbreak Tamil Tamil has only about 150 failures now!	2012-07-18 13:13:03 -04:00
Behdad Esfahbod	db8981f1e0	[Indic] Position Khmer Robat It's a visual Repha. Still not positioning logical Repha as occurs in Malayalam. Another 200 Khmer failures fixed. 547 to go. That's better than Devanagari!	2012-07-17 23:42:04 -04:00
Behdad Esfahbod	25bc489498	[Indic] Better categorize Register Shifters and Khmer Various signs Down another 500 or so Khmer failures!	2012-07-17 17:53:03 -04:00
Behdad Esfahbod	25e302da9a	[Indic] Minor	2012-07-17 14:25:14 -04:00
Behdad Esfahbod	5d32690a34	[Indic] For scripts without Half forms, always choose first consonant as base In such scripts (i.e. Khmer), a ZWJ/ZWNJ shouldn't stop the search for base. So, instead just choose the first consonant as base directly. Test sequence: U+1798,200c,U+17C9,U+17D2,U+179B,U+17C1,U+17C7	2012-07-17 14:23:28 -04:00
Behdad Esfahbod	0201e0a464	[Indic] Apply 'cfar' for Khmer Mark stuff after a pre-base reordering Ro 'cfar'. Used in Khmer. This allows distinguishing the following cases with MS Khmer fonts: U+1784,U+17D2,U+179A,U+17D2,U+1782 U+1784,U+17D2,U+1782,U+17D2,U+179A	2012-07-17 13:56:24 -04:00

175 Commits