harfbuzz

Commit Graph

Author	SHA1	Message	Date
Behdad Esfahbod	5d874d566f	[GPOS] Fix mark-to-mark positioning when one of the marks is a ligature This commit: `a3313e5400` broke MarkMarkPos when one of the marks itself is a ligature. That regressed 26 Tibetan tests (up from zero!). Fix that. Tibetan back to zero.	2012-07-28 21:05:25 -04:00
Behdad Esfahbod	6411e74caf	[Indic] Reposition Gurmukhi top matras to after post The font is forming a post-base consonant in some samples, and Uniscribe positions top matra on the post-base. Do the same. Gurmukhi failures down from 59 to 41 (0.0674242%).	2012-07-24 13:48:49 -04:00
Behdad Esfahbod	c3f769ba09	[Indic] Ignore Uniscribe output containing two zero-width space glyphs Uniscribe is buggy and sometimes /eats/ a mark next to a non-joiner. Most of Malayalam failures were actually hitting this bug. Ignore test output with two zero-width space glyphs. This is a hack until we build up the test suite infrastructure better. Bengali went down by 9, Devanagari by 2, Kannada by 130, Malayalam down from 1197 to 307, Sinhala down by 16, Telugu down by 26. New stats: BENGALI: 353996 out of 354285 tests passed. 289 failed (0.0815727%) DEVANAGARI: 693573 out of 693628 tests passed. 55 failed (0.00792932%) GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%) GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%) KANNADA: 951086 out of 951913 tests passed. 827 failed (0.0868777%) KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%) MALAYALAM: 1048109 out of 1048416 tests passed. 307 failed (0.0292823%) ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%) SINHALA: 271715 out of 271847 tests passed. 132 failed (0.0485567%) TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%) TELUGU: 970550 out of 970573 tests passed. 23 failed (0.00236973%)	2012-07-24 13:26:32 -04:00
Behdad Esfahbod	65c43accdc	[Indic] Better position left-matra in Malayalam Just put it before base, which is what's expected. Malayalam failures down from 1559 to 1197 (0.114172%). BENGALI: 353988 out of 354285 tests passed. 297 failed (0.0838308%) DEVANAGARI: 693571 out of 693628 tests passed. 57 failed (0.00821766%) GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%) GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%) KANNADA: 950956 out of 951913 tests passed. 957 failed (0.100534%) KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%) MALAYALAM: 1047219 out of 1048416 tests passed. 1197 failed (0.114172%) ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%) SINHALA: 271699 out of 271847 tests passed. 148 failed (0.0544424%) TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%) TELUGU: 970524 out of 970573 tests passed. 49 failed (0.00504856%)	2012-07-24 03:36:47 -04:00
Behdad Esfahbod	88f413b56f	[Indic] Implement Reph+Ya-Phalaa interaction The sequence Ra,H,Ya in Bengali is ambiguous and Unicode encoded that to get Ya-Phalaa, one would place ZWJ before Halant. Ie. a ZWJ,H sequence requests subjoining, while a H,ZWJ requests Half form. Implement that. Bengali failures go down from 377 to 297 (0.0838308%). Gujarati is down by 4 to 17 (0.0046384%). Kannada is down by 226 to 957 (0.100534%). Current status: BENGALI: 353988 out of 354285 tests passed. 297 failed (0.0838308%) DEVANAGARI: 693571 out of 693628 tests passed. 57 failed (0.00821766%) GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%) GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%) KANNADA: 950956 out of 951913 tests passed. 957 failed (0.100534%) KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%) MALAYALAM: 1046857 out of 1048416 tests passed. 1559 failed (0.148701%) ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%) SINHALA: 271699 out of 271847 tests passed. 148 failed (0.0544424%) TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%) TELUGU: 970524 out of 970573 tests passed. 49 failed (0.00504856%)	2012-07-24 03:04:36 -04:00
Behdad Esfahbod	330b329c89	[Indic] Unmark U+17D1 KHMER SIGN VIRIAM to NOT be a Virama Fixes another 1 Khmer failure. Down to 30 (0.0100293%) now.	2012-07-24 02:25:26 -04:00
Behdad Esfahbod	d90b8e841e	[Indic] Reposition Khmer prebase-reordering Ra around split matras In Khmer coeng model, a V,Ra can go after matras. If it goes after a split matra, it should be reordered to before the left part of such matra. Khmer failures down from 136 to 39 (0.0130381%).	2012-07-24 02:11:18 -04:00
Behdad Esfahbod	7573799126	[Indic] Position Khmer U+17CE Fixes another 6 Khmer failures. Now at 136 (0.0454661%).	2012-07-24 01:32:07 -04:00
Behdad Esfahbod	2278eefcdb	[Indic] In Sinhala, form forced Reph even if no other consonant found Fixes another 10 Sinhala failures. Down to 148 (0.0544424%).	2012-07-24 00:31:10 -04:00
Behdad Esfahbod	71fd5e80ad	[Indic] Further adjust base algorithm for Sinhala Apparently if there is C,V,ZWJ,C, the first C will be base, but if it's C,ZWJ,V,C, the second one will be. Note that Uniscribe implements this differently, by breaking syllable in the case of C,ZWJ,V,C and putting the first consonant in one syllable and the rest in the next syllable. Sinhala failures down from 208 to 158 (0.0581209%). No changes to Khmer.	2012-07-24 00:21:16 -04:00
Behdad Esfahbod	73d71cc527	[Indic] End Vowel-based syllable at ZWJ One Devanagari test regressed, plus 10 Malayalam (at 1545 now). Fixed 120 Sinhala failures. Now at 208 (0.0765136%).	2012-07-24 00:09:12 -04:00
Behdad Esfahbod	34c215036f	[Indic] Improve Sinhala base algorithm and reph positioning Sinhala does not have half forms. And most (all?) consonants can be base, except when preceded by ZWJ, which would request a subjoined form. Hence switch the base algorithm to categorize with Khmer, start search at start, and stop at a ZWJ. Also, mark all pos=base consonants after base to be subjoined. Mark base itself to have pos=base. Finally, adjust Sinhala's reph position to after-main. Brings down Sinhala failures from 455 to 328 (0.120656%).	2012-07-23 23:51:29 -04:00
Behdad Esfahbod	771a8f5028	[Indic] exclude ligatures when matching on Indic category If, say, a H,ZWJ,C ligature was formed, we don't want the code to detect that as a Halant. So, ignore ligatures when matching category in final_reordering. Sinhala failures down from 514 to 455 (0.167374%).	2012-07-23 20:09:30 -04:00
Behdad Esfahbod	42848453bf	[Thai] Reorder U+0E3A THAI VOWEL SIGN PHINTHU Uniscribe reorders U+0E3A to be after U+0E38 and U+0E39. We do that by modifying the ccc for U+0E3A. Fixes the two remaining Thai failures (see previous commit).	2012-07-23 13:52:07 -04:00
Behdad Esfahbod	4a7f4f3e56	[Thai] Adjust SARA AM reordering to match Uniscribe Adjust the list of marks before SARA AM that get the reordering treatment. Also adjust cluster formation to match Uniscribe. With Wikipedia test data, now I see: - For Thai, with the Angsana New font from Win7, I see 54 failures out of over 4M tests (0.00129107%). Of the 54, two are legitimate reordering issues (fix coming soon), and the other 52 are simply Uniscribe using a zero-width space char instead of an unknown character for missing glyphs. No idea why. The missing-glyph sequences include one that is a Thai character followed by an Arabic Sokun. Someone confused it with Nikhahit I assume! - For Lao, with the Dokchampa font from Win7, 33 tests fail out of 54k (0.0615167%). All seem to be insignificant mark positioning with two marks on a base. Have to investigate.	2012-07-23 13:15:33 -04:00
Behdad Esfahbod	60554f14d8	[Indic] Merge in Malayalam tests From: http://silpa.org.in/pub/tests/hb/ml/ml-harfbuzz-testdata.txt	2012-07-22 23:23:56 -04:00
Behdad Esfahbod	5c7081770c	[Indic] Add extensive Sinhala tests Generated by: http://git.savannah.gnu.org/cgit/sinhala.git/plain/utils/gen-unicode-sinhala.py	2012-07-22 23:20:27 -04:00
Behdad Esfahbod	2efe4707b1	[Indic] Add Sinhala tests Merge tests from: http://git.savannah.gnu.org/cgit/sinhala.git/plain/patches/icu-sinhala-rendering.txt	2012-07-22 23:17:59 -04:00
Behdad Esfahbod	3d4c111b7a	Add a test case	2012-07-20 19:34:39 -04:00
Behdad Esfahbod	bdd080431a	[Indic] Reposition Oriya Candrabindu Oriya failures down from 0.65% to 0.20%.	2012-07-20 16:03:09 -04:00
Behdad Esfahbod	87cd63266e	[Indic] Recategorize some Kannada right matras Kannada failures down from 3.5% to 2.93%.	2012-07-19 21:25:46 -04:00
Behdad Esfahbod	c87bcddb10	[Indic] Add failing test for Kannada	2012-07-19 20:03:25 -04:00
Behdad Esfahbod	deeb540a74	[test] Ignore tests with DOTTED CIRCLE in the output	2012-07-19 11:30:48 -04:00
Behdad Esfahbod	422ecd2d3c	[Indic] Accept a forced Rakar sequence at the end of syllable In Sinhala, Rakar is formed by Al-Lakuna,ZWJ,Ra. If you put that at the end of a Consonant,Matra syllable, you get a dotted-circle from Uniscribe. Apparently adding a ZWJ before the Al-Lakuna "fixes" that. And people have been encoding that sequence... So, allow a forced "ZWJ,Virama,ZWJ,Ra" sequence at the end of syllables. Fixes some 100 or more of Sinhala failures. Now at 622 only (0.23%).	2012-07-18 23:25:58 -04:00
Behdad Esfahbod	10cdc94eee	[Indic] In final reordering, find base, even if it disappeared POS_BASE can disappear if base ligated backward. Define base as last with position not after base. Fixes a few hundred of Sinhala failures with Iskoola Pota.	2012-07-18 17:43:23 -04:00
Behdad Esfahbod	3285e107c9	[Indic] Implement Sinhala "Al Lakuna" Reph behavior In Sinhala, Reph is formed only explicitly, by the presence of a ZWJ.	2012-07-18 17:22:14 -04:00
Behdad Esfahbod	552d19b7a1	[Indic] Treat Register Shifters like Nukta Really this time. Fixes another 18 Khmer tests.	2012-07-18 16:02:33 -04:00
Behdad Esfahbod	69f26bf39c	[Indic] Fix Matra reordering when base is at end of syllable For example: U+915,U+200c,U+93f Fixes last Tamil failure!	2012-07-18 15:47:51 -04:00
Behdad Esfahbod	391cc03317	[Indic] Allow halant group in Vowel and placeholder syllables Fixes 2 out of 560 Devanagari failures. AND: Fixes 1 out of 2 Tamil failures.	2012-07-18 15:12:49 -04:00
Behdad Esfahbod	418d00dffd	[Indic] Minor	2012-07-18 14:57:28 -04:00
Behdad Esfahbod	25bc489498	[Indic] Better categorize Register Shifters and Khmer Various signs Down another 500 or so Khmer failures!	2012-07-17 17:53:03 -04:00
Behdad Esfahbod	34b5714906	[Indic] Treat Khmer Register Shifters more like Nuktas Except that there may be a ZWNJ before a Register Shifter.	2012-07-17 14:09:32 -04:00
Behdad Esfahbod	0201e0a464	[Indic] Apply 'cfar' for Khmer Mark stuff after a pre-base reordering Ro 'cfar'. Used in Khmer. This allows distinguishing the following cases with MS Khmer fonts: U+1784,U+17D2,U+179A,U+17D2,U+1782 U+1784,U+17D2,U+1782,U+17D2,U+179A	2012-07-17 13:56:24 -04:00
Behdad Esfahbod	55f70ebfb9	[Indic] Position final subjoined consonants (and vowels) after matras In Khmer, a final subjoined consonant or independent vowel can occur after matras. This final subjoined thing should NOT be reordered to before the matra even though it's subjoined. Fixes another 1k of the Khmer failures. Not much left really.	2012-07-17 12:50:13 -04:00
Behdad Esfahbod	c50ed71e9a	[Indic] Recategorize Khmer coeng sign as a separate category OT_Coeng Amend the syllable structure to allow a final subscripted consonant (Coeng+C) and a final subscripted independent vowel (Coeng+V). Fixes another 2k of Khmer failures.	2012-07-17 11:54:28 -04:00
Behdad Esfahbod	74ccc6a132	[Indic] Move Halant with after-base consonants Normally, we attach the Halant to the previous character and move it with it. For after-base consonants however, the Halant "belongs" to the consonant after, so attach it so. This fixes Bengali sequences involving post-base consonant Ya, which should ligate with the Halant to form Ya Phala, but previously a reordered matra was blocking the ligation.	2012-07-17 11:16:19 -04:00
Behdad Esfahbod	d5c4edcdd6	[Indic] Apply presentation-forms features all at once Seems like this is what Uniscribe is doing, and does not break any fonts we tested (with Devanagari, Malayalam, Khmer, and Bengali), while fixing some Ra Phala sequences for Bengali with Vrinda. Fixes another 2% of Bengali failures (a couple more to go).	2012-07-17 10:40:59 -04:00
Behdad Esfahbod	6de103547e	[test/arabic] Add Arabic tests for mark skipping Expose a bug with Khaled's Hussaini Nastaleeq font.	2012-07-16 22:46:52 -04:00
Behdad Esfahbod	1167c7bfc9	Minor	2012-07-11 18:00:28 -04:00
Behdad Esfahbod	aa116582e6	Minor	2012-07-11 18:00:28 -04:00
Behdad Esfahbod	b0a6e58bb3	s/script-punjabi/script-gurmukhi/	2012-06-04 10:21:22 -04:00
Behdad Esfahbod	4efdffec09	Minor Malayalam test case From https://bugs.freedesktop.org/show_bug.cgi?id=45166	2012-05-28 10:45:50 -04:00
Behdad Esfahbod	dfff5b3021	Add Myanmar test case	2012-05-28 10:45:50 -04:00
Behdad Esfahbod	ff3524c21a	Add Arabic diacritics tests	2012-05-23 21:50:43 -04:00
Behdad Esfahbod	a6de53664d	Add CJK Compatibility Ideographs tests From: http://people.mozilla.org/~jdaggett/tests/cjkcompat.html	2012-05-18 15:04:35 -04:00
Behdad Esfahbod	f538fcb538	[test] Make tool usage easier by not requiring "--stdin" Just default to it. Added "--help" instead to get usage.	2012-05-12 15:34:40 +02:00
Behdad Esfahbod	a3273e30bb	[Indic] Add more Malayalam tests	2012-05-12 13:34:18 +02:00
Behdad Esfahbod	5b16de97bc	[Indic] Add tests for dottedcircle	2012-05-11 19:55:42 +02:00
Behdad Esfahbod	c071b99f15	[Indic] Add test for Left Matra with Halant Uniscribe doesn't move the Halant, we do. And do a broken job of it now.	2012-05-11 16:22:46 +02:00
Behdad Esfahbod	b20c9ebaf5	[Indic] Add test for matra group The spec says: "[{M}+[N]+[H]]", and that's what Uniscribe implements. We instead do: "{M+[N]+[H]}", which means we allow Nukta and Halant after all Matras, not just the last one. It makes more sense.	2012-05-10 18:31:17 +02:00
Behdad Esfahbod	61a58e26a5	[Indic] Add tricky reordering test cases In the case of Consonant,LeftMatra,Halant, Uniscribe leaves the Halant where it is, but we want to move it with the Matra as that makes more logical sense.	2012-05-10 14:43:53 +02:00
Behdad Esfahbod	3943293a99	[Indic] Add joiner test cases for Devanagari	2012-05-09 15:27:56 +02:00
Behdad Esfahbod	2214a03900	Add hb-diff-ngrams	2012-05-09 09:54:54 +02:00
Behdad Esfahbod	178e6dce01	Add N-gram generator	2012-05-09 08:57:29 +02:00
Behdad Esfahbod	98669ceb77	Use groupby()	2012-05-09 08:16:15 +02:00
Behdad Esfahbod	c438a14b62	Add hb-diff-stat	2012-05-09 07:45:17 +02:00
Behdad Esfahbod	1058d031e2	Make hb-diff-filter-failures retain all test info for failed tests	2012-05-09 07:35:28 +02:00
Behdad Esfahbod	f1eb008cc7	Add hb-diff-colorize Accepts --format=html now.	2012-05-09 00:01:50 +02:00
Behdad Esfahbod	9155e4ffe0	Cleanup diff Doesn't do --color anymore. That will go into a new hb-diff-colorize tool.	2012-05-08 22:44:21 +02:00
Behdad Esfahbod	7d22135b4c	Make hb-diff faster	2012-05-08 19:38:49 +02:00
Behdad Esfahbod	a93e238e05	More tests	2012-05-08 18:55:29 +02:00
Behdad Esfahbod	585b107cde	Add test cases for a minority language using Bengali U+0985 BENGALI LETTER A followed by U+09D7 BENGALI AU LENGTH MARK. According to Bobby de Vos on the mailing list, this results in a dotted circle with most shaping engines, but is a legitimate sequence in this minority language. We reached the consensus on the list to NOT implement dotted-circle in HarfBuzz.	2012-04-24 16:00:50 -04:00
Behdad Esfahbod	0290bbf861	Add another Thai test	2012-04-17 10:28:21 -04:00
Behdad Esfahbod	4d85252bda	Add Japanese test data from Adobe's Kazuraki font ligatures	2012-04-16 15:54:26 -04:00
Behdad Esfahbod	f9746b600a	Minor	2012-04-12 09:59:26 -04:00
Behdad Esfahbod	7470b0ff80	Add Mongolian test case	2012-04-12 09:44:27 -04:00
Behdad Esfahbod	a4976447cd	Add Hangul test	2012-04-11 17:48:40 -04:00
Behdad Esfahbod	e95d912b3b	Fix diff tool	2012-04-11 17:33:02 -04:00
Behdad Esfahbod	e099dd6592	Add Thai test case for SARA AM decomposition	2012-04-10 10:47:33 -04:00
Behdad Esfahbod	4450dc9354	Move around	2012-04-07 22:07:23 -04:00
Behdad Esfahbod	aaa25d5f45	Add Hangul test case Composed, and decomposed, of the same text.	2012-04-05 17:27:23 -04:00
Behdad Esfahbod	406044986a	Add Hebrew diacritics test cases From: https://bugzilla.mozilla.org/show_bug.cgi?id=662055	2012-03-06 20:24:31 -05:00
Behdad Esfahbod	7a70ca78e0	Add test case from https://bugzilla.mozilla.org/show_bug.cgi?id=714067	2012-02-21 11:31:47 -05:00
Behdad Esfahbod	1a5a91dc0d	Add a few more tests	2012-01-22 19:58:23 -05:00
Behdad Esfahbod	1795f3a222	Add a couple Thai test cases from Thep	2012-01-22 19:29:45 -05:00
Behdad Esfahbod	ec3f506682	Add Devanagari test from Tom Hacohen	2012-01-22 19:10:55 -05:00
Behdad Esfahbod	71be4ca3dd	Also ignore "ChangeLog" in manifests	2012-01-22 16:26:49 -05:00
Behdad Esfahbod	3c9a39ecd6	Remove newline	2012-01-22 16:21:19 -05:00
Behdad Esfahbod	e4ccbfe276	Allow --color=html in hb-diff Not that useful right now as we don't escape < and >. Perhaps another tool can be added to convert the ANSI output to HTML.	2012-01-22 16:07:32 -05:00
Behdad Esfahbod	8f80f93491	More shoveling around	2012-01-21 20:03:25 -05:00
Behdad Esfahbod	c78c6e9844	Cleanup	2012-01-21 19:55:16 -05:00
Behdad Esfahbod	ab94a9c542	Distribute testing tools	2012-01-21 19:43:58 -05:00
Behdad Esfahbod	3e86feb54c	Speed up colorless diff	2012-01-21 19:40:30 -05:00
Behdad Esfahbod	1e58df6034	Cleanup manifest code	2012-01-21 19:37:31 -05:00
Behdad Esfahbod	956d552e10	Port hb-manifest-update to Python	2012-01-21 19:31:51 -05:00
Behdad Esfahbod	3a34e9e351	Ignore Broken Pipe errors	2012-01-21 19:15:41 -05:00
Behdad Esfahbod	f22089ac24	Misc fixes	2012-01-20 21:22:14 -05:00
Behdad Esfahbod	96968bfae5	Port hb-manifest-read to Python	2012-01-20 21:16:34 -05:00
Behdad Esfahbod	a59ed46fa4	Add final residues from test-shape-complex	2012-01-20 20:56:32 -05:00
Behdad Esfahbod	820e0ed318	Add Punjabi tests from test-shape-complex also	2012-01-20 20:51:52 -05:00
Behdad Esfahbod	a7d71c1057	Add Tamil test data from Muguntharaj Subramanian	2012-01-20 20:50:09 -05:00
Behdad Esfahbod	5992a9941e	Import test data from late test-shape-complex	2012-01-20 20:48:14 -05:00
Behdad Esfahbod	46ac456477	Fix Unicode encoding issue	2012-01-20 19:32:17 -05:00
Behdad Esfahbod	ad34e39a4a	Make test tools interactive By bypassing readlines() buffering.	2012-01-20 18:40:25 -05:00
Behdad Esfahbod	91540a7d97	Move most testing logic into hb_test_tools.py The actual utils are one-liners now.	2012-01-20 18:28:10 -05:00
Behdad Esfahbod	66aa080033	Remove test-shape-complex New shaping testsuite and framework coming.	2012-01-20 17:36:10 -05:00
Behdad Esfahbod	ed459bfb63	Add hb-unicode-encode	2012-01-20 17:24:05 -05:00
Behdad Esfahbod	b12c4d4361	Add hb-diff-filter-failures	2012-01-20 17:17:44 -05:00
Behdad Esfahbod	d4bffbc55b	Move	2012-01-20 17:16:35 -05:00
Behdad Esfahbod	45f640c98d	Minor	2012-01-20 14:24:21 -05:00
Behdad Esfahbod	47ca766a9c	Minor	2012-01-20 14:21:53 -05:00
Behdad Esfahbod	8f1db07894	[test/shaping] Add some Indic test data for the new test suite Imported from UTRRS.	2012-01-20 14:00:44 -05:00
Behdad Esfahbod	11267aef36	Fix	2012-01-20 13:57:14 -05:00
Behdad Esfahbod	4e84ce48d5	Move hb-diff to test/shaping/	2012-01-20 13:51:22 -05:00
Behdad Esfahbod	f868e1b84d	Add hb-unicode-decode	2012-01-20 13:50:05 -05:00
Behdad Esfahbod	9ab23ef474	Minor	2012-01-20 13:49:56 -05:00
Behdad Esfahbod	c8d81db033	Recognize more characters	2012-01-20 13:39:27 -05:00
Behdad Esfahbod	0016d4662d	[test] Make hb-unicode-prettyname take a --stdin option	2012-01-20 13:31:59 -05:00
Behdad Esfahbod	ad8c6446f2	[test/shaping] Add hb-unicode-prettyname	2012-01-20 13:27:40 -05:00
Behdad Esfahbod	e900869b0f	[test/shaping] Add hb-read-manifest	2012-01-19 20:28:15 -05:00
Behdad Esfahbod	a211cd3ffc	Ignore AUTHORS also	2012-01-19 20:27:53 -05:00
Behdad Esfahbod	a33e46cf7d	[test/shaping] Add hb-update-manifests	2012-01-19 15:44:55 -05:00
Behdad Esfahbod	d4de562adf	Start adding new shaping test suite together	2012-01-19 15:21:04 -05:00

513 Commits