harfbuzz

Commit Graph

Author	SHA1	Message	Date
Behdad Esfahbod	26c836e53d	[indic] Handle "Cantillation marks for the Samaveda"	2014-05-21 18:35:48 -04:00
Behdad Esfahbod	29531128f2	[indic] Improve reph formation of Sinhala and Telugu Sinhala and Telugu use "explicit" reph. That is, the reph is formed by a Ra,H,ZWJ sequence. Previously, upon detecting this sequence, we were checking whether the 'rphf' feature applies to the first two glyphs of the sequence. This is how the Microsoft fonts are designed. However, testing with Noto shows that apparently Uniscribe also forms the reph if the lookup ligates all three glyphs. So, try both sequences. Doesn't affect test results for Sinhala or Telugu. https://code.google.com/a/google.com/p/noto-alpha/issues/detail?id=232	2014-05-15 14:04:02 -06:00
Behdad Esfahbod	b082ef373c	Typo	2014-04-25 11:48:10 -07:00
Behdad Esfahbod	828e109c7a	[indic] Fix-up zero-context matching commit `b5a0f69e47` Author: Behdad Esfahbod <behdad@behdad.org> Date: Thu Oct 17 18:04:23 2013 +0200 [indic] Pass zero-context=false to would_substitute for newer scripts For scripts without an old/new spec distinction, use zero-context=false. This changes behavior in Sinhala / Khmer, but doesn't seem to regress. This will be useful and used in Javanese. The intention was to change zero-context from true to false for scripts that don't have old-vs-new specs. However, checking the code, looks like we essentially change zero-context to always be true; ie. we only changed things for old-spec, and we broke them. That's what causes this bug: https://bugs.freedesktop.org/show_bug.cgi?id=76705 The root of the bug is here: /* Use zero-context would_substitute() matching for new-spec of the main * Indic scripts, but not for old-spec or scripts with one spec only. */ bool zero_context = indic_plan->config->has_old_spec || !indic_plan->is_old_spec; Note that is_old_spec itself is: indic_plan->is_old_spec = indic_plan->config->has_old_spec && ((plan->map.chosen_script[0] & 0x000000FF) != '2'); It's easy to show that zero_context is now always true. What we really meant was: bool zero_context = indic_plan->config->has_old_spec && !indic_plan->is_old_spec; Ie, "&&" instead of "||". We made this change supposedly to make Javanese work. But apparently we got it working regardless! So I'm going to fix this to only change the logic for old-spec and not touch other cases.	2014-04-18 16:53:34 -07:00
Behdad Esfahbod	0682ddd05c	[indic] Support U+17DD KHMER SIGN ATTHACAN As requested by Martin Hosken on the list.	2014-04-08 16:03:35 -07:00
Behdad Esfahbod	3d6ca0d32e	[ot] Simplify normalization_preference again No shaper has more than one behavior re this, so no need for a callback.	2013-12-31 16:35:37 +08:00
Behdad Esfahbod	71b4c999a5	Revert "Zero marks by GDEF for Tibetan" This reverts commit `d5bd0590ae`. The reasoning behind that logic was flawed and made under a misunderstanding of the original problem, and caused regressions as reported by Jonathan Kew in thread titled "tibetan marks" in Oct 2013. Apparently I had fixed the original problem with this commit: `7e08f1258d` So, revert the faulty commit and everything seems to be in good shape.	2013-10-28 00:43:27 +01:00
Behdad Esfahbod	46a863d91d	[indic] Adjust pref reordering logic For Javanese (pref_len == 1) only reorder if it didn't ligate. That's sensible, and what the spec says. For other Indic (pref_len > 1) only reorder if ligated. Doesn't change any test numbers.	2013-10-27 23:28:12 +01:00
Behdad Esfahbod	ddce2d8df6	[indic] Improve positioning of post-base bells and whistles Bug 58714 - Kannada u+0cb0 u+200d u+0ccd u+0c95 u+0cbe does not provide same results as Windows8 https://bugs.freedesktop.org/show_bug.cgi?id=58714 Test with U+0CB0,U+200D,U+0CCD,U+0C95,U+0CBF and tunga.ttf. Improves some scripts. Improves Bengali too, but numbers are up because we produce better results than Uniscribe for some sequences now. New numbers: BENGALI: 353724 out of 354188 tests passed. 464 failed (0.131004%) DEVANAGARI: 707307 out of 707394 tests passed. 87 failed (0.0122987%) GUJARATI: 366349 out of 366457 tests passed. 108 failed (0.0294714%) GURMUKHI: 60732 out of 60747 tests passed. 15 failed (0.0246926%) KANNADA: 951190 out of 951913 tests passed. 723 failed (0.0759523%) KHMER: 299070 out of 299124 tests passed. 54 failed (0.0180527%) MALAYALAM: 1048140 out of 1048334 tests passed. 194 failed (0.0185056%) ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%) SINHALA: 271662 out of 271847 tests passed. 185 failed (0.068053%) TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%) TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)	2013-10-18 18:17:29 +02:00
Behdad Esfahbod	d5bd0590ae	Zero marks by GDEF for Tibetan See: http://lists.freedesktop.org/archives/harfbuzz/2013-April/003101.html	2013-10-18 18:17:29 +02:00
Behdad Esfahbod	c16012e901	[indic] Add Javanese support! Seems to be working just fine!	2013-10-18 18:17:29 +02:00
Behdad Esfahbod	9a49351cc2	[indic] Switch pref logic to use _hb_glyph_info_substituted() See comments from caveat! Seems to work fine. This is useful for Javanese which has an atomically encoded pre-base reordering Ra which should only be reordered if it was substituted by the pref feature.	2013-10-18 11:25:24 +02:00
Behdad Esfahbod	f175aa33c5	[indic] Fix compiler warnings	2013-10-18 11:25:24 +02:00
Behdad Esfahbod	a1f7b28561	[otlayout] Switch over from old is_a_ligature() to IS_LIGATED Impact should be minimal and positive.	2013-10-18 11:25:24 +02:00
Behdad Esfahbod	3ddf892b53	[otlayout] Renaming	2013-10-18 11:21:15 +02:00
Behdad Esfahbod	8f9ec92dfc	[indic] Adjust Javanese base algorithm	2013-10-17 19:52:47 +02:00
Behdad Esfahbod	74f4bbf056	[indic] Towards supporting atomically-encoded prebase-reorderings	2013-10-17 19:07:53 +02:00
Behdad Esfahbod	efed40b975	[indic] Minor refactoring of reph handling	2013-10-17 18:50:11 +02:00
Behdad Esfahbod	684fe59ff8	[indic] Minor refactoring of would_substitute()	2013-10-17 18:30:06 +02:00
Behdad Esfahbod	b5a0f69e47	[indic] Pass zero-context=false to would_substitute for newer scripts For scripts without an old/new spec distinction, use zero-context=false. This changes behavior in Sinhala / Khmer, but doesn't seem to regress. This will be useful and used in Javanese.	2013-10-17 18:04:23 +02:00
Behdad Esfahbod	c4e71ff36d	[indic] Clean up Khmer and Sinhala base finding algorithm	2013-10-17 17:04:47 +02:00
Behdad Esfahbod	e10453e6fb	[indic] Add BASE_POS_LAST_SINHALA Previously we planted this into the mode used for Khmer. There's not really much in common between the two, so separate again.	2013-10-17 16:49:06 +02:00
Behdad Esfahbod	9ac6b01e0c	[indic] Adjust Sinhala cluster merging under uniscribe Similar to `190c8f2b60` but for Sinhala.	2013-10-17 16:27:38 +02:00
Behdad Esfahbod	6b2abdcd20	[indic] Improve clusters in presence of reph	2013-10-17 13:15:43 +02:00
Behdad Esfahbod	42d0f55cbc	[indic] Apply calt,clig in the same stage as presentation features Which means these two are applied per-syllable now. Apparently in some Khmer fonts the clig interacts with presentation features. Test case: U+1781,U+17D2,U+1789,U+17BB,U+17C6 with Mondulkiri-R.ttf should produce one big ligature.	2013-10-17 13:06:22 +02:00
Behdad Esfahbod	ae9a5834df	[indic] Fix pref vs blwf interaction If a glyph can be both blwf and pref, we were wrongly sorting it in the post position instead of below position.	2013-10-17 12:24:55 +02:00
Behdad Esfahbod	c7dacac02c	[indic] Don't apply blwf before base under old-spec mode Test case: U+09AC,U+09CD,U+09A6 with Lohit-Bengali 2.5.3.	2013-10-17 12:20:46 +02:00
Behdad Esfahbod	3756efaf4e	[indic] Misc harmless fixes! First, we were abusing OT_VD instead of OT_A. Fix that by moving OT_A in the grammar where it belongs (which is different from what the spec says). Also, only allow medial consonants after all other consonants. This doesn't affect any current character. Finally, fix Halant attachment in presence of medial consonants. Again, this currently doesn't affect any sequence. I lied. There's Gurmukhi U+0A75 which is Consonant_Medial. Uniscribe allows one of those in each of these positions: before matras, after matras and before syllable modifiers, and after syllable modifiers! We currently just allow unlimited numbers of it, before matras.	2013-10-16 19:06:29 +02:00
Behdad Esfahbod	28d5daec94	[indic] More granular post-base cluster merging!	2013-10-16 12:32:12 +02:00
Behdad Esfahbod	9cb59d460e	[indic] Fix cluster merging of left matras The merge_clusters there was totally broken.	2013-10-16 11:34:07 +02:00
Behdad Esfahbod	190c8f2b60	[indic] Adjust cluster merging under uniscribe mode for Tamil Apparently Uniscribe Tamil shaper doesn't ship chubby clusters for Tamil. Adjust to that.	2013-10-16 11:33:18 +02:00
Behdad Esfahbod	f5299eff5c	[indic] Simplify reph logic Shouldn't break anything.	2013-10-15 18:21:32 +02:00
Behdad Esfahbod	65a929b1c0	[indic] If Malayalam dot-reph formed a ligature, don't move it Rachana-0.6 implements dot-reph by ligation, so we shouldn't move it. Uniscribe doesn't either. Test case: U+0D4E,U+0D1A,U+0D4D,U+0D1A,U+0D4D	2013-10-15 18:21:32 +02:00
Behdad Esfahbod	a01cbf6cbe	[indic] Harmless reordering of Khmer features!	2013-10-15 18:21:32 +02:00
Behdad Esfahbod	eb10233b26	[indic] Apply 'kern' for all scripts except for Khmer in Uniscribe mode Seems to better match Uniscribe. Note: NotoSansTelugu-Regular has kern feature, so this fixes most of the positioning failures there, except for the kern pairs blocked by a (non-)joiner, in which case we (correctly) kern, but Uniscribe doesn't.	2013-10-15 18:21:32 +02:00
Behdad Esfahbod	30145272a7	[indic] Don't apply presentation features across syllables More like Uniscribe... We still allow user-defined features to work across syllables, but not pres,blws,abs,psts,etc. This "regressed" Sinhala numbers by 11. These are cases where there's a Consonant followed by Ra,Halant,ZWJ at the end of text. The Ra,Halant,ZWJ ends up forming reph, which is wrong... But before we were also ligating that reph with the previous consonant. That's even more wrong. That's also what Uniscribe does. Current numbers: BENGALI: 353732 out of 354188 tests passed. 456 failed (0.128745%) DEVANAGARI: 707307 out of 707394 tests passed. 87 failed (0.0122987%) GUJARATI: 366349 out of 366457 tests passed. 108 failed (0.0294714%) GURMUKHI: 60732 out of 60747 tests passed. 15 failed (0.0246926%) KANNADA: 951030 out of 951913 tests passed. 883 failed (0.0927606%) KHMER: 299070 out of 299124 tests passed. 54 failed (0.0180527%) MALAYALAM: 1048140 out of 1048334 tests passed. 194 failed (0.0185056%) ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%) SINHALA: 271655 out of 271847 tests passed. 192 failed (0.070628%) TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%) TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)	2013-10-15 18:20:59 +02:00
Behdad Esfahbod	3c7b3641cf	[indic] Handle Avagraha It can come either at the end(ish!) of the syllable, or independently. When independent, it accepts a few bits and pieces.	2013-10-15 13:14:31 +02:00
Behdad Esfahbod	8acbb6be27	[indic] Some scripts like blwf applied to pre-base characters ...while some don't! Improved Bengali, Devanagari, Gurmukhi, Malayalam. Updated numbers: BENGALI: 353732 out of 354188 tests passed. 456 failed (0.128745%) DEVANAGARI: 707307 out of 707394 tests passed. 87 failed (0.0122987%) GUJARATI: 366349 out of 366457 tests passed. 108 failed (0.0294714%) GURMUKHI: 60732 out of 60747 tests passed. 15 failed (0.0246926%) KANNADA: 951030 out of 951913 tests passed. 883 failed (0.0927606%) KHMER: 299070 out of 299124 tests passed. 54 failed (0.0180527%) MALAYALAM: 1048134 out of 1048334 tests passed. 200 failed (0.0190779%) ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%) SINHALA: 271666 out of 271847 tests passed. 181 failed (0.0665816%) TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%) TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)	2013-10-15 12:29:07 +02:00
Behdad Esfahbod	bdd8873fd8	Revert "[Indic] don't apply 'calt' by default in Indic shaper" This reverts commit `952121007c`. In light of discussion on the mailing list...	2013-08-07 17:58:25 -04:00
Jonathan Kew	952121007c	[Indic] don't apply 'calt' by default in Indic shaper	2013-08-06 10:36:14 -04:00
Behdad Esfahbod	9245e98742	[Indic] Add Javanese config We should add for other scripts too, send me the virama codepoint and script name...	2013-06-26 20:57:58 -04:00
Behdad Esfahbod	a8cf7b43fa	[Indic] Further adjust ZWJ handling in Indic-like shapers After the Ngapi hackfest work, we were assuming that fonts won't use presentation features to choose specific forms (eg. conjuncts). As such, we were using auto-joiner behavior for such features. It proved to be troublesome as many fonts used presentation forms ('pres') for example to form conjuncts, which need to be disabled when a ZWJ is inserted. Two examples: U+0D2F,U+200D,U+0D4D,U+0D2F with kartika.ttf U+0995,U+09CD,U+200D,U+09B7 with vrinda.ttf What we do now is to never do magic to ZWJ during GSUB's main input match for Indic-style shapers. Note that backtrack/lookahead are still matched liberally, as is GPOS. This seems to be an acceptable compromise. As to the bug that initially started this work, that one needs to be fixed differently: Bug 58714 - Kannada u+0cb0 u+200d u+0ccd u+0c95 u+0cbe does not provide same results as Windows8 https://bugs.freedesktop.org/show_bug.cgi?id=58714 New numbers: BENGALI: 353689 out of 354188 tests passed. 499 failed (0.140886%) DEVANAGARI: 707305 out of 707394 tests passed. 89 failed (0.0125814%) GUJARATI: 366349 out of 366457 tests passed. 108 failed (0.0294714%) GURMUKHI: 60706 out of 60747 tests passed. 41 failed (0.067493%) KANNADA: 951030 out of 951913 tests passed. 883 failed (0.0927606%) KHMER: 299070 out of 299124 tests passed. 54 failed (0.0180527%) LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%) MALAYALAM: 1048102 out of 1048334 tests passed. 232 failed (0.0221304%) ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%) SINHALA: 271666 out of 271847 tests passed. 181 failed (0.0665816%) TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%) TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%) TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)	2013-03-19 06:22:06 -04:00
Behdad Esfahbod	fb7c182bf9	[Indic] Minor	2013-03-06 00:53:24 -05:00
Behdad Esfahbod	8144936d07	[Indic] Work around fonts with broken new-spec tables See comments, and this thread: http://lists.freedesktop.org/archives/harfbuzz/2013-March/002990.html Originally reported here: https://code.google.com/p/chromium/issues/detail?id=96143 Doesn't change test suite numbers.	2013-03-05 20:08:59 -05:00
Behdad Esfahbod	41732f1fe3	[Indic] Help compiler put indic_features table in .rodata The overridden "or" operator was preventing the flag expression from being const, and putting the table in .data instead of .rodata.	2013-02-27 20:40:54 -05:00
Behdad Esfahbod	94789fd601	[Indic] Sort pre-base reordering consonants with post-forms Before, we were marking them as below-form for initial reordering. However, there is a rule that says "post consonants should follow below consonants" for base determination purposes. Malayalam has post-form YA/VA, and RA is pre-base. As such, for a sequence like YA,Virama,YA,Virama,RA, the correct base is at index 0. But because the code was seeing RA as a below-base, it was stopping at the second YA as base, instead of jumping it as a post-base. By treating prebase-reordering consonants like post-forms, this is fixed. MALAYALAM went down from 351 to 265. Other numbers didn't change: BENGALI: 353686 out of 354188 tests passed. 502 failed (0.141733%) DEVANAGARI: 707305 out of 707394 tests passed. 89 failed (0.0125814%) GUJARATI: 366262 out of 366457 tests passed. 195 failed (0.0532122%) GURMUKHI: 60706 out of 60747 tests passed. 41 failed (0.067493%) KANNADA: 950680 out of 951913 tests passed. 1233 failed (0.129529%) KHMER: 299074 out of 299124 tests passed. 50 failed (0.0167155%) LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%) MALAYALAM: 1048069 out of 1048334 tests passed. 265 failed (0.0252782%) ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%) SINHALA: 271539 out of 271847 tests passed. 308 failed (0.113299%) TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%) TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%) TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)	2013-02-26 21:22:37 -05:00
Behdad Esfahbod	cfc507c543	[Indic-like] Disable automatic joiner handling for basic shaping features Not for Arabic, but for Indic-like scripts. ZWJ/ZWNJ have special meanings in those scripts, so let font lookups take full control. This undoes the regression caused by automatic-joiners handling introduced two commits ago. We only disable automatic joiner handling for the "basic shaping features" of Indic, Myanmar, and SEAsian shapers. The "presentation forms" and other features are still applied with automatic-joiner handling. This change also changes the test suite failure statistics, such that a few scripts show more "failures". The most affected is Kannada. However, upon inspection, we believe that in most, if not all, of the new failures, we are producing results superior to Uniscribe. Hard to count those! Here's an example of what is fixed by the recent joiner-handling changes: https://bugs.freedesktop.org/show_bug.cgi?id=58714 New numbers, for future reference: BENGALI: 353892 out of 354188 tests passed. 296 failed (0.0835714%) DEVANAGARI: 707336 out of 707394 tests passed. 58 failed (0.00819911%) GUJARATI: 366262 out of 366457 tests passed. 195 failed (0.0532122%) GURMUKHI: 60706 out of 60747 tests passed. 41 failed (0.067493%) KANNADA: 950680 out of 951913 tests passed. 1233 failed (0.129529%) KHMER: 299074 out of 299124 tests passed. 50 failed (0.0167155%) LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%) MALAYALAM: 1047983 out of 1048334 tests passed. 351 failed (0.0334817%) ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%) SINHALA: 271539 out of 271847 tests passed. 308 failed (0.113299%) TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%) TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%) TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)	2013-02-14 13:10:54 -05:00
Behdad Esfahbod	ec5448667b	Add hb_ot_map_feature_flags_t Code cleanup. No (intended) functional change.	2013-02-14 12:53:57 -05:00
Behdad Esfahbod	e7ffcfafb1	Clean-up add_bool_feature	2013-02-14 11:58:13 -05:00
Behdad Esfahbod	1f91c39677	Indent	2013-02-13 09:38:40 -05:00
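
The message of commit 828e109c7a argues that the original zero_context expression is always true. The following is a minimal sketch of that argument, not HarfBuzz code: IndicPlanSketch, make_plan, zero_context_buggy and zero_context_fixed are hypothetical names introduced here, and only the two boolean expressions and the is_old_spec definition are taken from the commit message. It assumes the last byte of the chosen script tag is what distinguishes new-spec tags (for example 'dev2') from old-spec ones (for example 'deva').

```cpp
#include <cassert>

// Hypothetical stand-in for the two plan fields quoted in the commit
// message; these are not HarfBuzz's real data structures.
struct IndicPlanSketch {
  bool has_old_spec;  // the script has both an old-spec and a new-spec tag
  bool is_old_spec;   // the old-spec tag was the one chosen
};

// Mirrors the quoted definition of is_old_spec: it can only be true when
// has_old_spec is true and the chosen tag does not end in '2'.
constexpr IndicPlanSketch make_plan(bool has_old_spec, char tag_last_char) {
  return IndicPlanSketch{has_old_spec, has_old_spec && tag_last_char != '2'};
}

// The expression the commit message identifies as the bug ("||").
constexpr bool zero_context_buggy(IndicPlanSketch p) {
  return p.has_old_spec || !p.is_old_spec;
}

// The expression the commit message says was intended ("&&").
constexpr bool zero_context_fixed(IndicPlanSketch p) {
  return p.has_old_spec && !p.is_old_spec;
}

// If has_old_spec is false, is_old_spec is false too, so "!is_old_spec"
// is true; if has_old_spec is true, the left operand is true. Either way
// the buggy form evaluates to true.
static_assert(zero_context_buggy(make_plan(false, 'h')), "single-spec script");
static_assert(zero_context_buggy(make_plan(false, '2')), "single-spec script");
static_assert(zero_context_buggy(make_plan(true, 'a')), "old-spec tag");
static_assert(zero_context_buggy(make_plan(true, '2')), "new-spec tag");

// The fixed form is true only for the new-spec tag of a two-spec script.
static_assert(!zero_context_fixed(make_plan(false, 'h')), "single-spec script");
static_assert(!zero_context_fixed(make_plan(true, 'a')), "old-spec tag");
static_assert(zero_context_fixed(make_plan(true, '2')), "new-spec tag");
```

Because every case is checked with static_assert, compiling the snippet is enough to confirm the claim; nothing needs to run.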


302 Commits