253 Commits

Author SHA1 Message Date
Behdad Esfahbod 1b972d893a [OTLayout] Add is_inplace() method to GSUB 2013-05-02 15:39:16 -04:00
Behdad Esfahbod a8cf7b43fa [Indic] Further adjust ZWJ handling in Indic-like shapers
After the Ngapi hackfest work, we were assuming that fonts
won't use presentation features to choose specific forms
(eg. conjuncts).  As such, we were using auto-joiner behavior
for such features.  It proved to be troublesome as many fonts
used presentation forms ('pres') for example to form conjuncts,
which need to be disabled when a ZWJ is inserted.

Two examples:

	U+0D2F,U+200D,U+0D4D,U+0D2F with kartika.ttf
	U+0995,U+09CD,U+200D,U+09B7 with vrinda.ttf

What we do now is to never do magic to ZWJ during GSUB's main input
match for Indic-style shapers.  Note that backtrack/lookahead are still
matched liberally, as is GPOS.  This seems to be an acceptable
compromise.

As to the bug that initially started this work, that one needs to
be fixed differently:

  Bug 58714 - Kannada u+0cb0 u+200d u+0ccd u+0c95 u+0cbe does not
  provide same results as Windows8
  https://bugs.freedesktop.org/show_bug.cgi?id=58714

New numbers:

BENGALI: 353689 out of 354188 tests passed. 499 failed (0.140886%)
DEVANAGARI: 707305 out of 707394 tests passed. 89 failed (0.0125814%)
GUJARATI: 366349 out of 366457 tests passed. 108 failed (0.0294714%)
GURMUKHI: 60706 out of 60747 tests passed. 41 failed (0.067493%)
KANNADA: 951030 out of 951913 tests passed. 883 failed (0.0927606%)
KHMER: 299070 out of 299124 tests passed. 54 failed (0.0180527%)
LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
MALAYALAM: 1048102 out of 1048334 tests passed. 232 failed (0.0221304%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271666 out of 271847 tests passed. 181 failed (0.0665816%)
TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%)
TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)
TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
2013-03-19 06:22:06 -04:00
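The per-script percentages in these test summaries can be reproduced from the pass and total counts. The snippet below is a minimal sketch for checking the arithmetic, not part of the HarfBuzz test tooling; `failure_rate` is a hypothetical helper, and the `%g`-style formatting is an assumption about how the figures were printed.

```python
def failure_rate(passed, total):
    """Return (failed, percent_failed) for one script's test run."""
    failed = total - passed
    return failed, 100.0 * failed / total


# Counts copied from the a8cf7b43fa summary above.
results = {
    "BENGALI": (353689, 354188),
    "TAMIL": (1091753, 1091754),
    "TIBETAN": (208469, 208469),
}

for script, (passed, total) in results.items():
    failed, pct = failure_rate(passed, total)
    # format(pct, "g") keeps six significant digits.
    print(f"{script}: {passed} out of {total} tests passed. "
          f"{failed} failed ({format(pct, 'g')}%)")
```

For BENGALI this gives 499 failures, roughly 0.140886%, which agrees with the figure recorded in the commit message.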
Behdad Esfahbod 9c5a9ee967 [OTLayout] Rename process() to dispatch() 2013-03-09 01:55:04 -05:00
Behdad Esfahbod 722e8b857e Fixup previous commit
Was not decreasing num_items.  Ouch!
2013-02-21 15:37:51 -05:00
Behdad Esfahbod 2b2a6e8944 [OTLayout] Correctly skip Default_Ignorable when match_func not set
When a match_func was not set on the matcher_t object (ie. from GPOS),
then the Default_Ignorables (including joiners) were never skipped.
This meant that they were not skipped as they should be during GPOS
matching.  Fix that.

A few Indic numbers have "regressed": BENGALI and DEVANAGARI went
up from 290 and 58 respectively, but in both cases new results are
superior to Uniscribe, as they apply GPOS when we weren't (and
Uniscribe isn't) before.
BENGALI: 353686 out of 354188 tests passed. 502 failed (0.141733%)
DEVANAGARI: 707305 out of 707394 tests passed. 89 failed (0.0125814%)
GUJARATI: 366262 out of 366457 tests passed. 195 failed (0.0532122%)
GURMUKHI: 60706 out of 60747 tests passed. 41 failed (0.067493%)
KANNADA: 950680 out of 951913 tests passed. 1233 failed (0.129529%)
KHMER: 299074 out of 299124 tests passed. 50 failed (0.0167155%)
LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
MALAYALAM: 1047983 out of 1048334 tests passed. 351 failed (0.0334817%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271539 out of 271847 tests passed. 308 failed (0.113299%)
TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%)
TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)
TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
2013-02-21 15:07:03 -05:00
Behdad Esfahbod ff93ac8cb2 Minor 2013-02-21 14:51:40 -05:00
Behdad Esfahbod cb90b1bbe6 [OTLayout] Respect syllable boundaries for backtrack/lookahead matching
Originally we meant to match backtrack/lookahead across syllable
boundaries.  But a bug in the code meant that this was NOT done for
backtrack.  We "fixed" that in 2c7d0b6b80,
but that broke Myanmar shaping.

We now believe that for Indic-like shapers (which is where syllables are
used), all basic shaping forms should be fully contained within their
syllables, so now we limit backtrack/lookahead matching to the syllable
too.  Unbreaks Myanmar.
2013-02-15 07:02:08 -05:00
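The syllable restriction described in cb90b1bbe6 can be illustrated with a toy model. This is a sketch under stated assumptions, not HarfBuzz code: `match_backtrack` is a hypothetical function, glyphs are plain strings, and `syllables` is an assumed per-glyph syllable index.

```python
def match_backtrack(glyphs, syllables, pos, backtrack):
    """Match `backtrack` (nearest glyph first) against the glyphs before `pos`.

    `syllables[i]` is a hypothetical per-glyph syllable index. Matching stops
    at the syllable boundary: a glyph from another syllable never matches.
    """
    i = pos - 1
    for expected in backtrack:
        # Running off the buffer or leaving the syllable fails the match.
        if i < 0 or syllables[i] != syllables[pos]:
            return False
        if glyphs[i] != expected:
            return False
        i -= 1
    return True


glyphs = ["ka", "virama", "ka", "i"]
syllables = [0, 0, 1, 1]

# "virama" sits in syllable 0 while position 2 is in syllable 1: no match.
print(match_backtrack(glyphs, syllables, 2, ["virama"]))
# Within syllable 0 the backtrack glyph is found.
print(match_backtrack(glyphs, syllables, 1, ["ka"]))
```

Lookahead would be limited in the same way, walking forward from the end of the input match instead of backward.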
Behdad Esfahbod cfc507c543 [Indic-like] Disable automatic joiner handling for basic shaping features
Not for Arabic, but for Indic-like scripts.  ZWJ/ZWNJ have special
meanings in those scripts, so let font lookups take full control.

This undoes the regression caused by automatic-joiners handling
introduced two commits ago.

We only disable automatic joiner handling for the "basic shaping
features" of Indic, Myanmar, and SEAsian shapers.  The "presentation
forms" and other features are still applied with automatic-joiner
handling.

This change also changes the test suite failure statistics, such that
a few scripts show more "failures".  The most affected is Kannada.
However, upon inspection, we believe that in most, if not all, of the
new failures, we are producing results superior to Uniscribe.  Hard to
count those!

Here's an example of what is fixed by the recent joiner-handling
changes:

  https://bugs.freedesktop.org/show_bug.cgi?id=58714

New numbers, for future reference:

BENGALI: 353892 out of 354188 tests passed. 296 failed (0.0835714%)
DEVANAGARI: 707336 out of 707394 tests passed. 58 failed (0.00819911%)
GUJARATI: 366262 out of 366457 tests passed. 195 failed (0.0532122%)
GURMUKHI: 60706 out of 60747 tests passed. 41 failed (0.067493%)
KANNADA: 950680 out of 951913 tests passed. 1233 failed (0.129529%)
KHMER: 299074 out of 299124 tests passed. 50 failed (0.0167155%)
LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
MALAYALAM: 1047983 out of 1048334 tests passed. 351 failed (0.0334817%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271539 out of 271847 tests passed. 308 failed (0.113299%)
TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%)
TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)
TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
2013-02-14 13:10:54 -05:00
Behdad Esfahbod 0b45479198 [OTLayout] Add fine-grained control over ZWJ matching
Not used yet.  Next commit...
2013-02-14 13:02:13 -05:00
Behdad Esfahbod 607feb7cff [OTLayout] Ignore default-ignorables when matching GSUB/GPOS
When matching lookups, be smart about default-ignorable characters.
In particular:

Do nothing specific about ZWNJ, but for the other default-ignorables:

If the lookup in question uses the ignorable character in a sequence,
then match it as we used to do.  However, if the sequence match will
fail because the default-ignorable blocked it, try skipping the
ignorable character and continue.

The most immediate thing it means is that if Lam-Alef forms a ligature,
then Lam-ZWJ-Alef will do too.  Finally!

One exception: when matching for GPOS, or for backtrack/lookahead of
GSUB, we ignore ZWNJ too.  That's the right thing to do.

It is certainly possible to build fonts for which this feature will
result in undesirable glyphs, but it's hard to think of a real-world
case where that would happen.

This *does* break Indic shaping right now, since Indic Unicode has
specific rules for what ZWJ/ZWNJ mean, and skipping ZWJ is breaking
those rules.  That will be fixed in upcoming commits.
2013-02-14 12:57:50 -05:00
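The skipping rule described in 607feb7cff can be sketched roughly as follows. This is a toy model, not the HarfBuzz implementation: `match_sequence`, `DEFAULT_IGNORABLES`, and the `skip_zwnj` parameter are hypothetical names, glyphs are modelled as code points, and the ignorable set is only a small illustrative subset.

```python
ZWNJ = 0x200C
ZWJ = 0x200D
# Small illustrative subset; the real Default_Ignorable set is much larger.
DEFAULT_IGNORABLES = {ZWNJ, ZWJ, 0x00AD, 0x034F}


def match_sequence(text, start, pattern, skip_zwnj=False):
    """Return the indices in `text` matched by `pattern`, or None.

    A default-ignorable is matched literally when the pattern asks for it.
    Otherwise, if it would block the match, it is skipped. ZWNJ is only
    skipped when `skip_zwnj` is true (the GPOS / backtrack / lookahead case).
    """
    matched = []
    i = start
    for expected in pattern:
        while True:
            if i >= len(text):
                return None
            cp = text[i]
            if cp == expected:
                matched.append(i)
                i += 1
                break
            skippable = cp in DEFAULT_IGNORABLES and (cp != ZWNJ or skip_zwnj)
            if not skippable:
                return None
            i += 1  # the ignorable blocked the match, so step over it
    return matched


LAM, ALEF = 0x0644, 0x0627
print(match_sequence([LAM, ZWJ, ALEF], 0, [LAM, ALEF]))   # ZWJ is skipped
print(match_sequence([LAM, ZWNJ, ALEF], 0, [LAM, ALEF]))  # ZWNJ blocks the match
```

With `skip_zwnj=True` the second call would match as well, which corresponds to the GPOS and backtrack/lookahead exception mentioned in the commit message.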
Behdad Esfahbod 4e51df73a3 [OTLayout] Remove unused function 2013-02-14 07:42:42 -05:00
Behdad Esfahbod 8820bb235b [OTLayout] Port apply_lookup to skippy_iter 2013-02-14 07:41:23 -05:00
Behdad Esfahbod dfca269f06 [OTLayout] Port ligate_input to skippy_iter 2013-02-14 07:41:23 -05:00
Behdad Esfahbod 7e53415c2d [OTLayout] Minor fix for apply_lookup()
Should NOT change behavior, since first glyph is a match.
2013-02-14 06:24:30 -05:00
Behdad Esfahbod 6880f7e19d [OTLayout] Make table type known to apply context 2013-02-13 12:17:25 -05:00
Behdad Esfahbod 2c7d0b6b80 [OTLayout] Unbreak backtrack matching
Was introduced by 28b9d502bb.
2013-02-13 12:10:08 -05:00
Behdad Esfahbod c074ebc466 [OTLayout] Minor refactoring 2013-02-13 11:22:42 -05:00
Behdad Esfahbod 407fc12466 [OTLayout] Remove bogus caching of glyph property 2013-02-13 11:13:06 -05:00
Behdad Esfahbod 54f7b4d9ec [OTLayout] Respect lookup-flags skipping over non-mark glyphs
Before, when matching ligatures, we never skipped over base / liga
glyphs even if that was what the LookupFlags asked for.

Fixed now.  We carefully reviewed all instances of this, and tested with
Amiri as well as some Indic scripts, and are confident that this should
NOT break anyone's fonts.  It's also how Uniscribe does it, from what
we can tell.
2013-02-11 13:27:17 -05:00
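The behavior fixed in 54f7b4d9ec can be pictured as a per-glyph skip test driven by the LookupFlag bits. The sketch below is a simplification and not HarfBuzz code: `should_skip` is a hypothetical helper, the flag values and GDEF glyph classes follow the OpenType specification, and mark attachment types and mark filtering sets are deliberately left out.

```python
# LookupFlag bits and GDEF glyph classes as defined by the OpenType spec.
IGNORE_BASE_GLYPHS = 0x0002
IGNORE_LIGATURES = 0x0004
IGNORE_MARKS = 0x0008

GDEF_BASE, GDEF_LIGATURE, GDEF_MARK = 1, 2, 3

# Which flag bit causes a glyph of each class to be skipped.
_FLAG_FOR_CLASS = {
    GDEF_BASE: IGNORE_BASE_GLYPHS,
    GDEF_LIGATURE: IGNORE_LIGATURES,
    GDEF_MARK: IGNORE_MARKS,
}


def should_skip(glyph_class, lookup_flags):
    """Return True if a glyph of `glyph_class` is skipped under `lookup_flags`."""
    return bool(lookup_flags & _FLAG_FOR_CLASS.get(glyph_class, 0))


flags = IGNORE_BASE_GLYPHS | IGNORE_MARKS
print(should_skip(GDEF_BASE, flags))      # base glyphs are skipped too
print(should_skip(GDEF_LIGATURE, flags))  # ligatures are still matched
```

In this model the old behavior corresponds to consulting only the `IGNORE_MARKS` bit, so base and ligature glyphs were never stepped over.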
Behdad Esfahbod 9082efc4aa [OTLayout] s/mark_skipping/skipping/
In anticipation of upcoming changes.
2013-02-11 13:14:56 -05:00
Behdad Esfahbod 7b912c1936 Remove a few unnecessary const's
Apparently helps with MSVC compilation.
2013-01-04 01:25:27 -06:00
Behdad Esfahbod 11fba79ee9 [OTLayout] Fix various introspection issues with ClassDef's
As reported by Jonathan Kew.
2013-01-02 23:36:37 -06:00
Behdad Esfahbod 0beb66e3a6 Fix warnings 2012-12-05 19:14:28 -05:00
Behdad Esfahbod 130bb3f614 Rename VOID and void_t to have HarfBuzz prefix
Fixes build on Windows.  Ouch!
2012-12-05 16:49:47 -05:00
Behdad Esfahbod 4a350d0eb2 [OTLayout] Reuse context in collect_glyphs() recursion 2012-12-04 17:13:09 -05:00
Behdad Esfahbod 8303593ba1 Minor
Use pointers instead of references, in preparation for upcoming change.
2012-12-04 17:08:41 -05:00
Behdad Esfahbod 1bcfa06d11 [OTLayout] Don't recurse in collect_glyphs() for GPOS 2012-12-04 16:58:09 -05:00
Behdad Esfahbod e75943de80 [OTLayout] Fix collect_glyphs() recursion in ContextFormat3 2012-11-30 08:38:24 +02:00
Behdad Esfahbod f18ff5a84d [OTLayout] Return correct value from recursion
Commit 4c4e8f0e75 broke contextual lookups
by making the recurse() function always return false.

Reported by Khaled.  Test case: لا in Amiri.
2012-11-30 08:07:06 +02:00
Behdad Esfahbod 2dc1141d7d [OTLayout] Remove operator() from ClassDef 2012-11-24 19:16:34 -05:00
Behdad Esfahbod b67881b171 [OTLayout] Remove operator() from Coverage 2012-11-24 19:13:55 -05:00
Behdad Esfahbod 1ea375da44 [OTLayout] Only collect output glyphs during recursion in collect_glyphs() 2012-11-24 02:05:52 -05:00
Behdad Esfahbod f1b12781d2 [OTLayout] Implement ChainContext collect_glyphs()
All of collect_glyphs() complete and untested now.
2012-11-24 02:02:01 -05:00
Behdad Esfahbod 4c4e8f0e75 [OTLayout] Reuse apply context for recursion 2012-11-24 01:13:20 -05:00
Behdad Esfahbod 53a69f49e5 [OTLayout] Remove unused members 2012-11-24 01:03:05 -05:00
Behdad Esfahbod d0a5233785 [OTLayout] Implement Context::collect_glyphs() 2012-11-23 18:54:59 -05:00
Behdad Esfahbod 26514d51b6 [OTLayout] More collect_glyphs() 2012-11-23 18:13:48 -05:00
Behdad Esfahbod 9b34677f36 [OTLayout] Clean up closure() a bit 2012-11-23 17:55:40 -05:00
Behdad Esfahbod 2c53bd3c3e [OTLayout] Start porting sanitize() to process() 2012-11-23 17:29:05 -05:00
Behdad Esfahbod f48ec0e834 [OTLayout] Add process() tracing 2012-11-23 17:23:41 -05:00
Behdad Esfahbod ed2e135944 [OTLayout] More Extension templatizing 2012-11-23 17:10:40 -05:00
Behdad Esfahbod 7dddd4e72b [OTLayout] More templatizing Extension 2012-11-23 17:04:55 -05:00
Behdad Esfahbod 653eeb2645 Make Extension a template 2012-11-23 16:57:36 -05:00
Behdad Esfahbod a1733db1c6 [OTLayout] Start adding process() tracing 2012-11-23 16:40:04 -05:00
Behdad Esfahbod 73c18ae1b9 Cleanup 2012-11-23 15:34:11 -05:00
Behdad Esfahbod be218c688c Pass this object to trace macros 2012-11-23 15:32:14 -05:00
Behdad Esfahbod 902cc8aca0 [OTLayout] Start unbreaking tracing 2012-11-23 15:23:30 -05:00
Behdad Esfahbod dabe698fcb Minor 2012-11-23 14:21:35 -05:00
Behdad Esfahbod c779d82b2f Fix warnings 2012-11-23 14:09:21 -05:00
Behdad Esfahbod 81822528ef Minor 2012-11-23 13:27:16 -05:00