Commit Graph

120 Commits

Author SHA1 Message Date
Behdad Esfahbod 7793aad946 Normalize various spaces to space if font doesn't support
This resurrects the space fallback feature, after I disabled
the compatibility decomposition.  Now I can release HarfBuzz
again without breaking Pango!

It also remembers which space character it was, such that later
on we can approximate the width of this particular space
character.  That part is not implemented yet.

We normalize all GC=Zs chars except for U+1680 OGHAM SPACE MARK,
which is better left alone.
2015-11-04 15:51:41 -08:00
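
A minimal sketch of the space fallback described in this commit, written in Python rather than HarfBuzz's C++. The function name `normalize_spaces`, the `font_has_glyph` callback, and the `remembered` list are hypothetical names chosen for illustration; they are not HarfBuzz API. The sketch replaces any GC=Zs character the font cannot render with U+0020, leaves U+1680 alone, and records which space was replaced so its width could be approximated later.

```python
import unicodedata

OGHAM_SPACE_MARK = 0x1680  # left alone, as the commit message says
SPACE = 0x0020


def normalize_spaces(text, font_has_glyph):
    """Replace unsupported GC=Zs characters with U+0020.

    font_has_glyph is a hypothetical callback: code point -> bool.
    Returns (new_text, remembered), where remembered[i] is the original
    code point of a replaced space at index i, or None if nothing changed.
    """
    out = []
    remembered = []
    for ch in text:
        cp = ord(ch)
        is_space_separator = unicodedata.category(ch) == "Zs"
        if (is_space_separator
                and cp not in (SPACE, OGHAM_SPACE_MARK)
                and not font_has_glyph(cp)):
            out.append(chr(SPACE))
            remembered.append(cp)  # kept so the width can be approximated later
        else:
            out.append(ch)
            remembered.append(None)
    return "".join(out), remembered


if __name__ == "__main__":
    ascii_only_font = lambda cp: cp < 0x80  # toy font: ASCII glyphs only
    print(normalize_spaces("a\u2003b\u1680c", ascii_only_font))
```

With the toy ASCII-only font, the EM SPACE (U+2003) becomes a plain space while the OGHAM SPACE MARK passes through untouched.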
Behdad Esfahbod 5c8174eda3 Update comments for removal of compat decompositions 2015-10-21 18:51:40 -02:00
Behdad Esfahbod f679970040 Disable compatibility decomposition usage during normalization
Fixes https://github.com/behdad/harfbuzz/issues/152
2015-10-21 17:20:55 -02:00
Behdad Esfahbod 980e25cad2 Fix hb-ot-shape-normalize with empty buffer
Part of https://github.com/behdad/harfbuzz/issues/136
2015-10-02 08:21:12 +01:00
Behdad Esfahbod e995d33c10 [OT] Merge clusters when reordering marks for normalization
Fixes https://bugzilla.gnome.org/show_bug.cgi?id=541608
and cluster test.
2015-09-01 16:13:32 +01:00
Behdad Esfahbod 85846b3de7 Use insertion-sort instead of bubble-sort
Needed for upcoming merge-clusters fix.
2015-09-01 15:07:52 +01:00
ThePhD 5c99cf93d6 Merge branch 'master' into vc++-fixes 2015-08-14 01:02:00 -04:00
jfkthame c7dfe316f8 Don't rely on .cluster in _hb_ot_shape_normalize()
Fixes https://github.com/behdad/harfbuzz/pull/124
2015-08-09 18:26:27 +02:00
ThePhD 8e545d5961 Fix all VC++ warnings and errors in the current commit's builds. 2015-06-22 22:29:04 -04:00
Behdad Esfahbod 1eff435023 Minor optimization 2015-01-27 12:26:04 -08:00
Behdad Esfahbod 8f3eebf7ee Make sure gsubgpos buffer vars are available during fallback_position
Add buffer var allocation asserts to a few key places.
2014-08-02 19:07:49 -04:00
Behdad Esfahbod 5209c50506 Revert "Show U+FFFD REPLACEMENT CHARACTER for invalid Unicode codepoints"
We now handle U+FFFD replacement in hb_buffer_add_utf*().  Any other
manipulation can happen in user callbacks.  No need for this.

efe74214bb (commitcomment-7039404)

This reverts commit efe74214bb.

Conflicts:
	src/hb-ot-shape-normalize.cc
2014-07-17 12:23:44 -04:00
Behdad Esfahbod 7627100f42 Mark unsigned integer literals with the u suffix
Simplifies hb_in_range() calls as the type can be inferred.
The rest is obsessiveness, I admit.
2014-07-11 16:22:13 -04:00
Behdad Esfahbod efe74214bb Show U+FFFD REPLACEMENT CHARACTER for invalid Unicode codepoints
Only if the font doesn't support it.  I.e., this lets the user
use non-Unicode codepoints as private values and return a meaningful
glyph for them.  But if it's invalid and font callback doesn't
like it, and if font has U+FFFD, show that instead.

Font functions that do not want this automatic replacement to
happen should return true from get_glyph() if unicode > 0x10FFFF.

Replaces https://github.com/behdad/harfbuzz/pull/27
2014-07-11 11:59:48 -04:00
Behdad Esfahbod 08cf5d75ef [ot] Don't try to compose if normalization is off 2014-01-22 07:53:55 -05:00
Behdad Esfahbod 8fc1f7fe74 [ot/hangul] Don't decompose Hangul even when combining marks present
As discussed on
https://github.com/behdad/harfbuzz/pull/10#issuecomment-31442030
2014-01-02 17:04:04 +08:00
Behdad Esfahbod 64426ec73a [ot] Simplify composing
Not tested.  Ouch.
2014-01-02 14:33:10 +08:00
Behdad Esfahbod 3d6ca0d32e [ot] Simplify normalization_preference again
No shaper has more than one behavior re this, so no need for a callback.
2013-12-31 16:35:37 +08:00
Behdad Esfahbod ac8cd51191 Refactor 2013-10-18 19:33:09 +02:00
Behdad Esfahbod 79d1007a50 If variation selector is not consumed by cmap, pass it on to GSUB
This changes the semantics of get_glyph() callback and expect that
callbacks return false if the requested variant is not available, and
then we will call them back with variation_selector=0 and will retain
the glyph for the selector in the glyph stream.

Apparently most Mongolian fonts implement the Mongolian Variation
Selectors using GSUB, not cmap.

https://bugs.freedesktop.org/show_bug.cgi?id=65258

Note that this doesn't fix the Mongolian shaping yet, because the way
that's implemented is that the, say, 'init' feature ligates the letter
and the variation-selector.  However, since currently the variation
selector doesn't have the 'init' mask on, it will not be matched...
2013-06-13 19:01:07 -04:00
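
The callback contract described in this commit can be sketched as follows. This is an illustration in Python, not HarfBuzz code: `map_with_variation_selector`, the toy `TOY_CMAP` table, and the glyph IDs are all made up, and `None` stands in for the callback returning false.

```python
# Toy cmap: (code point, variation selector) -> glyph ID. All values are invented.
TOY_CMAP = {
    (0x1820, 0): 10,       # MONGOLIAN LETTER A
    (0x180B, 0): 20,       # MONGOLIAN FREE VARIATION SELECTOR ONE, as its own glyph
    (0x4E00, 0): 5,
    (0x4E00, 0xFE00): 30,  # a variant the cmap handles directly
}


def toy_get_glyph(unicode, variation_selector):
    """Hypothetical font callback: glyph ID, or None if the pair is unavailable."""
    return TOY_CMAP.get((unicode, variation_selector))


def map_with_variation_selector(get_glyph, unicode, variation_selector):
    """Return the glyph IDs to put in the glyph stream for one (char, selector) pair."""
    glyph = get_glyph(unicode, variation_selector)
    if glyph is not None:
        # The cmap consumed the selector: one glyph represents both characters.
        return [glyph]
    # Variant not available: ask again with variation_selector=0 and keep a
    # glyph for the selector itself, so GSUB can still see it.
    base = get_glyph(unicode, 0)
    selector = get_glyph(variation_selector, 0)
    return [base, selector]


if __name__ == "__main__":
    print(map_with_variation_selector(toy_get_glyph, 0x1820, 0x180B))
    print(map_with_variation_selector(toy_get_glyph, 0x4E00, 0xFE00))
```

The design point is that a failed variant lookup no longer drops the selector: it survives as a glyph that later lookups can match against.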
Behdad Esfahbod c7a8491720 Skip over multiple variation selectors in a row 2013-06-10 15:08:49 -04:00
Behdad Esfahbod 269de14dda Don't compose Hangul jamo
See thread "an issue regarding discrepancy between Korean and Unicode
standards" on the mailing list for the rationale.  In short: Uniscribe
doesn't, so fonts are designed to work without it.
2013-04-04 23:06:54 -04:00
Behdad Esfahbod a88a62f70f Minor 2013-03-21 21:02:16 -04:00
Behdad Esfahbod 6e74c64211 Improve normalization heuristic
Before, for most scripts, we were not trying to recompose two characters
if the second one had ccc=0.  That fails for Myanmar where U+1026
decomposes to U+1025,U+102E, both of which have ccc=0.  However, we do
want to try to recompose those.  We now check whether the second is a
mark, using general category instead.

At the same time, remove optimization that was conflicting with this.

[Let the Ngapi hackfest begin!]
2013-02-11 12:59:00 -05:00
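
The Myanmar case in this commit can be checked with Python's `unicodedata` module. The two predicate functions below are hypothetical restatements of the old and new heuristics as the commit message describes them, not the actual HarfBuzz code.

```python
import unicodedata


def old_should_try_recompose(second):
    """Old heuristic as described: skip recomposition when the second char has ccc=0."""
    return unicodedata.combining(second) != 0


def new_should_try_recompose(second):
    """New heuristic as described: try recomposition when the second char is a mark."""
    return unicodedata.category(second).startswith("M")


if __name__ == "__main__":
    composed = "\u1026"     # MYANMAR LETTER UU
    second = "\u102e"       # MYANMAR VOWEL SIGN II
    print("decomposition:", unicodedata.decomposition(composed))
    print("ccc of second:", unicodedata.combining(second))
    print("category of second:", unicodedata.category(second))
    print("old heuristic:", old_should_try_recompose(second))
    print("new heuristic:", new_should_try_recompose(second))
```

Because U+102E has combining class 0 but general category Mn, the ccc-based test rejects it while the category-based test accepts it.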
Behdad Esfahbod eba312c8d1 Plumbing to get shape plan and font into complex decompose function
So we can handle Sinhala split matras smartly...  Coming soon.
2012-11-16 12:58:38 -08:00
Behdad Esfahbod 0736915b8e [Indic] Decompose Sinhala split matras the way old HarfBuzz / Pango did
Had to do some refactoring to make this happen...

Under Uniscribe bug compatibility mode, we still split them
Uniscribe-style, but Jonathan and I convinced ourselves that there is no
harm doing this the Unicode way.  This change makes that happen, and
unbreaks free Sinhala fonts.
2012-11-13 12:35:35 -08:00
Behdad Esfahbod 028a1706f8 Refactor common macro 2012-09-06 14:25:48 -04:00
Behdad Esfahbod b85800f9de [Indic] Implement dotted-circle insertion for broken clusters
No panic, we really insert dotted circle when it's absolutely broken.

Fixes most of the dotted-circle cases against Uniscribe. (for Devanagari
fixes 80% of them, for Khmer 70%; the rest look like Uniscribe being
really bogus...)

I had to make a decision.  Apparently Uniscribe adds one dotted circle
to each broken character.  I tried that, but that goes wrong easily with
split matras.  So I made it add only one dotted circle to an entire
broken syllable tail.  As in: "if there was a dotted circle here, this
would have formed a correct cluster."  That works better for split
stuff, and I like it more.
2012-08-31 19:18:20 -04:00
Behdad Esfahbod f4cb476298 [OT] Slightly adjust normalizer
The change is very subtle.  If we have a single-char cluster that
decomposes to three or more characters, then try recomposition, in
case the farther mark may compose with the base.
2012-08-10 03:51:44 -04:00
Behdad Esfahbod 07d6828063 Minor 2012-08-10 03:28:50 -04:00
Behdad Esfahbod b00321ea78 [OT] Avoid calling get_glyph() twice
Essentially move the glyph mapping to normalization process.
The effect on Devanagari is small (but observable).  Should be more
observable in simple text, like ASCII.
2012-08-09 22:33:32 -04:00
Behdad Esfahbod 8d1eef3f32 Minor 2012-08-09 21:35:47 -04:00
Behdad Esfahbod 0f8881d6bb More refactoring 2012-08-07 16:57:02 -04:00
Behdad Esfahbod 428dfcab66 Minor refactoring 2012-08-07 16:51:48 -04:00
Behdad Esfahbod 8fbfda920e Inline font getters 2012-08-01 19:03:46 -04:00
Behdad Esfahbod 208f70f055 Inline Unicode callbacks internally 2012-08-01 17:13:10 -04:00
Behdad Esfahbod 84186a6400 Add commentary on the compatibility decomposition in the normalizer 2012-08-01 13:32:39 -04:00
Behdad Esfahbod 378d279bbf Implement Unicode compatibility decompositions
Based on patch from Philip Withnall.
https://bugs.freedesktop.org/show_bug.cgi?id=41095
2012-07-31 21:36:16 -04:00
Behdad Esfahbod bc8357ea7b Merge clusters during normalization 2012-06-08 21:01:20 -04:00
Behdad Esfahbod 0594a24484 Cleanup TRUE/FALSE vs true/false 2012-06-05 20:35:40 -04:00
Behdad Esfahbod 9f377ed321 Fix more unused-var warnings 2012-05-13 16:13:44 +02:00
Behdad Esfahbod 99c2695759 Add accessors to buffer for current info, current pos, and prev info 2012-05-13 15:45:18 +02:00
Behdad Esfahbod d1deaa2f5b Replace zerowidth invisible chars with a zero-advance space glyph
Like Uniscribe does.
2012-05-09 15:04:13 +02:00
Behdad Esfahbod 29a7e306e3 Minor 2012-04-24 16:01:30 -04:00
Behdad Esfahbod 683b503f30 Minor 2012-04-14 20:47:14 -04:00
Behdad Esfahbod 9683184553 Implement normalization mode HB_OT_SHAPE_NORMALIZATION_MODE_COMPOSED_FULL
In this mode we try composing CCC=0 with CCC=0 characters.  Useful for
Hangul.
2012-04-07 15:06:47 -04:00
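
To illustrate why composing CCC=0 with CCC=0 matters for Hangul: conjoining jamo all have combining class 0, yet they compose into syllables. The sketch below uses the Hangul composition arithmetic from the Unicode Standard, not HarfBuzz's implementation, and `compose_hangul` is a hypothetical name. A later entry in this log ("Don't compose Hangul jamo") reports that HarfBuzz eventually stopped doing this.

```python
# Constants from the Unicode Standard's Hangul syllable composition algorithm.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
S_COUNT = L_COUNT * V_COUNT * T_COUNT


def compose_hangul(a, b):
    """Compose two code points into a Hangul syllable, or return None."""
    # Leading consonant + vowel -> LV syllable.
    if L_BASE <= a < L_BASE + L_COUNT and V_BASE <= b < V_BASE + V_COUNT:
        l_index = a - L_BASE
        v_index = b - V_BASE
        return S_BASE + (l_index * V_COUNT + v_index) * T_COUNT
    # LV syllable + trailing consonant -> LVT syllable.
    s_index = a - S_BASE
    if (0 <= s_index < S_COUNT and s_index % T_COUNT == 0
            and T_BASE < b < T_BASE + T_COUNT):
        return a + (b - T_BASE)
    return None


if __name__ == "__main__":
    lv = compose_hangul(0x1100, 0x1161)
    print(hex(lv))
    print(hex(compose_hangul(lv, 0x11A8)))
```

A composer that only pairs a ccc=0 base with a ccc>0 mark would never attempt either step, since both inputs have combining class 0.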
Behdad Esfahbod bec2ac4fde Bring normalization algorithm closer to the spec
No logical difference so far.
2012-04-07 14:51:17 -04:00
Behdad Esfahbod e02d925786 Flip logic around 2012-04-07 14:49:13 -04:00
Behdad Esfahbod 11138ccff7 Add normalize mode
In preparation for Hangul shaper.
2012-04-05 17:25:19 -04:00
Behdad Esfahbod 6769f21d57 More moving code around 2012-04-05 16:46:46 -04:00
Behdad Esfahbod e3b2e077f5 Typo 2012-03-07 10:21:24 -05:00
Behdad Esfahbod c346671b6b Minor doc fixes 2012-03-06 20:47:50 -05:00
Behdad Esfahbod af913c5788 Fix infinite loop in normalization code with variation selectors
Reported by Jonathan Kew.
2011-10-17 11:39:28 -07:00
Behdad Esfahbod 55deff7595 Add comments 2011-09-28 16:20:09 -04:00
Behdad Esfahbod 947c9a778c Minor 2011-09-16 16:33:18 -04:00
Behdad Esfahbod 36b10f58cc Minor 2011-09-15 16:29:51 -04:00
Behdad Esfahbod c605bbbb6d Remove C++ guards from source files
Were causing issues for people with MSVC.
2011-08-04 20:00:53 -04:00
Behdad Esfahbod 45d6f29f15 [Indic] Reorder matras
Number of failing shape-complex tests goes from 125 down to 94.

Next: Add Ra handling and it's fair to say we kinda support Indic :).
2011-07-30 14:44:30 -04:00
Behdad Esfahbod c311d85208 Keep Unicode props updated as we go so we avoid a scan later 2011-07-23 23:43:54 -04:00
Behdad Esfahbod 5389ff4dbc Implement the Unicode Canonical Composition algorithm
Fallback normalization is complete and working now!
2011-07-22 20:22:49 -04:00
Behdad Esfahbod dcdc51cdc0 Handle singleton decompositions 2011-07-22 17:14:46 -04:00
Behdad Esfahbod 34c22f8168 Implement Unicode Canonical Reordering Algorithm 2011-07-22 17:04:20 -04:00
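
For readers unfamiliar with the Canonical Reordering Algorithm named in this commit, here is a minimal Python sketch. It sorts each run of non-starters by canonical combining class with a stable insertion sort (the sort a later entry in this log switches to). The function name `canonical_reorder` is hypothetical, and the sketch assumes the input is already decomposed; it makes no claim about how HarfBuzz structures its own code.

```python
import unicodedata


def canonical_reorder(text):
    """Stably sort combining marks by ccc; starters (ccc=0) never move."""
    chars = list(text)
    for i in range(1, len(chars)):
        ccc = unicodedata.combining(chars[i])
        if ccc == 0:
            continue  # starters stay in place and act as barriers
        j = i
        # Walk left past marks with a strictly higher class. Using ">" keeps
        # marks of equal class in their original order (stability), and a
        # starter (ccc=0) always stops the walk.
        while j > 0 and unicodedata.combining(chars[j - 1]) > ccc:
            j -= 1
        chars.insert(j, chars.pop(i))
    return "".join(chars)


if __name__ == "__main__":
    # U+0301 has ccc=230 and U+0323 has ccc=220, so they swap.
    s = "a\u0301\u0323"
    print([hex(ord(c)) for c in canonical_reorder(s)])
```

Insertion sort suits this job because runs of marks are short and the sort must be stable.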
Behdad Esfahbod 4ff0d2d9df Decomposition works now! 2011-07-22 16:15:32 -04:00
Behdad Esfahbod 468e9cb25c Move buffer methods into the object 2011-07-22 14:49:14 -04:00
Behdad Esfahbod 45412523dc More normalization kick 2011-07-22 11:07:05 -04:00
Behdad Esfahbod 5d90a342e3 Document normalization design 2011-07-21 15:25:01 -04:00
Behdad Esfahbod d6b9c6d200 More kicking 2011-07-21 12:16:45 -04:00
Behdad Esfahbod 192445aef2 Remove intermittent_glyph()
Let's not worry about performance for now...
2011-07-21 12:13:04 -04:00
Behdad Esfahbod 5c6f5982d7 Towards normalization 2011-07-21 11:31:08 -04:00
Behdad Esfahbod 655586fe5e Towards normalization 2011-07-21 00:52:42 -04:00