Commit Graph

76 Commits

Author SHA1 Message Date
Behdad Esfahbod c52ddab72e [arabic] Make ZWJ prevent ligatures instead of facilitating it
Unicode 6.2.0 Section 16.2 / Figure 16.3 says:

"For backward compatibility, between Arabic characters a ZWJ acts just
like the sequence <ZWJ, ZWNJ, ZWJ>, preventing a ligature from forming
instead of requesting the use of a ligature that would not normally be
used. As a result, there is no plain text mechanism for requesting the
use of a ligature in Arabic text."

As such, we flip internal zwj to zwnj flags for GSUB matching, which
means it will block ligation in all features, unless the font
explicitly matches U+200D glyph.  This doesn't affect joining behavior.
2013-10-16 13:42:38 +02:00
Behdad Esfahbod 580d5eb93a Don't apply 'dlig' by default
Windows 8 doesn't, and the spec will be fixed.
2013-08-04 16:55:21 -04:00
Behdad Esfahbod 127daf15e0 Arabic mark width-zeroing regression
Mozilla Bug 873902 - Display Arabic text with diacritics is bad
https://bugzilla.mozilla.org/show_bug.cgi?id=873902
2013-05-20 09:11:35 -04:00
Behdad Esfahbod f368ba4a9e [Arabic] Zero marks by GDEF, not Unicode category
Testing shows that this is closer to what Uniscribe does.

Reported by Khaled Hosny:

"""
commit 568000274c
...
This commit is causing a regression with Amiri, the string “هَٰذ” with
Uniscribe and HarfBuzz before this commit, gives:

	[uni0630.fina=3+965|uni0670.medi=0+600|uni064E=0@-256,0+0|uni0647.init=0+926]

But now it gives:

	[uni0630.fina=3+965|uni0670.medi=0+0|uni064E=0@-256,0+0|uni0647.init=0+926]

i.e. uni0670.medi is zeroed though it has a base glyph GDEF class.
"""

The test case is U+0647,U+064E,U+0670,U+0630 with Amiri.
2013-04-04 14:25:36 -04:00
Behdad Esfahbod c2a1cdc4c4 [Arabic] Fix shaping of left-joining 'Phags-Pa U+A872
This is the first character in Unicode to have Arabic left-joining
behavior.  Update the machine to recognize that.

Test case: U+A840,U+A872,U+A840.
2013-02-15 09:27:02 -05:00
Behdad Esfahbod ec5448667b Add hb_ot_map_feature_flags_t
Code cleanup.  No (intended) functional change.
2013-02-14 12:53:57 -05:00
Behdad Esfahbod e7ffcfafb1 Clean-up add_bool_feature 2013-02-14 11:58:13 -05:00
Behdad Esfahbod 568000274c Adjust mark advance-width zeroing logic for Myanmar
Before, we were zeroing advance width of attached marks for
non-Indic scripts, and not doing it for Indic.

We have now three different behaviors, which seem to better
reflect what Uniscribe is doing:

  - For Indic, no explicit zeroing happens whatsoever, which
    is the same as before,

  - For Myanmar, zero advance width of glyphs marked as marks
    *in GDEF*, and do that *before* applying GPOS.  This seems
    to be what the new Win8 Myanmar shaper does,

  - For everything else, zero advance width of glyphs that are
    from General_Category=Mn Unicode characters, and do so
    before applying GPOS.  This seems to be what Uniscribe does
    for Latin at least.

With these changes, positioning of all tests matches for Myanmar,
except for the glitch in Uniscribe not applying 'mark'.  See preivous
commit.
2013-02-12 09:44:57 -05:00
Behdad Esfahbod 0beb66e3a6 Fix warnings 2012-12-05 19:14:28 -05:00
Behdad Esfahbod ba82325b7a Add note re 'Phags-pa letter U+A872, which is Joining_Type=L 2012-11-14 15:36:53 -08:00
Behdad Esfahbod 865745b5b8 Don't do fallback positioning for Indic and Thai shapers 2012-11-14 13:48:26 -08:00
Behdad Esfahbod 5669a6cf41 [Arabic] Fix post-context handling
Ouch!
2012-11-13 15:11:51 -08:00
Behdad Esfahbod 0c7df22228 Add buffer flags
New API:

	hb_buffer_flags_t

	HB_BUFFER_FLAGS_DEFAULT
	HB_BUFFER_FLAG_BOT
	HB_BUFFER_FLAG_EOT
	HB_BUFFER_FLAG_PRESERVE_DEFAULT_IGNORABLES

	hb_buffer_set_flags()
	hb_buffer_get_flags()

We use the BOT flag to decide whether to insert dottedcircle if the
first char in the buffer is a combining mark.

The PRESERVE_DEFAULT_IGNORABLES flag prevents removal of characters like
ZWNJ/ZWJ/...
2012-11-13 14:42:35 -08:00
Behdad Esfahbod 0736915b8e [Indic] Decompose Sinhala split matras the way old HarfBuzz / Pango did
Had to do some refactoring to make this happen...

Under uniscribe bug compatibility mode, we still plit them
Uniscrie-style, but Jonathan and I convinced ourselves that there is no
harm doing this the Unicode way.  This change makes that happen, and
unbreaks free Sinhala fonts.
2012-11-13 12:35:35 -08:00
Behdad Esfahbod 4899801155 U+A872 PHAGS-PA SUPERFIXED LETTER RA is "Right"-Joining 2012-11-08 15:08:26 -08:00
Behdad Esfahbod 22a685836a Adjust Mongolian shaping
For U+1880..U+1886 Uniscribe thinks they are non-joining.
For U+1887 Uniscribe thinks it's joining, but looks wrong to me.
For now, match Uniscribe.
2012-11-05 15:20:10 -08:00
Behdad Esfahbod 3ba7bc14ea Implement 'Phags-pa shaping
Through the Arabic shaper.  It's similar to Mongolian.
2012-11-01 20:05:04 -07:00
Behdad Esfahbod 937f8d3871 [Arabic] Enable dlig and mset for Arabic
That's what the spec says, and what Uniscribe does.
2012-10-29 21:49:33 -07:00
Behdad Esfahbod bdc2fc8294 [Arabic] Respect Arabic joining from neighboring context
Now we respect Arabic joining across runs.
2012-09-25 21:32:35 -04:00
Behdad Esfahbod fabd3113a9 [OT] Port Arabic fallback shaping to synthetic GSUB
All of init/medi/fina/isol and rlig implemented.

Let there be dragons... ⻯
2012-09-06 00:51:44 -04:00
Behdad Esfahbod 2bd9fe3598 Refactor 2012-09-04 15:15:19 -04:00
Behdad Esfahbod a5ddd9e31c [OT] Really fix possible NULL dereference this time 2012-09-04 14:55:00 -04:00
Behdad Esfahbod 2fcbbdb41a Port Arabic fallback ligating to share code with GSUB
This will eventually allow us to skip marks, as well as (fallback)
attach marks to ligature components of fallback-shaped Arabic.
That would be pretty cool.  I kludged GDEF props in, so mark-skipping
works, but the produced ligature id/components will be cleared later
by substitute_start() et al.

Perhaps using a synthetic table for Arabic fallback shaping was a better
idea.  The current approach has way too many layering violations...
2012-08-29 14:01:22 -04:00
Behdad Esfahbod dc5df5af6b Revert "Minor"
This reverts commit 3e0a03978b.

I know remember why that line is there :).
2012-08-28 16:31:23 -04:00
Behdad Esfahbod 3e0a03978b Minor 2012-08-27 17:10:02 -04:00
Behdad Esfahbod 2f1747ed7d Add comment 2012-08-16 11:46:46 -04:00
Behdad Esfahbod bd08d5d126 [OT] Fix Arabic shaper OOB access
https://bugzilla.mozilla.org/show_bug.cgi?id=782908
2012-08-16 11:35:50 -04:00
Behdad Esfahbod 9f9f04c222 [OT] Unbreak Thai shaping and fallback Arabic shaping
The merger of normalizer and glyph-mapping broke shapers that
modified text stream.  Unbreak them by adding a new preprocess_text
shaping stage that happens before normalizing/cmap and disallow
setup_mask modification of actual text.
2012-08-11 18:34:13 -04:00
Behdad Esfahbod e9f28a38f5 [OT] Add shape_plan to Arabic shaper 2012-08-11 18:20:54 -04:00
Behdad Esfahbod cd0c6e148f Shuffle buffer variable allocations around
To room for more allocations, coming.
2012-08-09 21:48:55 -04:00
Behdad Esfahbod a8c6da90f4 [OT] Add per-complex-shaper shape_plan data
Hookup some Indic data to it.  More to come.
2012-08-02 10:46:34 -04:00
Behdad Esfahbod 3e38c0f288 More massaging 2012-08-02 09:44:18 -04:00
Behdad Esfahbod 16c6a27b4b [OT] Port complex_shaper to planner/plan 2012-08-02 09:38:28 -04:00
Behdad Esfahbod 8fbfda920e Inline font getters 2012-08-01 19:03:46 -04:00
Behdad Esfahbod 1e7d860613 [GPOS] Adjust mark advance-width zeroing logic
If there is no GPOS, zero mark advances.

If there *is* GPOS and the shaper requests so, zero mark advances for
attached marks.

Fixes regression with Tibetan, where the font has GPOS, and marks a
glyph as mark where it shouldn't get zero advance.
2012-07-31 23:41:06 -04:00
Behdad Esfahbod 2bc3b9a616 [OT] Zero mark advances if the shaper desires so
Enabled for all shapers except for Indic.
2012-07-31 23:17:22 -04:00
Behdad Esfahbod 693918ef85 [OT] Streamline complex shaper enumeration
Add a shaper class struct.
2012-07-30 21:08:51 -04:00
Behdad Esfahbod 7e34601ded Unbreak Hangul jamo composition
When we removed the separate Hangul shaper, the specific normalization
preference of Hangul was lost.  Fix that.  Also, the Thai shaper was
copied from Hangul, so had the fully-composed normalization behavior,
which was unnecessary.  So, fix that too.
2012-07-30 14:53:41 -04:00
Behdad Esfahbod d96838ef95 Allow complex shapers overriding common features
In a new callback...  Currently unused by all complex shapers.
2012-07-16 20:26:57 -04:00
Behdad Esfahbod 9f377ed321 Fix more unused-var warnings 2012-05-13 16:13:44 +02:00
Behdad Esfahbod 99c2695759 Add accessort to buffer for current info, current pos, and prev info 2012-05-13 15:45:18 +02:00
Behdad Esfahbod ef24cc8c8e [Indic] Towards multi-cluster syllables and final reordering 2012-05-09 18:10:20 +02:00
Behdad Esfahbod d1deaa2f5b Replace zerowidth invisible chars with a zero-advance space glyph
Like Uniscribe does.
2012-05-09 15:04:13 +02:00
Behdad Esfahbod 8e3715f8a1 Minor 2012-04-23 22:18:54 -04:00
Behdad Esfahbod 4a1e02ef79 Fix shape to presentation forms font check
As reported by Jonathan Kew on the list.
2012-04-11 14:37:53 -04:00
Behdad Esfahbod acd88e659f In Arabic fallback shaping, check that the font has glyph for new char 2012-04-10 18:02:20 -04:00
Behdad Esfahbod 939c010211 Implement Arabic fallback shaping mandatory ligatures 2012-04-10 17:20:05 -04:00
Behdad Esfahbod b7d04eb606 Do Arabic fallback shaping 2012-04-10 16:44:38 -04:00
Behdad Esfahbod 11138ccff7 Add normalize mode
In preparation for Hangul shaper.
2012-04-05 17:25:19 -04:00
Behdad Esfahbod a4cbd03dd1 Apply 'locl' with 'ccmp' in Arabic shaper
According to Peter Constable this is indeed what Uniscribe has been
doing for years.

Mozilla Bug 667166 - wrong shape of letter when it comes at the end of
word in the arabic version of Firefox 5.0
2011-08-15 09:52:05 +02:00