Commit Graph

2816 Commits

Author SHA1 Message Date
Behdad Esfahbod dc61294aa9 [unicode7] Add missing ISO 15924 tags 2014-06-18 12:22:45 -04:00
Behdad Esfahbod 7526373e70 [coretext] Remove unused var 2014-06-17 11:45:26 -04:00
Jonathan Kew 798e4185bc When zeroing mark widths for LTR, also adjust offset...
...so that they overstrike preceding glyph.

https://github.com/behdad/harfbuzz/pull/43
2014-06-12 18:34:15 -04:00
Jonathan Kew 80f7405a52 [Thai] set the correct general category on Nikhahit when decomposing Sara-Am. 2014-06-12 18:25:58 -04:00
Behdad Esfahbod 1d634cbb4b Fix base-position when 'pref' is NOT formed
If pre-base reordering Ra is NOT formed (or formed and then
broken up), we should consider that Ra as base.  This is
observable when there's a left matra or dotreph that positions
before base.

Now, it might be that we shouldn't do this if the Ra happened
to form a below form.  We can't quite deduce that right now...

Micro test added.  Also at:

https://code.google.com/a/google.com/p/noto-alpha/issues/detail?id=186#c29
2014-06-12 17:10:35 -04:00
Behdad Esfahbod 04dc52fa15 [indic] Recover OT_H undergone ligation and multiplication
Sometimes font designers form half/pref/etc consonant forms
unconditionally and then undo that conditionally.  Try to
recover the OT_H classification in those cases.

No test number changes expected.
2014-06-09 14:20:06 -04:00
Behdad Esfahbod 39c8201f8e [indic] Improve base re-finding
No test numbers change.
2014-06-09 14:20:06 -04:00
Behdad Esfahbod c04d5f0dd2 [indic] Minor 2014-06-09 14:20:06 -04:00
Behdad Esfahbod 832a6f99b3 [indic] Don't reorder reph/pref if ligature was expanded
Normally if you want to, say, conditionally prevent a 'pref', you
would use blocking contextual matching.  Some designers instead
form the 'pref' form, then undo it in context.  To detect that
we now also remember glyphs that went through MultipleSubst.

In the only place that this is used, Uniscribe seems to only care
about the "last" transformation between Ligature and Multiple
substitutions.  I.e., if you ligate, expand, and ligate again, it
moves the pref, but if you ligate and expand it doesn't.  That's
why we clear the MULTIPLIED bit when setting LIGATED.

Micro-test added.  Test: U+0D2F,0D4D,0D30 with font from:

[1]
https://code.google.com/a/google.com/p/noto-alpha/issues/detail?id=186#c29
2014-06-05 20:36:01 -04:00
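
The "last transformation wins" bookkeeping described in the commit above can be sketched roughly as follows. This is only an illustration: the flag names, their values, and the helper functions are hypothetical and are not taken from the HarfBuzz source.

```c
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
enum {
  GLYPH_LIGATED    = 1u << 0,
  GLYPH_MULTIPLIED = 1u << 1
};

/* Record that the glyph came out of a Ligature substitution.
 * Clearing MULTIPLIED keeps only the most recent transformation. */
static void mark_ligated (uint8_t *props)
{
  *props |= GLYPH_LIGATED;
  *props &= (uint8_t) ~GLYPH_MULTIPLIED;
}

/* Record that the glyph came out of a Multiple substitution. */
static void mark_multiplied (uint8_t *props)
{
  *props |= GLYPH_MULTIPLIED;
}

/* True if the glyph was ligated and not expanded afterwards. */
static int ligated_and_not_multiplied (uint8_t props)
{
  return (props & GLYPH_LIGATED) && !(props & GLYPH_MULTIPLIED);
}
```

Under these assumptions, a glyph that is ligated and then expanded fails the check, while one that is ligated, expanded, and ligated again passes it, which matches the behaviour the message attributes to Uniscribe.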
Behdad Esfahbod b5be231720 [gsub] Adjust single-length ligature subst to act like single subst 2014-06-05 19:00:22 -04:00
Behdad Esfahbod aae69451df [gsub] Minor shuffling 2014-06-05 19:00:08 -04:00
Behdad Esfahbod b6b304f12b [ot] Add TODO re zero-len MultipleSubst sequences 2014-06-05 17:12:54 -04:00
Behdad Esfahbod f1a72fe7bf [ot-font] Fix cmap EncodingRecord cmp order 2014-06-04 19:03:16 -04:00
Behdad Esfahbod ce34f0b07e [ot-font] Use binary search for format12 cmap subtable 2014-06-04 18:57:46 -04:00
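
As a rough illustration of the idea in the commit above: a format 12 cmap subtable stores sorted groups of (start code point, end code point, start glyph ID), so a lookup can bisect over the groups. The struct and function names below are hypothetical, and the sketch assumes the groups have already been decoded to host byte order; it is not the code from the commit.

```c
#include <stddef.h>
#include <stdint.h>

/* One format 12 group, already decoded to host byte order (assumption). */
typedef struct {
  uint32_t start_char;   /* first code point in the group */
  uint32_t end_char;     /* last code point in the group, inclusive */
  uint32_t start_glyph;  /* glyph ID mapped to start_char */
} cmap12_group;

/* Return the glyph for `cp`, or 0 (.notdef) if no group covers it.
 * Assumes groups are sorted by start_char and do not overlap. */
static uint32_t cmap12_lookup (const cmap12_group *groups, size_t count,
                               uint32_t cp)
{
  size_t lo = 0, hi = count;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (cp < groups[mid].start_char)
      hi = mid;                      /* search the lower half */
    else if (cp > groups[mid].end_char)
      lo = mid + 1;                  /* search the upper half */
    else
      return groups[mid].start_glyph + (cp - groups[mid].start_char);
  }
  return 0;
}
```

The half-open `[lo, hi)` interval avoids signed index arithmetic and handles an empty group array without a special case.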
Behdad Esfahbod 257d1adfa1 [ot-font] Work around broken cmap subtable format 4 length
Roboto was hitting this.  FreeType also has pretty much the
same code for this, in ttcmap.c:tt_cmap4_validate():

    /* in certain fonts, the `length' field is invalid and goes */
    /* out of bound.  We try to correct this here...            */
    if ( table + length > valid->limit )
    {
      if ( valid->level >= FT_VALIDATE_TIGHT )
        FT_INVALID_TOO_SHORT;

      length = (FT_UInt)( valid->limit - table );
    }
2014-06-04 18:47:55 -04:00
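
The same clamping idea can be sketched outside FreeType as follows. The function name and parameters are hypothetical and work on plain byte offsets; this is a minimal illustration of the workaround, not the HarfBuzz implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the number of bytes of a subtable that can safely be read.
 * `offset` is where the subtable starts inside the font blob,
 * `declared_length` is the (possibly bogus) length field, and
 * `blob_size` is the total size of the blob.  All names are
 * hypothetical. */
static size_t clamp_subtable_length (size_t offset, size_t declared_length,
                                     size_t blob_size)
{
  if (offset > blob_size)
    return 0;                         /* subtable starts out of bounds */

  size_t available = blob_size - offset;
  return declared_length < available ? declared_length : available;
}
```

Comparing against `blob_size - offset` rather than computing `offset + declared_length` avoids an integer overflow when the declared length is very large.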
Behdad Esfahbod 51f563579b Move try_set to sanitize context 2014-06-04 18:42:32 -04:00
Behdad Esfahbod 500737e8e1 [ot-font] Don't select a Null cmap subtable
Can happen either in broken fonts, or as a result of sanitize().
2014-06-04 18:17:29 -04:00
Behdad Esfahbod dac86026a6 Fix some cppcheck warnings
Bug 77800 - cppcheck reports
2014-06-03 17:57:00 -04:00
Behdad Esfahbod c306410cab Bug 77732 - Fix unused typedef warning for ASSERT_STATIC with GCC 4.8 2014-06-03 17:00:07 -04:00
Behdad Esfahbod ae2b854eab Move code around 2014-06-03 16:59:09 -04:00
Behdad Esfahbod 17c3b809f4 [indic] Treat U+A8E0..A8F1 as OT_A instead of OT_VD
Apparently they can intermix with other OT_A.

Test: U+0915,A8E2,1CD0
2014-06-02 15:08:18 -04:00
Behdad Esfahbod 6ae13f257c [graphite2] Fix cluster mapping
Patch from Martin Hosken.  I expect this to fix the following bugs:

https://bugs.freedesktop.org/show_bug.cgi?id=75076
https://bugzilla.gnome.org/show_bug.cgi?id=723582
https://bugzilla.redhat.com/show_bug.cgi?id=998812
2014-05-30 17:38:14 -04:00
Behdad Esfahbod 7977ca17aa [indic] Allow decimal and Brahmi digits as placeholders
Tests: U+0967,0951 U+0031,093F
2014-05-29 15:34:26 -04:00
Behdad Esfahbod e8b5d64039 [indic] Do NOT allow reph formation on placeholders
Only allow it on DOTTED CIRCLE.  No effect on test numbers.

Test: U+0930,094D,00A0
2014-05-29 15:20:15 -04:00
Behdad Esfahbod 52b562a6a0 [indic] Clean up a bit
No functional change intended.
2014-05-27 18:19:52 -04:00
Behdad Esfahbod 3bf652b907 [indic] Treat U+002D and U+2010..2014 as placeholders 2014-05-27 18:07:26 -04:00
Behdad Esfahbod e0de95f402 [indic] Treat U+00D7 MULTIPLICATION SIGN as placeholder 2014-05-27 17:58:34 -04:00
Behdad Esfahbod cf78dd483c [indic/myanmar] Rename OT_NBSP to OT_PLACEHOLDER 2014-05-27 17:53:37 -04:00
Behdad Esfahbod 186ece94c8 [myanmar] Use OT_NBSP instead of OT_DOTTEDCIRCLE for OT_GB
No functional change.
2014-05-27 17:49:45 -04:00
Behdad Esfahbod cf71d28c38 [indic/myanmar] Refactor a few macros 2014-05-27 17:47:43 -04:00
Behdad Esfahbod 2307268e01 [indic] Treat U+0A72..0A73 like regular consonants
Unicode 6.x IndicSyllableCategory categorizes them as
placeholders, but they can subjoin.
2014-05-27 17:39:01 -04:00
Behdad Esfahbod e9b2a4cfe5 [indic] Support U+1CED 2014-05-23 15:49:10 -04:00
Behdad Esfahbod d19f8e8570 [indic] Support U+A8F2..A8F7,1CE9..1CEC,1CEE..1CF1 2014-05-23 15:47:36 -04:00
Behdad Esfahbod ddbdfcbf1c [indic] Simplify grammar
No functional change.
2014-05-23 15:39:55 -04:00
Behdad Esfahbod 4e9b1f662b [indic] Always start new syllable for Avagraha
In fact, the previous grammar was ambiguous.  No functional
change.
2014-05-23 15:38:42 -04:00
Behdad Esfahbod 9f9bd9bf31 [indic] Rename avagraha cluster to symbol cluster
In anticipation of adding more characters to that class of clusters.
2014-05-23 15:35:38 -04:00
Behdad Esfahbod a498565ced [indic] Support U+1CF2,U+1CF3 2014-05-22 19:39:56 -04:00
Behdad Esfahbod ecb98babba [indic] Support U+1CE2..U+1CE8 2014-05-22 19:36:21 -04:00
Behdad Esfahbod 37bf2c9224 Minor 2014-05-22 19:35:17 -04:00
Behdad Esfahbod 131e17ff9a [indic] Support U+1CF5,1CF6 2014-05-22 19:33:10 -04:00
Behdad Esfahbod 72ead0cc72 [indic] Treat U+1CE1 as a tone-mark too
It's spacing, but otherwise the same as the other ones.
2014-05-22 19:12:10 -04:00
Behdad Esfahbod e848bfae7c [indic] Recategorize U+A8E0..A8F1 as OT_VD
Up to two of them come after all OT_A characters.
2014-05-22 18:50:34 -04:00
Behdad Esfahbod c519536c34 [indic] Allow up to three tone marks
According to Roozbeh, there are valid combinations in Unicode
proposals for up to three.  Previously we were allowing up to two.
2014-05-22 18:43:14 -04:00
Behdad Esfahbod c11fc68339 [indic] Support more extended Devanagari tone marks
Also adjust U+0953,0954 handling.
2014-05-22 18:41:49 -04:00
Behdad Esfahbod 26c836e53d [indic] Handle "Cantillation marks for the Samaveda" 2014-05-21 18:35:48 -04:00
Behdad Esfahbod 29531128f2 [indic] Improve reph formation of Sinhala and Telugu
Sinhala and Telugu use "explicit" reph.  That is, the reph is formed by
a Ra,H,ZWJ sequence.  Previously, upon detecting this sequence, we were
checking whether the 'rphf' feature applies to the first two
glyphs of the sequence.  This is how the Microsoft fonts are designed.
However, testing with Noto shows that apparently Uniscribe also forms
the reph if the lookup ligates all three glyphs.  So, try both
sequences.

Doesn't affect test results for Sinhala or Telugu.

https://code.google.com/a/google.com/p/noto-alpha/issues/detail?id=232
2014-05-15 14:04:02 -06:00
Oleg Oshmyan 8c703f13bf Fix build with --coretext on older OS X
Fixes https://github.com/behdad/harfbuzz/pull/40
2014-05-14 17:42:20 -06:00
Behdad Esfahbod 439b05867c [myanmar] Allow MedialYa+Asat in the grammar
The grammar in the OT spec and the existing Windows implementation
seem to be confused around where to allow Asat around the medial
consonants.

The previous grammar for medial group was allowing an Asat after
the medial group only if there was a medial Wa or Ha, but not if
there was only a medial Ya.  This doesn't make sense to me and
sounds reversed, as both medial Wa and Ha are below marks while
Asat is an above mark.  An Asat can come before the medial group
already (in fact, multiple ones can.  Why?!).  The medial Ya
however is a spacing mark and according to Roozbeh it's valid
to want an Asat on the medial Ya instead of the base, so it looks
to me like we want to allow an Asat after the medial group if
there *was* a Ya but not if there wasn't any.  Not wanting to
produce dotted-circle where Windows is not, this commit changes
the grammar to allow one Asat after the medial group no matter
what comes in the group.

Test: U+1002,103A,103B vs U+1002,103B,103A
2014-05-14 16:44:39 -06:00
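
A heavily simplified matcher for the relaxed rule described above might look like the following. It covers only the part of a syllable that follows the base consonant; the function name and the assumed order of the medials (Ya, Ra, Wa, Ha) are illustrative assumptions, and the real grammar handles many more classes.

```c
#include <stddef.h>
#include <stdint.h>

enum {
  ASAT      = 0x103A,
  MEDIAL_YA = 0x103B,
  MEDIAL_RA = 0x103C,
  MEDIAL_WA = 0x103D,
  MEDIAL_HA = 0x103E
};

/* Return how many code points after the base are consumed by the
 * simplified pattern:
 *   Asat* MedialYa? MedialRa? MedialWa? MedialHa? Asat?
 * The trailing Asat is allowed no matter which medials were present. */
static size_t match_medial_tail (const uint32_t *cp, size_t len)
{
  size_t i = 0;

  while (i < len && cp[i] == ASAT) i++;       /* Asats before the group */

  if (i < len && cp[i] == MEDIAL_YA) i++;
  if (i < len && cp[i] == MEDIAL_RA) i++;
  if (i < len && cp[i] == MEDIAL_WA) i++;
  if (i < len && cp[i] == MEDIAL_HA) i++;

  if (i < len && cp[i] == ASAT) i++;          /* one Asat after the group */

  return i;
}
```

With this sketch, both test sequences from the commit message (Asat before medial Ya, and medial Ya before Asat) are fully consumed after the base U+1002, while a second trailing Asat is left unmatched.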
Behdad Esfahbod c95587618c [ot] Minor note re cmap subtable format 2 and 8 2014-05-14 00:42:18 -04:00
Behdad Esfahbod b7878cd58e [ot] Implement cmap subtable format 0 2014-05-13 21:47:51 -04:00