Commit Graph

2229 Commits

Author SHA1 Message Date
Behdad Esfahbod dde5506fd9 [Indic] Don't move virama with left matra
This is important for the Sinhala U+0DDA split matra since it decomposes
to U+0DD9,U+0DCA where U+0DD9 is a left matra and U+0DCA is the virama.
We don't want to move the virama with the left matra.
TEST: U+0D9A,U+0DDA

Note that we were already doing this in the Uniscribe bug compatibility
mode.  We now do it all the time.
2012-11-14 11:37:04 -08:00
Behdad Esfahbod 92f9bfed42 Minor 2012-11-13 16:50:45 -08:00
Behdad Esfahbod 66ac2ff32e API change: Remove "mask" from hb_buffer_add()
I don't expect anybody using hb_buffer_add(), so this shouldn't break
anyone's code.
2012-11-13 16:26:32 -08:00
Behdad Esfahbod e13f8d280b Fix UTF-8 backward iteration
Ouch!
2012-11-13 15:12:06 -08:00
Behdad Esfahbod 5669a6cf41 [Arabic] Fix post-context handling
Ouch!
2012-11-13 15:11:51 -08:00
Behdad Esfahbod 0c7df22228 Add buffer flags
New API:

	hb_buffer_flags_t

	HB_BUFFER_FLAGS_DEFAULT
	HB_BUFFER_FLAG_BOT
	HB_BUFFER_FLAG_EOT
	HB_BUFFER_FLAG_PRESERVE_DEFAULT_IGNORABLES

	hb_buffer_set_flags()
	hb_buffer_get_flags()

We use the BOT flag to decide whether to insert dottedcircle if the
first char in the buffer is a combining mark.

The PRESERVE_DEFAULT_IGNORABLES flag prevents removal of characters like
ZWNJ/ZWJ/...
2012-11-13 14:42:35 -08:00
Behdad Esfahbod 1c7e55511a Minor fix
Ouch
2012-11-13 14:42:22 -08:00
Behdad Esfahbod 82ecaff736 Add hb_buffer_clear()
Which is like _reset(), but does NOT clear unicode-funcs.
2012-11-13 14:10:00 -08:00
Behdad Esfahbod 0736915b8e [Indic] Decompose Sinhala split matras the way old HarfBuzz / Pango did
Had to do some refactoring to make this happen...

Under uniscribe bug compatibility mode, we still plit them
Uniscrie-style, but Jonathan and I convinced ourselves that there is no
harm doing this the Unicode way.  This change makes that happen, and
unbreaks free Sinhala fonts.
2012-11-13 12:35:35 -08:00
Behdad Esfahbod 6fd5335622 [Indic] Update auto-generated Indic machine to reflect previous commit 2012-11-12 18:42:18 -08:00
Behdad Esfahbod 9cac1338c4 [Indic] Allow Consonant_Medial's after Consonant's
Mostly affects Myanmar, but also Tai Tham, Javanese, and Cham.  The
latter three are untested (no fonts!).
2012-11-12 18:41:22 -08:00
Behdad Esfahbod d187099cba [Indic] Categorize Myanmar "tone marks" as nuktas 2012-11-12 18:38:06 -08:00
Behdad Esfahbod 8173f23f3f [Indic] Add config for Myanmar 2012-11-12 18:37:20 -08:00
Behdad Esfahbod 9e92978c8a [Indic] Route "new" Myanmar tag through the Indic shaper
Windows 8 adds a Myanmar shaper using the 'mym2' tag.  Route that
through the Indic shaper.  It's still very broken, but at least this
does NOT break old-style Myanmar shaping using the generic shaper.
2012-11-12 18:36:10 -08:00
Behdad Esfahbod 5ab3855f81 Choose shaper based on chosen OT script tag
For Arabic and Indic shapers, if the font doesn't have a script system
for the script, use default shaper.

Make an exception for Arabic script since we have fallback logic for
that one.
2012-11-12 18:27:42 -08:00
Behdad Esfahbod 9b37b4c580 Make planner available to complex shaper choosing logic 2012-11-12 18:23:38 -08:00
Behdad Esfahbod 6fddf2d739 Refactoring ot-map building to make chosen script available earlier 2012-11-12 18:03:07 -08:00
Behdad Esfahbod de796a6fb9 Add "new" Myanmar OT Script tag
Windows 8 added support for Myanmar shaping using the "mym2" script tag,
even though Windows never supported the old "mymr" tag.
2012-11-12 17:27:51 -08:00
Behdad Esfahbod e9334ce97b Break build when ragel is needed and missing 2012-11-12 14:57:02 -08:00
Behdad Esfahbod dba186711e [Indic] Make more room in the table
To be used in upcoming commits.
2012-11-12 14:48:33 -08:00
Behdad Esfahbod c4be991743 Typo 2012-11-12 14:27:33 -08:00
Behdad Esfahbod 56be677781 [Indic] Port 'pref' logic to look into font tables
...instead of using a hardcoded list of Ra characters.
2012-11-12 14:09:40 -08:00
Behdad Esfahbod f2c0f59043 [Indic] Port reph handling logic to look into font features
...instead of using a hardcoded list of Ra characters.
2012-11-12 14:02:02 -08:00
Behdad Esfahbod 43149afbc0 Route MEETEI_MAYEK through the Indic shaper
Since it has a couple of left-"matras".
2012-11-12 13:34:17 -08:00
Behdad Esfahbod d0905c3400 Minor 2012-11-12 13:03:52 -08:00
Behdad Esfahbod 365f27ab5b Work around older compilers
As reported on the list:

I am seeing a similar problem building harfbuzz 0.9.5 with Apple gcc
4.0.1 on OS X 10.5 Leopard:

hb-ot-layout-common-private.hh:406: error: 'struct
OT::CoverageFormat1::Iter' is private
hb-ot-layout-common-private.hh:646: error: within this context
hb-ot-layout-common-private.hh:500: error: 'struct
OT::CoverageFormat2::Iter' is private
hb-ot-layout-common-private.hh:647: error: within this context
make[4]: *** [libharfbuzz_la-hb-ot-layout.lo] Error 1

Also reported as happening with MSVC 2005.
2012-11-12 11:16:57 -08:00
Behdad Esfahbod 6b389ddc36 [Indic] Don't apply 'liga'
Uniscribe doesn't.  And some fonts abuse this feature to get Indic
shaping working in non-complex applications like Adobe's apps.

No change in numbers:

BENGALI: 353897 out of 354188 tests passed. 291 failed (0.0821598%)
DEVANAGARI: 707337 out of 707394 tests passed. 57 failed (0.00805774%)
GUJARATI: 366440 out of 366457 tests passed. 17 failed (0.00463902%)
GURMUKHI: 60704 out of 60747 tests passed. 43 failed (0.0707854%)
KANNADA: 951046 out of 951913 tests passed. 867 failed (0.0910798%)
KHMER: 299074 out of 299124 tests passed. 50 failed (0.0167155%)
LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
MALAYALAM: 1048011 out of 1048334 tests passed. 323 failed (0.0308108%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271666 out of 271847 tests passed. 181 failed (0.0665816%)
TAMIL: 1091754 out of 1091754 tests passed. 0 failed (0%)
TELUGU: 970557 out of 970573 tests passed. 16 failed (0.00164851%)
TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
2012-11-12 11:02:56 -08:00
Behdad Esfahbod d05ac7dc3f Fix hb-ft glyph name for broken fonts that return empty glyph names 2012-11-12 10:26:50 -08:00
Behdad Esfahbod 4899801155 U+A872 PHAGS-PA SUPERFIXED LETTER RA is "Right"-Joining 2012-11-08 15:08:26 -08:00
Behdad Esfahbod 22a685836a Adjust Mongolian shaping
For U+1880..U+1886 Uniscribe thinks they are non-joining.
For U+1887 Uniscribe thinks it's joining, but looks wrong to me.
For now, match Uniscribe.
2012-11-05 15:20:10 -08:00
Behdad Esfahbod c26a52fbe6 Minor 2012-11-04 16:48:45 -08:00
Behdad Esfahbod f60d3ed35d Minor 2012-11-04 16:44:47 -08:00
Behdad Esfahbod 10a33296e6 Minor 2012-11-02 13:38:55 -07:00
Behdad Esfahbod 3ba7bc14ea Implement 'Phags-pa shaping
Through the Arabic shaper.  It's similar to Mongolian.
2012-11-01 20:05:04 -07:00
Behdad Esfahbod da70111ab2 Don't clear buffer pre-context if no new context is being provided
Patch from Jonathan Kew.

Part of fixing:

Mozilla Bug 801410 - avoid inserting dotted-circle for run-initial
Unicode combining characters in "simple" scripts such as Latin

https://bugzilla.mozilla.org/show_bug.cgi?id=801410
2012-10-31 13:45:30 -07:00
Behdad Esfahbod 0bc7a38463 [OT] Fix ReverseChainingSubst
We should make it clear that we don't want output buffer in this case,
otherwise buffer->backtrack_len() would be wrong.
2012-10-29 22:02:45 -07:00
Behdad Esfahbod 2616689d15 More tracing fixups 2012-10-29 21:51:56 -07:00
Behdad Esfahbod 937f8d3871 [Arabic] Enable dlig and mset for Arabic
That's what the spec says, and what Uniscribe does.
2012-10-29 21:49:33 -07:00
Behdad Esfahbod bc513add79 Add missing TRACE_RETURN 2012-10-29 19:03:55 -07:00
Behdad Esfahbod 88d3c98e30 [Indic] Position pre-base reordering Ra after Chillus in Malayalam
The logic for pre-base reordering follows the left matra logic.
We had an exception for Malayalam/Tamil in the left matra repositioning
which was not reflected in pre-base reordering.

Malayalam failures down from 337 to 323.

BENGALI: 353996 out of 354285 tests passed. 289 failed (0.0815727%)
DEVANAGARI: 707339 out of 707394 tests passed. 55 failed (0.00777502%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60769 out of 60809 tests passed. 40 failed (0.0657797%)
KANNADA: 951086 out of 951913 tests passed. 827 failed (0.0868777%)
KHMER: 299106 out of 299124 tests passed. 18 failed (0.00601757%)
LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
MALAYALAM: 1048011 out of 1048334 tests passed. 323 failed (0.0308108%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271726 out of 271847 tests passed. 121 failed (0.0445103%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970558 out of 970573 tests passed. 15 failed (0.00154548%)
TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
2012-10-29 16:46:44 -07:00
Behdad Esfahbod 21bf796954 Add missed file 2012-10-29 14:21:09 -07:00
Behdad Esfahbod 02ed52169a Improve license information 2012-10-28 21:26:19 -07:00
Behdad Esfahbod 4c1d924461 Minor 2012-10-28 20:27:25 -07:00
Behdad Esfahbod 38b015e57f Fix hb_buffer_set_length(buffer, 0)
Was causing invalid realloc()s.
2012-10-28 20:11:47 -07:00
Behdad Esfahbod b7115b63be Add XXX 2012-10-28 20:11:42 -07:00
Behdad Esfahbod 71ee1f2450 Port to ICU LayoutEngine C API
Incidentally, this makes it not crash with icu-le-hb anymore...
I'm not smart / stupid enough to spend two more days debugging C++
linking issues, and this is ABI-stable at least.
2012-10-28 19:18:11 -07:00
Behdad Esfahbod 0144f05e57 Remove unused members 2012-10-26 13:48:06 -07:00
Behdad Esfahbod cf3afd8979 Rename and revamp is_zero_width() to be is_default_ignorable()
That's really the logic desired.  Except that MONGOLIAN VOWEL SEPARATOR
is not default_ignorable but it really should be.  Reported to Unicode.

Based on suggestion from Konstantin Ritt.
2012-10-25 16:32:54 -07:00
Behdad Esfahbod fecdfa95da Fixup hb_ot_shape_closure()
Broke it when merged cmap mapping and normalizer.  Ouch!
2012-10-07 17:19:58 -04:00
Behdad Esfahbod 2d1dcb3ce3 Mark debug message functions static 2012-10-07 17:13:46 -04:00