Commit Graph

1914 Commits

Author SHA1 Message Date
Behdad Esfahbod d0e68dbd0b [Indic] Implement reph positioning step 5
Not tuned, just copied from step 2.  Fixes another 0.5% of Kannada
failures.  1% to go.
2012-07-20 11:25:41 -04:00
Behdad Esfahbod a9e45c32e4 [Indic] Don't let ZWNJ at the end of syllable affect base search
Fixes a few Devanagari, half of remaining Kannada failures, quarter for
Telugu, and others slightly improved or unchanged.
2012-07-20 11:04:15 -04:00
Behdad Esfahbod 20b68e699f [Indic] Apply 'cjct' globally
Fixes 5 Devanagari failures, and no regressions.
2012-07-20 10:47:46 -04:00
Behdad Esfahbod 51e764de44 [Indic] Unbreak old scriptures
Brings down failures with Lohit-Telugu from 57% to 1.40%.
2012-07-20 10:30:24 -04:00
Behdad Esfahbod 900cf3d449 Minor 2012-07-20 10:18:23 -04:00
Behdad Esfahbod 87cd63266e [Indic] Recategorize some Kannada right matras
Kannada failures down from 3.5% to 2.93%.
2012-07-19 21:25:46 -04:00
Behdad Esfahbod 3604d64ced [Indic] Recategorize GURMUKHI ADDAK
It's not in IndicSyllabicCategory.txt.  Fixes most of Gurmukhi failures.
Failures down from 7.7% to 0.222%!
2012-07-19 21:13:04 -04:00
Behdad Esfahbod 8932858123 Minor 2012-07-19 21:02:38 -04:00
Behdad Esfahbod 47ef931f13 [buffer] Make sure out_info = info during GPOS 2012-07-19 20:52:44 -04:00
Behdad Esfahbod ae63cf2062 Print line number during return when tracing 2012-07-19 20:45:41 -04:00
Behdad Esfahbod 5249f3aee1 [Indic] Unbreak Khmer
For Khmer, all consonants are subjoining.  No need to look in the font.
We were looking in the wrong order anyway.
2012-07-19 20:30:22 -04:00
Behdad Esfahbod e0475345d5 [Indic] Apply 'akhn' globally
Fixes 1.5% more failures for Telugu, 2% for Kannada.
Breaks one test in Devanagari.
2012-07-19 20:24:14 -04:00
Behdad Esfahbod fa247ebe52 [Indic] Better position U+0CD5
Fixes another 5% of Kannada failures.
2012-07-19 19:52:19 -04:00
Behdad Esfahbod f055442716 [Indic] Lookup consonant position in the font
Fixes most failures of Oriya, and improves others a bit.
2012-07-19 16:20:21 -04:00
Behdad Esfahbod 74d1d88781 [GSUB] Fix would_apply() for LigatureSubst 2012-07-19 16:14:23 -04:00
Behdad Esfahbod be73a5f936 Add src/test-would-substitute tool 2012-07-19 15:12:18 -04:00
Behdad Esfahbod e72b360ac6 Refactor / finish would_apply() operation
Untested.
2012-07-19 14:44:46 -04:00
Behdad Esfahbod 8c973ebf0f [Indic] Implement per-script matra positioning
Following what the spec says.

Brings down Telugu failures from 40% to 3.75%, and Kannada failures from
44% to 10%.  Does NOT affect other scripts' test results.
2012-07-19 13:25:08 -04:00
Behdad Esfahbod 8bb32458f9 [Indic] More refactoring 2012-07-19 13:04:44 -04:00
Behdad Esfahbod 9ccc6382ba [Indic] Minor refactoring 2012-07-19 12:45:31 -04:00
Behdad Esfahbod f83aaa3133 [Indic] Minor 2012-07-19 12:23:23 -04:00
Behdad Esfahbod be8b9f5f71 [Indic] Start refactoring different matra positions per script 2012-07-19 12:11:12 -04:00
Behdad Esfahbod b01d9b3d90 [Indic] Disallow decomposition of a couple characters
This is a hack for now.  Will be fixed when we do complex-shaper-driven
normalization properly.

The results with or without decomposition are the same, but Uniscribe
does not normalize, so this matches better.
2012-07-19 11:25:49 -04:00
Behdad Esfahbod 422ecd2d3c [Indic] Accept a forced Rakar sequence at the end of syllable
In Sinhala, Rakar is formed by Al-Lakuna,ZWJ,Ra.  If you put that at the
end of a Consonant,Matra syllable, you get a dotted-circle from
Uniscribe.  Apparently adding a ZWJ before the Al-Lakuna "fixes" that.
And people have been encoding that sequence...  So, allow a forced
"ZWJ,Virama,ZWJ,Ra" sequence at the of syllables.

Fixes some 100 or more of Sinhala failures.  Now at 622 only (0.23%).
2012-07-18 23:25:58 -04:00
Behdad Esfahbod 6fc1732003 [Indic] Allow joiners on both sides of Halant at the same time
The sequence <ZWJ,Al-Lakuna,ZWJ> is used in Sinhala to explicitly ask
for Rakar.  Fixes two-thousand Sinhala tests.  Not many left.
2012-07-18 17:49:19 -04:00
Behdad Esfahbod 10cdc94eee [Indic] In final reordering, find base, even if it disappeared
POS_BASE can disappear if base ligated backward.  Define base as last
with position not after base.

Fixes a few hundred of Sinhala failures with Iskoola Pota.
2012-07-18 17:43:23 -04:00
Behdad Esfahbod 9c4d24a3a6 [Indic] Minor 2012-07-18 17:29:10 -04:00
Behdad Esfahbod 3285e107c9 [Indic] Implement Sinhala "Al Lakuna" Reph behavior
In Sinhala, Reph is formed only explicitly, by the presence of a ZWJ.
2012-07-18 17:22:14 -04:00
Behdad Esfahbod 91cade7555 [Indic/Unicode] Decompose Sinhala split matras the way Uniscribe likes
Makes no visual difference.

Fixes most of the failures.  Down from 15% to 1.3%!
2012-07-18 16:50:41 -04:00
Behdad Esfahbod d8942dcbb4 Apply Tibetan (global) features.
Fixes all Tibetan failures.  All 180k of them!

Merges back Hangul into the default shaper.
2012-07-18 16:34:10 -04:00
Behdad Esfahbod 552d19b7a1 [Indic] Treat Register Shifters like Nukta
Really this time.

Fixes another 18 Khmer tests.
2012-07-18 16:02:33 -04:00
Behdad Esfahbod e8cd81f76d [Indic] Minor 2012-07-18 16:00:20 -04:00
Behdad Esfahbod 69f26bf39c [Indic] Fix Matra reordering when base is at end of syllable
For example: U+915,U+200c,U+93f

Fixes last Tamil failure!
2012-07-18 15:47:51 -04:00
Behdad Esfahbod d16ccc4ae7 Leave one extra item at the end of buffer allocation
Just in case, for the times we do out-of-bounds access.

jk
2012-07-18 15:43:55 -04:00
Behdad Esfahbod 075d671f10 [Indic] Fix out-of-bounds array access 2012-07-18 15:41:53 -04:00
Behdad Esfahbod dcb527242b [Indic] Allow joiners before matras
Fixes 1 more Devanagari test!
2012-07-18 15:32:26 -04:00
Behdad Esfahbod 391cc03317 [Indic] Allow halant group in Vowel and placeholder syllables
Fixes 2 out of 560 Devanagari failures.  AND:
Fixes 1 out of 2 Tamil failures.
2012-07-18 15:12:49 -04:00
Behdad Esfahbod ca4e3d3eab [Indic] Streamline halant/joiner in grammar 2012-07-18 15:05:40 -04:00
Behdad Esfahbod 418d00dffd [Indic] Minor 2012-07-18 14:57:28 -04:00
Behdad Esfahbod 4c3691d2a3 [Indic] Hopefully minor!
Refactoring Indic machin.  No semantic change.
2012-07-18 14:23:55 -04:00
Behdad Esfahbod e092c556fb [Indic] Minor 2012-07-18 14:09:25 -04:00
Behdad Esfahbod 14dbdd9e39 [Indic] Unbreak Tamil
Tamil has only about 150 failures now!
2012-07-18 13:13:03 -04:00
Behdad Esfahbod db8981f1e0 [Indic] Position Khmer Robat
It's a visual Repha.

Still not positioning logical Repha as occurs in Malayalam.

Another 200 Khmer failures fixed.  547 to go.  That's better than
Devanagari!
2012-07-17 23:42:04 -04:00
Behdad Esfahbod 25bc489498 [Indic] Better categorize Register Shifters and Khmer Various signs
Down another 500 or so Khmer failures!
2012-07-17 17:53:03 -04:00
Behdad Esfahbod 39b17837b4 Add hb_buffer_normalize_glyphs() and hb-shape --normalize-glyphs
This reorders glyphs within the cluster to a nominal order.  This should
have no visible effect on the output, but helps with testing, for
getting the same hb-shape output for visually-equal glyphs for each
cluster.
2012-07-17 17:09:29 -04:00
Behdad Esfahbod 25e302da9a [Indic] Minor 2012-07-17 14:25:14 -04:00
Behdad Esfahbod 5d32690a34 [Indic] For scripts without Half forms, always choose first consonant as base
In such scripts (ie. Khmer), a ZWJ/ZWNJ shouldn't stop the search for
base.  So, instead just choose the first consonant as base directly.

Test sequence:
U+1798,200c,U+17C9,U+17D2,U+179B,U+17C1,U+17C7
2012-07-17 14:23:28 -04:00
Behdad Esfahbod 34b5714906 [Indic] Treat Khmer Register Shifters more like Nuktas
Except that there may be a ZWNJ before a Register Shifter.
2012-07-17 14:09:32 -04:00
Behdad Esfahbod 11e2a601b1 [Indic] Minor 2012-07-17 14:02:28 -04:00
Behdad Esfahbod 0201e0a464 [Indic] Apply 'cfar' for Khmer
Mark stuff after a pre-base reordering Ro 'cfar'.  Used in Khmer.
This allows distinguishing the following cases with MS Khmer fonts:

  U+1784,U+17D2,U+179A,U+17D2,U+1782
  U+1784,U+17D2,U+1782,U+17D2,U+179A
2012-07-17 13:56:24 -04:00
Behdad Esfahbod 55f70ebfb9 [Indic] Position final subjoined consonants (and vowels) after matras
In Khmer, a final subjoined consonant or independent vowel can occur
after matras.  This final subjoined thing should NOT be reordered to
before the matra even though it's subjoined.

Fixes another 1k of the Khmer failures.  Not much left really.
2012-07-17 12:50:13 -04:00
Behdad Esfahbod c50ed71e9a [Indic] Recategorize Khmer coeng sign as a separate category OT_Coeng
Amend the syllable structure to allow a final subscripted consonant
(Coeng+C) and a final subscripted independent vowel (Coeng+V).
Fixes another 2k of Khmer failures.
2012-07-17 11:54:28 -04:00
Behdad Esfahbod deb521dee4 [Indic] Add a separate Coeng class
No characters recategorized yet.  No semantic change.
2012-07-17 11:37:32 -04:00
Behdad Esfahbod 74ccc6a132 [Indic] Move Halant with after-base consonants
Normally, we attach the Halant to the previous character and move it
with it.  For after-base consonants however, the Halant "belongs" to the
consonant after, so attach it so.

This fixes Bengali sequences involving post-base consonant Ya, which
should ligate with the Halant to form Ya Phala, but previously a
reordered matras was blocking the ligation.
2012-07-17 11:16:19 -04:00
Behdad Esfahbod d5c4edcdd6 [Indic] Apply presentation-forms features all at once
Seems like this is what Uniscribe is doing, and does not break any fonts
we tested (with Devanagari, Malayalam, Khmer, and Bengali), while fixing
some Ra Phala sequences for Bengali with Vrinda.  Fixes another 2% of
Bengali failures (a couple more to go).
2012-07-17 10:40:59 -04:00
Behdad Esfahbod 559f706678 Fix MarkAttachmentType matching
Fixes issue reported by Khaled Hosny with his Hussaini Nastaleeq font
and sequences like those added in the previous commit.
2012-07-16 22:46:52 -04:00
Behdad Esfahbod ad4494759f Minor 2012-07-16 22:40:21 -04:00
Behdad Esfahbod af92b4cc90 [Indic] Disable 'kern' in Uniscribe bug compatibility mode
Uniscribe does not apply 'kern' in the Indic module.  Some of the Khmer
fonts they ship have small adjustments in the 'kern' table.  Disable
'kern' in the Indic module under Uniscribe bug compatibility mode.

Fixes some 10% of the Khmer failures.  Remains under 3% (excluding
dotted-circle ones).
2012-07-16 20:31:24 -04:00
Behdad Esfahbod d96838ef95 Allow complex shapers overriding common features
In a new callback...  Currently unused by all complex shapers.
2012-07-16 20:26:57 -04:00
Behdad Esfahbod df50b84740 [Indic] Categorize other Khmer marks
Mark them the same as the Register Shifters for now.  Need to rename
that category to something more sensible after all is settled.

Fixes another percent of Khmer failures.  Down to under 3%!
2012-07-16 20:14:50 -04:00
Behdad Esfahbod 8e7b5882fb [Indic] Recognize pre-base reordering Ra anywhere in the syllable
We were doing that only immediately after base.

Fixes another percent in the Khmer failures.  About three more to go...
2012-07-16 17:04:46 -04:00
Behdad Esfahbod 7d09c98a1f [Indic] Recognizer Register Shifter marks
Fixes another 6% of the Khmer failures.
2012-07-16 16:45:22 -04:00
Behdad Esfahbod 60da763dfa [GSUB/GDEF] Guess glyph classes after substitution only if no GDEF
Brings down Khmer failures with Daun Penh font from 36% to 20%.
2012-07-16 16:14:40 -04:00
Behdad Esfahbod fcdc5f1c88 [Indic] Categorize Khmer Ro
Khmer failures down from 58% to 36%.
2012-07-16 15:52:54 -04:00
Behdad Esfahbod 78818124b1 [Indic] Reoder pre-base reordering Ra
Brings down Malayalam failures from 14% down to 3%.
2012-07-16 15:49:08 -04:00
Behdad Esfahbod 1a1dbe9a27 [Indic] Rename 2012-07-16 15:41:33 -04:00
Behdad Esfahbod 46e645ec4b [Indic] Start implementing pre-base reordering 2012-07-16 15:30:05 -04:00
Behdad Esfahbod 921ce5b17d [Indic] Rename
No semantic change.
2012-07-16 15:26:56 -04:00
Behdad Esfahbod b504e060f0 [Indic] Implement After-Main Reph positioning
Almost...
2012-07-16 15:21:12 -04:00
Behdad Esfahbod 17d7de91d7 [Indic] Apply 'pref' to pre-base reodering Ra
No reordering yet.
2012-07-16 15:20:15 -04:00
Behdad Esfahbod 362d3db8d3 [Indic] Minor
Should not be any semantic change.  In preparation for implementing
pre-base reordering Ra.
2012-07-16 15:15:28 -04:00
Behdad Esfahbod 70fe77bb9a Minor 2012-07-16 14:52:18 -04:00
Behdad Esfahbod 2f903215c5 Minor 2012-07-16 13:54:43 -04:00
Behdad Esfahbod a3e04bee2c [Indic] Reorder virama only for old Indic spec 2012-07-16 13:47:19 -04:00
Behdad Esfahbod 0de771b72d [Indic] Categorize Khmer consonants 2012-07-16 13:39:36 -04:00
Behdad Esfahbod d487fff266 Split matras without a Unicode decomposition
This is a hack for now, to get us going with Khmer.  This will be
refactored properly later to move the complex logic into complex
shapers.
2012-07-16 13:25:57 -04:00
Behdad Esfahbod 8aa801a6fd [Indic] Adjust position for split matras
We are going to split matras without a Unicode decompositions in a way
that the second half takes the codepoint of the whole matra.  So,
position them where the second half is supposed to end up.
2012-07-16 13:24:26 -04:00
Behdad Esfahbod 1feb8345a5 [GSUB] Allow 1-to-1 ligature substitutions!
Apparently Uniscribe allows these, and they are used in some Khmer fonts
shipped with Windows, namely, Daun Penh.
2012-07-16 13:23:40 -04:00
Behdad Esfahbod 29f106d7fb [Indic] Apply Above Forms 2012-07-16 12:05:35 -04:00
Behdad Esfahbod fa2bd9fb63 Further simplify atomic ops on Visual Studio 2012-07-14 12:15:54 -04:00
Behdad Esfahbod 0a49235701 Minor 2012-07-13 13:20:49 -04:00
Behdad Esfahbod 11c4ad439e Add -Wcast-align 2012-07-13 11:29:31 -04:00
Behdad Esfahbod a98d0ab186 Make sure HB_BEGIN_DECLS / HB_END_DECLS is only used in public headers
So we can use them to switch default visibility to internal if desired,
and use these to make only declared symbols public.
2012-07-13 10:19:10 -04:00
Behdad Esfahbod 5c5bc96216 Allow overriding HB_BEGIN_DECLS / HB_END_DECLS 2012-07-13 10:15:37 -04:00
Behdad Esfahbod 50a4e78b53 Check for exported weak symbols
Ouch, all our C++ inline functions are being exported (weakly) already.
Fix coming.
2012-07-13 09:48:39 -04:00
Behdad Esfahbod b5aeb95afe Make hb_in_range() static 2012-07-13 09:45:54 -04:00
Behdad Esfahbod 271c8f8907 Minor 2012-07-13 09:32:30 -04:00
Behdad Esfahbod 391f1ff5d8 Fix _InterlockedCompareExchangePointer on x86 2012-07-13 09:04:07 -04:00
Behdad Esfahbod 2023e2b54d [ft] Disable ppem setting
The calculations were wrong.

FreeType makes it really hard to set size and ppem independently.
For now, disable it.  Need to come up with a fix later.
2012-07-11 19:01:26 -04:00
Behdad Esfahbod cdf7444505 [ft] Use unfitted kerning if x_ppem is zero 2012-07-11 18:52:39 -04:00
Behdad Esfahbod 6d08c7f1b3 Revert "Towards templatizing common Lookup types"
This reverts commit 727135f3a9.

This is work-in-progress.  Didn't mean to push it out just yet.
2012-07-11 18:01:27 -04:00
Behdad Esfahbod 552bf3a9f9 Bump WINNT version requested from 500 to 600
Since we use the OpenType versions of Uniscribe functions, we are
relying on that version of the WINNT API.  Otherwise, usp10.h will hide
those symbols.
2012-07-11 18:00:28 -04:00
Behdad Esfahbod 9a5b421a64 Fix build with no Unicode funcs implementations provided 2012-07-11 18:00:28 -04:00
Behdad Esfahbod 727135f3a9 Towards templatizing common Lookup types 2012-07-11 18:00:28 -04:00
Behdad Esfahbod 12f5c0a222 Fix check for Intel atomic ops 2012-06-26 11:16:13 -04:00
Behdad Esfahbod 6932a41fb6 Use octal-escaped UTF-8 characters instead of plain text
https://bugs.freedesktop.org/show_bug.cgi?id=50970
2012-06-26 10:46:31 -04:00
Behdad Esfahbod 8c0ea7bcb4 Disable introspection again
Until I figure out the build issues.  Sigh...
2012-06-24 13:20:56 -04:00
Behdad Esfahbod 49f8e0cd9a GStaticMutex is deprecated 2012-06-16 15:40:03 -04:00
Behdad Esfahbod 1bc1cb3603 Make source more digestable for gobject-introspection 2012-06-16 15:21:55 -04:00
Behdad Esfahbod 84d781e54c Flesh out gobject-introspection stuff a bit 2012-06-16 15:21:41 -04:00