Commit Graph

2137 Commits

Author SHA1 Message Date
Behdad Esfahbod 42848453bf [Thai] Reorder U+0E3A THAI VOWEL SIGN PHINTHU
Uniscribe reorders U+0E3A to be after U+0E38 and U+0E39.  We do that by
modifying the ccc for U+0E3A.

Fixes the two remaining Thai failures (see previous commit).
2012-07-23 13:52:07 -04:00
Behdad Esfahbod 4a7f4f3e56 [Thai] Adjust SARA AM reordering to match Uniscribe
Adjust the list of marks before SARA AM that get the reordering
treatment.  Also adjust cluster formation to match Uniscribe.

With Wikipedia test data, now I see:

  - For Thai, with the Angsana New font from Win7, I see 54 failures out
    of over 4M tests  (0.00129107%).  Of the 54, two are legitimate
    reordering issues (fix coming soon), and the other 52 are simply
    Uniscribe using a zero-width space char instead of an unknown
    character for missing glyphs.  No idea why.  The missing-glyph
    sequences include one that is a Thai character followed by an Arabic
    Sokun.  Someone confused it with Nikhahit I assume!

  - For Lao, with the Dokchampa font from Win7, 33 tests fail out of
    54k (0.0615167%).  All seem to be insignificant mark positioning
    with two marks on a base.  Have to investigate.
2012-07-23 13:15:33 -04:00
Behdad Esfahbod 2cc933aff9 [Indic] Fix cluster formation with left-matras and conjunct forms
Test case was: <U+0D15,U+0D4D,U+0D15,U+0D4A>.
2012-07-23 08:23:44 -04:00
Behdad Esfahbod e6b01a878c [Indic] Further streamline cluster formation
This should address all possible cluster misformations that I had in
mind.
2012-07-23 00:11:26 -04:00
Behdad Esfahbod 7b2a7dadd6 [Indic] Merge clusters before sorting
This should fix any instabilities in cluster formation that we were
speculating may happen with surrounding syllables.  Or most of it
perhaps.
2012-07-22 23:58:55 -04:00
Behdad Esfahbod abb3239ef9 [Indic] Update clusters for left-matra even if matra didn't move
Fixes crashes reported with left matra under
non-Uniscribe-bug-compatibility mode.
2012-07-22 23:55:19 -04:00
Behdad Esfahbod 92a1ad7bef [Indic] Stop searching for base if a post form is found before below form
Improves Bengali and Gurmukhi.  Malayalam regressed a bit.  We will deal
with that later.
2012-07-20 18:55:15 -04:00
Behdad Esfahbod 4c450c703f [Indic] Recompose Bengali Ya,Nukta
This is a bunch of hacks for now.

Improves Bengali a bit.
2012-07-20 18:13:04 -04:00
Behdad Esfahbod e9c0f152a3 [Uniscribe] Fix script fallback
Gurmukhi failures halved now.  Others changed slightly.
2012-07-20 17:37:48 -04:00
Behdad Esfahbod 5791f32915 [Indic] Allow a ZWNJ after SM's
Malayalam failures go way down.  Other scripts benefitted slightly too.
Sinhala had one or two test regressions, but...
2012-07-20 16:26:55 -04:00
Behdad Esfahbod 34ae336f3f [Indic] Improve Reph AfterMain positioning
Fixes 20 out of 48 failing Oriya tests.  Failure rate down to 0.066% now.
2012-07-20 16:17:28 -04:00
Behdad Esfahbod bdd080431a [Indic] Reposition Oriya Candrabindu
Oriya failures down from 0.65% to 0.20%.
2012-07-20 16:03:09 -04:00
Behdad Esfahbod 5f0eaaad12 [Indic] Fix base search in final_reordering
Fixes most Malayalam failures.  Down from 1.6% to 0.38% now.  Fixes a
few more in other scripts too.
2012-07-20 15:47:24 -04:00
Behdad Esfahbod 81202bd860 [Indic] Don't attach SM/VD to other characters 2012-07-20 15:14:51 -04:00
Behdad Esfahbod efb4ad7356 Fix compiler warnings
If x is not constant, we cannot ASSERT_STATIC on it.
2012-07-20 14:27:38 -04:00
Behdad Esfahbod f31d97e44e [Indic] Form Telugu Reph out of Ra,Virama,ZWJ
Apparently this was approved in Feb 2012.  No font yet.
2012-07-20 14:13:35 -04:00
Behdad Esfahbod 2e193b240e [Indic] Don't split U+0AC9
Although IndicMatraCategory.txt classifies it as a Top_And_Right matra,
it does not have Unicode decomposition, and Uniscribe does not do
anything special about it either.

Gujarati failures down from 0.672% to 0.0130966%.
2012-07-20 14:02:35 -04:00
Behdad Esfahbod 30c3d5e9fc [Indic] Simplify Uniscribe cluster emulation
Now that we break syllables on Halant,ZWNJ, this code can be simplified.
2012-07-20 13:56:32 -04:00
Behdad Esfahbod decf6ffca4 [Indic] Minor! 2012-07-20 13:51:31 -04:00
Behdad Esfahbod 9e4f94a72c [Indic] Break syllables at Halant,ZWNJ
That's really what Uniscribe does, and explains a lot of peculiarities of
Halant,ZWNJ before the base.

Sent Telugu from 1% failures to 0.03%.  Improved Kannada and Malayalam
slightly.  Fixed half of Bengali, and did NOT break anything!
2012-07-20 13:48:03 -04:00
Behdad Esfahbod 2c372b80f6 [Indic] Better check for applying 'init'
Specifically, don't apply 'init' if previous char is a joiner.

Fixes some more of Bengali.
2012-07-20 13:37:48 -04:00
Behdad Esfahbod 34a7440b7c [GPOS] Don't zero mark advances
Fixes more of Telugu, Kannada, and Oriya.

May break things (outside Indic...), but we cannot think of any font relying
on this immediately.
2012-07-20 12:40:39 -04:00
Behdad Esfahbod 8ed248de77 [Indic] Minor 2012-07-20 11:42:24 -04:00
Behdad Esfahbod d0e68dbd0b [Indic] Implement reph positioning step 5
Not tuned, just copied from step 2.  Fixes another 0.5% of Kannada
failures.  1% to go.
2012-07-20 11:25:41 -04:00
Behdad Esfahbod a9e45c32e4 [Indic] Don't let ZWNJ at the end of syllable affect base search
Fixes a few Devanagari, half of remaining Kannada failures, quarter for
Telugu, and others slightly improved or unchanged.
2012-07-20 11:04:15 -04:00
Behdad Esfahbod 20b68e699f [Indic] Apply 'cjct' globally
Fixes 5 Devanagari failures, and no regressions.
2012-07-20 10:47:46 -04:00
Behdad Esfahbod 51e764de44 [Indic] Unbreak old scriptures
Brings down failures with Lohit-Telugu from 57% to 1.40%.
2012-07-20 10:30:24 -04:00
Behdad Esfahbod 900cf3d449 Minor 2012-07-20 10:18:23 -04:00
Behdad Esfahbod 87cd63266e [Indic] Recategorize some Kannada right matras
Kannada failures down from 3.5% to 2.93%.
2012-07-19 21:25:46 -04:00
Behdad Esfahbod 3604d64ced [Indic] Recategorize GURMUKHI ADDAK
It's not in IndicSyllabicCategory.txt.  Fixes most of Gurmukhi failures.
Failures down from 7.7% to 0.222%!
2012-07-19 21:13:04 -04:00
Behdad Esfahbod 8932858123 Minor 2012-07-19 21:02:38 -04:00
Behdad Esfahbod 47ef931f13 [buffer] Make sure out_info = info during GPOS 2012-07-19 20:52:44 -04:00
Behdad Esfahbod ae63cf2062 Print line number during return when tracing 2012-07-19 20:45:41 -04:00
Behdad Esfahbod 5249f3aee1 [Indic] Unbreak Khmer
For Khmer, all consonants are subjoining.  No need to look in the font.
We were looking in the wrong order anyway.
2012-07-19 20:30:22 -04:00
Behdad Esfahbod e0475345d5 [Indic] Apply 'akhn' globally
Fixes 1.5% more failures for Telugu, 2% for Kannada.
Breaks one test in Devanagari.
2012-07-19 20:24:14 -04:00
Behdad Esfahbod fa247ebe52 [Indic] Better position U+0CD5
Fixes another 5% of Kannada failures.
2012-07-19 19:52:19 -04:00
Behdad Esfahbod f055442716 [Indic] Lookup consonant position in the font
Fixes most failures of Oriya, and improves others a bit.
2012-07-19 16:20:21 -04:00
Behdad Esfahbod 74d1d88781 [GSUB] Fix would_apply() for LigatureSubst 2012-07-19 16:14:23 -04:00
Behdad Esfahbod be73a5f936 Add src/test-would-substitute tool 2012-07-19 15:12:18 -04:00
Behdad Esfahbod e72b360ac6 Refactor / finish would_apply() operation
Untested.
2012-07-19 14:44:46 -04:00
Behdad Esfahbod 8c973ebf0f [Indic] Implement per-script matra positioning
Following what the spec says.

Brings down Telugu failures from 40% to 3.75%, and Kannada failures from
44% to 10%.  Does NOT affect other scripts' test results.
2012-07-19 13:25:08 -04:00
Behdad Esfahbod 8bb32458f9 [Indic] More refactoring 2012-07-19 13:04:44 -04:00
Behdad Esfahbod 9ccc6382ba [Indic] Minor refactoring 2012-07-19 12:45:31 -04:00
Behdad Esfahbod f83aaa3133 [Indic] Minor 2012-07-19 12:23:23 -04:00
Behdad Esfahbod be8b9f5f71 [Indic] Start refactoring different matra positions per script 2012-07-19 12:11:12 -04:00
Behdad Esfahbod b01d9b3d90 [Indic] Disallow decomposition of a couple characters
This is a hack for now.  Will be fixed when we do complex-shaper-driven
normalization properly.

The results with or without decomposition are the same, but Uniscribe
does not normalize, so this matches better.
2012-07-19 11:25:49 -04:00
Behdad Esfahbod 422ecd2d3c [Indic] Accept a forced Rakar sequence at the end of syllable
In Sinhala, Rakar is formed by Al-Lakuna,ZWJ,Ra.  If you put that at the
end of a Consonant,Matra syllable, you get a dotted-circle from
Uniscribe.  Apparently adding a ZWJ before the Al-Lakuna "fixes" that.
And people have been encoding that sequence...  So, allow a forced
"ZWJ,Virama,ZWJ,Ra" sequence at the end of syllables.

Fixes some 100 or more of Sinhala failures.  Now at 622 only (0.23%).
2012-07-18 23:25:58 -04:00
Behdad Esfahbod 6fc1732003 [Indic] Allow joiners on both sides of Halant at the same time
The sequence <ZWJ,Al-Lakuna,ZWJ> is used in Sinhala to explicitly ask
for Rakar.  Fixes two-thousand Sinhala tests.  Not many left.
2012-07-18 17:49:19 -04:00
Behdad Esfahbod 10cdc94eee [Indic] In final reordering, find base, even if it disappeared
POS_BASE can disappear if base ligated backward.  Define base as last
with position not after base.

Fixes a few hundred of Sinhala failures with Iskoola Pota.
2012-07-18 17:43:23 -04:00
Behdad Esfahbod 9c4d24a3a6 [Indic] Minor 2012-07-18 17:29:10 -04:00
Behdad Esfahbod 3285e107c9 [Indic] Implement Sinhala "Al Lakuna" Reph behavior
In Sinhala, Reph is formed only explicitly, by the presence of a ZWJ.
2012-07-18 17:22:14 -04:00
Behdad Esfahbod 91cade7555 [Indic/Unicode] Decompose Sinhala split matras the way Uniscribe likes
Makes no visual difference.

Fixes most of the failures.  Down from 15% to 1.3%!
2012-07-18 16:50:41 -04:00
Behdad Esfahbod d8942dcbb4 Apply Tibetan (global) features.
Fixes all Tibetan failures.  All 180k of them!

Merges back Hangul into the default shaper.
2012-07-18 16:34:10 -04:00
Behdad Esfahbod 552d19b7a1 [Indic] Treat Register Shifters like Nukta
Really this time.

Fixes another 18 Khmer tests.
2012-07-18 16:02:33 -04:00
Behdad Esfahbod e8cd81f76d [Indic] Minor 2012-07-18 16:00:20 -04:00
Behdad Esfahbod 69f26bf39c [Indic] Fix Matra reordering when base is at end of syllable
For example: U+0915,U+200C,U+093F

Fixes last Tamil failure!
2012-07-18 15:47:51 -04:00
Behdad Esfahbod d16ccc4ae7 Leave one extra item at the end of buffer allocation
Just in case, for the times we do out-of-bounds access.

jk
2012-07-18 15:43:55 -04:00
Behdad Esfahbod 075d671f10 [Indic] Fix out-of-bounds array access 2012-07-18 15:41:53 -04:00
Behdad Esfahbod dcb527242b [Indic] Allow joiners before matras
Fixes 1 more Devanagari test!
2012-07-18 15:32:26 -04:00
Behdad Esfahbod 391cc03317 [Indic] Allow halant group in Vowel and placeholder syllables
Fixes 2 out of 560 Devanagari failures.  AND:
Fixes 1 out of 2 Tamil failures.
2012-07-18 15:12:49 -04:00
Behdad Esfahbod ca4e3d3eab [Indic] Streamline halant/joiner in grammar 2012-07-18 15:05:40 -04:00
Behdad Esfahbod 418d00dffd [Indic] Minor 2012-07-18 14:57:28 -04:00
Behdad Esfahbod 4c3691d2a3 [Indic] Hopefully minor!
Refactoring Indic machine.  No semantic change.
2012-07-18 14:23:55 -04:00
Behdad Esfahbod e092c556fb [Indic] Minor 2012-07-18 14:09:25 -04:00
Behdad Esfahbod 14dbdd9e39 [Indic] Unbreak Tamil
Tamil has only about 150 failures now!
2012-07-18 13:13:03 -04:00
Behdad Esfahbod db8981f1e0 [Indic] Position Khmer Robat
It's a visual Repha.

Still not positioning logical Repha as occurs in Malayalam.

Another 200 Khmer failures fixed.  547 to go.  That's better than
Devanagari!
2012-07-17 23:42:04 -04:00
Behdad Esfahbod 25bc489498 [Indic] Better categorize Register Shifters and Khmer Various signs
Down another 500 or so Khmer failures!
2012-07-17 17:53:03 -04:00
Behdad Esfahbod 39b17837b4 Add hb_buffer_normalize_glyphs() and hb-shape --normalize-glyphs
This reorders glyphs within the cluster to a nominal order.  This should
have no visible effect on the output, but helps with testing, for
getting the same hb-shape output for visually-equal glyphs for each
cluster.
2012-07-17 17:09:29 -04:00
Behdad Esfahbod 25e302da9a [Indic] Minor 2012-07-17 14:25:14 -04:00
Behdad Esfahbod 5d32690a34 [Indic] For scripts without Half forms, always choose first consonant as base
In such scripts (ie. Khmer), a ZWJ/ZWNJ shouldn't stop the search for
base.  So, instead just choose the first consonant as base directly.

Test sequence:
U+1798,U+200C,U+17C9,U+17D2,U+179B,U+17C1,U+17C7
2012-07-17 14:23:28 -04:00
Behdad Esfahbod 34b5714906 [Indic] Treat Khmer Register Shifters more like Nuktas
Except that there may be a ZWNJ before a Register Shifter.
2012-07-17 14:09:32 -04:00
Behdad Esfahbod 11e2a601b1 [Indic] Minor 2012-07-17 14:02:28 -04:00
Behdad Esfahbod 0201e0a464 [Indic] Apply 'cfar' for Khmer
Mark stuff after a pre-base reordering Ro 'cfar'.  Used in Khmer.
This allows distinguishing the following cases with MS Khmer fonts:

  U+1784,U+17D2,U+179A,U+17D2,U+1782
  U+1784,U+17D2,U+1782,U+17D2,U+179A
2012-07-17 13:56:24 -04:00
Behdad Esfahbod 55f70ebfb9 [Indic] Position final subjoined consonants (and vowels) after matras
In Khmer, a final subjoined consonant or independent vowel can occur
after matras.  This final subjoined thing should NOT be reordered to
before the matra even though it's subjoined.

Fixes another 1k of the Khmer failures.  Not much left really.
2012-07-17 12:50:13 -04:00
Behdad Esfahbod c50ed71e9a [Indic] Recategorize Khmer coeng sign as a separate category OT_Coeng
Amend the syllable structure to allow a final subscripted consonant
(Coeng+C) and a final subscripted independent vowel (Coeng+V).
Fixes another 2k of Khmer failures.
2012-07-17 11:54:28 -04:00
Behdad Esfahbod deb521dee4 [Indic] Add a separate Coeng class
No characters recategorized yet.  No semantic change.
2012-07-17 11:37:32 -04:00
Behdad Esfahbod 74ccc6a132 [Indic] Move Halant with after-base consonants
Normally, we attach the Halant to the previous character and move it
with it.  For after-base consonants however, the Halant "belongs" to the
consonant after, so attach it so.

This fixes Bengali sequences involving post-base consonant Ya, which
should ligate with the Halant to form Ya Phala, but previously a
reordered matra was blocking the ligation.
2012-07-17 11:16:19 -04:00
Behdad Esfahbod d5c4edcdd6 [Indic] Apply presentation-forms features all at once
Seems like this is what Uniscribe is doing, and does not break any fonts
we tested (with Devanagari, Malayalam, Khmer, and Bengali), while fixing
some Ra Phala sequences for Bengali with Vrinda.  Fixes another 2% of
Bengali failures (a couple more to go).
2012-07-17 10:40:59 -04:00
Behdad Esfahbod 559f706678 Fix MarkAttachmentType matching
Fixes issue reported by Khaled Hosny with his Hussaini Nastaleeq font
and sequences like those added in the previous commit.
2012-07-16 22:46:52 -04:00
Behdad Esfahbod ad4494759f Minor 2012-07-16 22:40:21 -04:00
Behdad Esfahbod af92b4cc90 [Indic] Disable 'kern' in Uniscribe bug compatibility mode
Uniscribe does not apply 'kern' in the Indic module.  Some of the Khmer
fonts they ship have small adjustments in the 'kern' table.  Disable
'kern' in the Indic module under Uniscribe bug compatibility mode.

Fixes some 10% of the Khmer failures.  Remains under 3% (excluding
dotted-circle ones).
2012-07-16 20:31:24 -04:00
Behdad Esfahbod d96838ef95 Allow complex shapers overriding common features
In a new callback...  Currently unused by all complex shapers.
2012-07-16 20:26:57 -04:00
Behdad Esfahbod df50b84740 [Indic] Categorize other Khmer marks
Mark them the same as the Register Shifters for now.  Need to rename
that category to something more sensible after all is settled.

Fixes another percent of Khmer failures.  Down to under 3%!
2012-07-16 20:14:50 -04:00
Behdad Esfahbod 8e7b5882fb [Indic] Recognize pre-base reordering Ra anywhere in the syllable
We were doing that only immediately after base.

Fixes another percent in the Khmer failures.  About three more to go...
2012-07-16 17:04:46 -04:00
Behdad Esfahbod 7d09c98a1f [Indic] Recognize Register Shifter marks
Fixes another 6% of the Khmer failures.
2012-07-16 16:45:22 -04:00
Behdad Esfahbod 60da763dfa [GSUB/GDEF] Guess glyph classes after substitution only if no GDEF
Brings down Khmer failures with Daun Penh font from 36% to 20%.
2012-07-16 16:14:40 -04:00
Behdad Esfahbod fcdc5f1c88 [Indic] Categorize Khmer Ro
Khmer failures down from 58% to 36%.
2012-07-16 15:52:54 -04:00
Behdad Esfahbod 78818124b1 [Indic] Reorder pre-base reordering Ra
Brings Malayalam failures down from 14% to 3%.
2012-07-16 15:49:08 -04:00
Behdad Esfahbod 1a1dbe9a27 [Indic] Rename 2012-07-16 15:41:33 -04:00
Behdad Esfahbod 46e645ec4b [Indic] Start implementing pre-base reordering 2012-07-16 15:30:05 -04:00
Behdad Esfahbod 921ce5b17d [Indic] Rename
No semantic change.
2012-07-16 15:26:56 -04:00
Behdad Esfahbod b504e060f0 [Indic] Implement After-Main Reph positioning
Almost...
2012-07-16 15:21:12 -04:00
Behdad Esfahbod 17d7de91d7 [Indic] Apply 'pref' to pre-base reordering Ra
No reordering yet.
2012-07-16 15:20:15 -04:00
Behdad Esfahbod 362d3db8d3 [Indic] Minor
Should not be any semantic change.  In preparation for implementing
pre-base reordering Ra.
2012-07-16 15:15:28 -04:00
Behdad Esfahbod 70fe77bb9a Minor 2012-07-16 14:52:18 -04:00
Behdad Esfahbod 2f903215c5 Minor 2012-07-16 13:54:43 -04:00
Behdad Esfahbod a3e04bee2c [Indic] Reorder virama only for old Indic spec 2012-07-16 13:47:19 -04:00
Behdad Esfahbod 0de771b72d [Indic] Categorize Khmer consonants 2012-07-16 13:39:36 -04:00
Behdad Esfahbod d487fff266 Split matras without a Unicode decomposition
This is a hack for now, to get us going with Khmer.  This will be
refactored properly later to move the complex logic into complex
shapers.
2012-07-16 13:25:57 -04:00
Behdad Esfahbod 8aa801a6fd [Indic] Adjust position for split matras
We are going to split matras without a Unicode decomposition in a way
that the second half takes the codepoint of the whole matra.  So,
position them where the second half is supposed to end up.
2012-07-16 13:24:26 -04:00
Behdad Esfahbod 1feb8345a5 [GSUB] Allow 1-to-1 ligature substitutions!
Apparently Uniscribe allows these, and they are used in some Khmer fonts
shipped with Windows, namely, Daun Penh.
2012-07-16 13:23:40 -04:00
Behdad Esfahbod 29f106d7fb [Indic] Apply Above Forms 2012-07-16 12:05:35 -04:00
Behdad Esfahbod fa2bd9fb63 Further simplify atomic ops on Visual Studio 2012-07-14 12:15:54 -04:00
Behdad Esfahbod 0a49235701 Minor 2012-07-13 13:20:49 -04:00
Behdad Esfahbod 11c4ad439e Add -Wcast-align 2012-07-13 11:29:31 -04:00
Behdad Esfahbod a98d0ab186 Make sure HB_BEGIN_DECLS / HB_END_DECLS is only used in public headers
So we can use them to switch default visibility to internal if desired,
and use these to make only declared symbols public.
2012-07-13 10:19:10 -04:00
Behdad Esfahbod 5c5bc96216 Allow overriding HB_BEGIN_DECLS / HB_END_DECLS 2012-07-13 10:15:37 -04:00
Behdad Esfahbod 50a4e78b53 Check for exported weak symbols
Ouch, all our C++ inline functions are being exported (weakly) already.
Fix coming.
2012-07-13 09:48:39 -04:00
Behdad Esfahbod b5aeb95afe Make hb_in_range() static 2012-07-13 09:45:54 -04:00
Behdad Esfahbod 271c8f8907 Minor 2012-07-13 09:32:30 -04:00
Behdad Esfahbod 391f1ff5d8 Fix _InterlockedCompareExchangePointer on x86 2012-07-13 09:04:07 -04:00
Behdad Esfahbod 2023e2b54d [ft] Disable ppem setting
The calculations were wrong.

FreeType makes it really hard to set size and ppem independently.
For now, disable it.  Need to come up with a fix later.
2012-07-11 19:01:26 -04:00
Behdad Esfahbod cdf7444505 [ft] Use unfitted kerning if x_ppem is zero 2012-07-11 18:52:39 -04:00
Behdad Esfahbod 6d08c7f1b3 Revert "Towards templatizing common Lookup types"
This reverts commit 727135f3a9.

This is work-in-progress.  Didn't mean to push it out just yet.
2012-07-11 18:01:27 -04:00
Behdad Esfahbod 552bf3a9f9 Bump WINNT version requested from 500 to 600
Since we use the OpenType versions of Uniscribe functions, we are
relying on that version of the WINNT API.  Otherwise, usp10.h will hide
those symbols.
2012-07-11 18:00:28 -04:00
Behdad Esfahbod 9a5b421a64 Fix build with no Unicode funcs implementations provided 2012-07-11 18:00:28 -04:00
Behdad Esfahbod 727135f3a9 Towards templatizing common Lookup types 2012-07-11 18:00:28 -04:00
Behdad Esfahbod 12f5c0a222 Fix check for Intel atomic ops 2012-06-26 11:16:13 -04:00
Behdad Esfahbod 6932a41fb6 Use octal-escaped UTF-8 characters instead of plain text
https://bugs.freedesktop.org/show_bug.cgi?id=50970
2012-06-26 10:46:31 -04:00
Behdad Esfahbod 8c0ea7bcb4 Disable introspection again
Until I figure out the build issues.  Sigh...
2012-06-24 13:20:56 -04:00
Behdad Esfahbod 49f8e0cd9a GStaticMutex is deprecated 2012-06-16 15:40:03 -04:00
Behdad Esfahbod 1bc1cb3603 Make source more digestible for gobject-introspection 2012-06-16 15:21:55 -04:00
Behdad Esfahbod 84d781e54c Flesh out gobject-introspection stuff a bit 2012-06-16 15:21:41 -04:00
Behdad Esfahbod 2cf301968c Add hb_object_lock/unlock() 2012-06-09 14:58:01 -04:00
Behdad Esfahbod f211d5c291 More Oops! Fix fast-path with sub-type==0 2012-06-09 03:11:22 -04:00
Behdad Esfahbod b1de6aa1f3 Oops! 2012-06-09 03:07:59 -04:00
Behdad Esfahbod b12e2549cb Minor 2012-06-09 03:05:20 -04:00
Behdad Esfahbod faf0f20253 Add sanitize() logic for fast-paths 2012-06-09 03:02:36 -04:00
Behdad Esfahbod 4e766ff28d Add fast-path for GPOS too
Shaves another 3% for DejaVu Sans long Latin strings.
2012-06-09 02:53:57 -04:00
Behdad Esfahbod 993c51915f Add fast-path to GSUB to check coverage
Shaves a good 10% off DejaVu Sans with simple Latin text for me.
Now, DejaVu is very ChainContext-intensive, but it's also a very
popular font!
2012-06-09 02:48:16 -04:00
Behdad Esfahbod f19e0b0099 Match input before backtrack
Makes more sense, optimization-wise.
2012-06-09 02:26:57 -04:00
Behdad Esfahbod 67bb9e8cea Add set add_coverage() to Coverage() 2012-06-09 02:02:46 -04:00
Behdad Esfahbod 4952f0aa5b Minor 2012-06-09 01:43:20 -04:00
Behdad Esfahbod ad6a6f2240 Minor 2012-06-09 01:43:20 -04:00
Behdad Esfahbod 46617a4213 Fix cache implementation 2012-06-09 01:43:20 -04:00
Behdad Esfahbod ce47613889 Micro-optimize
I know...
2012-06-09 01:43:15 -04:00
Behdad Esfahbod 70416de298 Minor 2012-06-09 00:56:41 -04:00
Behdad Esfahbod 99159e52a3 Use linear search for small counts
I see about 8% speedup with long strings with DejaVu Sans.
2012-06-09 00:50:40 -04:00
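The idea behind the "linear search for small counts" commit can be sketched as follows. This is a minimal illustration, not the project's code: the function name `find_sorted`, the `uint16_t` element type, and the threshold of 8 are all assumptions, since the commit message does not state them.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical cut-over point; the commit does not say which value it uses.
constexpr std::size_t kLinearSearchMax = 8;

// Returns the index of `key` in the sorted array `items`, or -1 if absent.
inline int find_sorted(const std::uint16_t *items, std::size_t count,
                       std::uint16_t key)
{
  if (count <= kLinearSearchMax) {
    // Short arrays: a plain scan avoids the branchy binary-search loop.
    for (std::size_t i = 0; i < count; i++) {
      if (items[i] == key) return static_cast<int>(i);
      if (items[i] > key) break;  // sorted, so key cannot appear later
    }
    return -1;
  }
  // Longer arrays: standard lower-bound binary search.
  std::size_t lo = 0, hi = count;
  while (lo < hi) {
    std::size_t mid = lo + (hi - lo) / 2;
    if (items[mid] < key) lo = mid + 1;
    else hi = mid;
  }
  return (lo < count && items[lo] == key) ? static_cast<int>(lo) : -1;
}
```

Whether the scan wins depends on the data; the commit reports its speedup for long strings with DejaVu Sans only.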
Behdad Esfahbod caf0412690 Minor 2012-06-09 00:26:32 -04:00
Behdad Esfahbod 0f8fea71a6 Minor. Hide _hb_ot_layout_get_glyph_property() 2012-06-09 00:24:38 -04:00
Behdad Esfahbod 44b8ee0c90 Minor 2012-06-09 00:23:24 -04:00
Behdad Esfahbod 7b84c536c1 In MarkBase attachment, only attach to first of a MultipleSubst sequence
This is apparently what Uniscribe does.  Test case is:

  SEEN FATHA TEH ALEF

with Arabic Typesetting.  Originally reported by Khaled Hosny.
2012-06-08 22:04:23 -04:00
Behdad Esfahbod ec57e0c565 Set lig_comp for MultipleSubst components
To be used for correct mark attachment to first component of a
MultipleSubst output.  That's what Uniscribe does.
2012-06-08 21:47:23 -04:00
Behdad Esfahbod e085fcf7ca Remove unused buffer->replace_glyphs_be16 2012-06-08 21:45:00 -04:00
Behdad Esfahbod 3ec77d6ae0 Don't use replace_glyphs_be for MultipleSubst 2012-06-08 21:44:06 -04:00
Behdad Esfahbod 4b7192125f Minor 2012-06-08 21:41:46 -04:00
Behdad Esfahbod 4508789f4b Add test for static initializers and other C++ stuff 2012-06-08 21:32:43 -04:00
Behdad Esfahbod 56bd259b9a Minor 2012-06-08 21:29:18 -04:00
Behdad Esfahbod bc8357ea7b Merge clusters during normalization 2012-06-08 21:01:20 -04:00
Behdad Esfahbod fe3dabc08d Minor 2012-06-08 20:56:05 -04:00
Behdad Esfahbod e88e14421a Use merge_clusters instead of open-coding 2012-06-08 20:55:21 -04:00
Behdad Esfahbod 330a2af3ff Use merge_clusters when forming Unicode clusters 2012-06-08 20:40:02 -04:00
Behdad Esfahbod bd300df9ad Minor 2012-06-08 20:36:37 -04:00
Behdad Esfahbod e51d2b6ed1 Extend into main buffer if extension hit end of out-buffer merging clusters 2012-06-08 20:36:33 -04:00
Behdad Esfahbod 5ced012d9f Extend end when merging clusters in out-buffer 2012-06-08 20:31:32 -04:00
Behdad Esfahbod 72c0a18783 Extend clusters backward in out-buffer 2012-06-08 20:30:03 -04:00
Behdad Esfahbod cd5891493d Extend clusters backwards, into the out-buffer too 2012-06-08 20:28:59 -04:00
Behdad Esfahbod 77471e0371 Clear output buffer before calling GSUB pause functions 2012-06-08 20:21:02 -04:00
Behdad Esfahbod cafa6f3727 When merging clusters, extend the end 2012-06-08 20:17:10 -04:00
Behdad Esfahbod 28ce5fa454 Merge clusters when ligating 2012-06-08 20:17:06 -04:00
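The cluster-merging commits in this stretch of the log describe giving a range of items one shared cluster value and extending the range over neighbours that already share a cluster with its ends. A rough sketch on a single array, assuming a hypothetical `GlyphInfo` struct and ignoring the separate out-buffer that several of these commits deal with:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct GlyphInfo {      // hypothetical stand-in for a buffer item
  unsigned codepoint;
  unsigned cluster;
};

// Give every item in [start, end) the smallest cluster value in the range.
// Before assigning, the range is widened over adjacent items that share a
// cluster with its last or first item, so no cluster ends up split.
inline void merge_clusters(std::vector<GlyphInfo> &info,
                           std::size_t start, std::size_t end)
{
  end = std::min(end, info.size());
  if (start >= end) return;

  unsigned cluster = info[start].cluster;
  for (std::size_t i = start + 1; i < end; i++)
    cluster = std::min(cluster, info[i].cluster);

  // Extend the end forward.
  while (end < info.size() && info[end - 1].cluster == info[end].cluster)
    end++;
  // Extend the start backward.
  while (start > 0 && info[start - 1].cluster == info[start].cluster)
    start--;

  for (std::size_t i = start; i < end; i++)
    info[i].cluster = cluster;
}
```

Taking the minimum is an assumption that suits left-to-right cluster numbering; the log itself only says that clusters are merged and extended.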
Behdad Esfahbod 2bb1761ccb Minor, use next_glyph() 2012-06-08 19:29:44 -04:00
Behdad Esfahbod 5f68f8675e Minor 2012-06-08 19:23:43 -04:00
Behdad Esfahbod 8729691267 Increase Uniscribe MAX_ITEMS 2012-06-08 14:39:31 -04:00
Behdad Esfahbod dbffa4c83d Fix Uniscribe charset matching
Previously was failing to match fonts that didn't support CHARSET_ANSI.

There still remains a problem with the Uniscribe backend, in that if a
font with the same family name is installed, and is newer, the native
one is preferred over the font we provide.  Fixing it requires rewriting
the name table with a unique family name...
2012-06-08 14:39:31 -04:00
Behdad Esfahbod 82e8bd8628 Remove unused code 2012-06-08 14:39:31 -04:00
Behdad Esfahbod 6da9dbff21 Remove zero-width chars in the fallback shaper too 2012-06-08 10:53:35 -04:00
Behdad Esfahbod 68b76121f8 Fix regressions introduced by sed. Ouch!
Introduced in 99c2695759.
Broken mark-mark and mark-ligature stuff.
2012-06-08 10:47:00 -04:00
Behdad Esfahbod 0dd86f9f68 Whitespace 2012-06-08 10:23:03 -04:00
Behdad Esfahbod 8e7beba7c3 Fix Uniscribe clusters with direction-overridden Arabic 2012-06-08 10:22:06 -04:00
Behdad Esfahbod b069c3c31b Really fix override-direction in Uniscribe 2012-06-08 10:10:29 -04:00
Behdad Esfahbod fcd6f53261 Unbreak Uniscribe
Oops.  hb_tag_t and OPENTYPE_TAG have different endianness.  Perhaps
something to add API for in hb-uniscribe.h
2012-06-08 09:59:43 -04:00
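The endianness mismatch mentioned in "Unbreak Uniscribe" amounts to a 32-bit byte swap. The sketch below assumes that an hb_tag_t-style tag keeps the first character in the most significant byte and that the other side expects it in the least significant byte; the helper names `make_tag_be` and `swap_u32` are illustrative only and are not part of either API.

```cpp
#include <cstdint>

// Pack four ASCII characters with the first one in the most significant
// byte (the big-endian layout assumed here for hb_tag_t-style tags).
constexpr std::uint32_t make_tag_be(char a, char b, char c, char d)
{
  return (std::uint32_t(std::uint8_t(a)) << 24) |
         (std::uint32_t(std::uint8_t(b)) << 16) |
         (std::uint32_t(std::uint8_t(c)) << 8) |
          std::uint32_t(std::uint8_t(d));
}

// Reverse the byte order of a 32-bit value.
constexpr std::uint32_t swap_u32(std::uint32_t v)
{
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}
```

Under these assumptions, a tag built with `make_tag_be('l','i','g','a')` would be passed through `swap_u32` before being handed to the other API, and swapped again on the way back.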
Behdad Esfahbod 29eac8f591 Override direction in Uniscribe backend
Matches OT backend now.
2012-06-08 09:26:17 -04:00
Behdad Esfahbod 1c1233e576 Make Uniscribe backend respect selected script 2012-06-08 09:20:53 -04:00
Behdad Esfahbod 0bb0f5d419 Add note re _NullPool 2012-06-07 17:42:48 -04:00
Behdad Esfahbod 2a3d911fe0 Fix alignment-requirement mismatch
Detected by clang and lots of cmdline options.
2012-06-07 17:31:46 -04:00
Behdad Esfahbod 6095de1635 Fix clang warning with NO_MT path 2012-06-07 15:48:18 -04:00
Behdad Esfahbod a18280a8ce Fix warnings produced by clang analyzer 2012-06-07 15:44:12 -04:00
Behdad Esfahbod 73cb02de2d Minor 2012-06-06 11:29:25 -04:00
Behdad Esfahbod 79e2b4791f Fix ASSERT_POD on clang
As reported by bashi.  Not tested.
2012-06-06 11:27:17 -04:00
Behdad Esfahbod 6220e5fc0d Add ASSERT_POD for most objects 2012-06-06 03:30:09 -04:00
Behdad Esfahbod a00a63b5ef Add macros to check that types are POD 2012-06-06 03:07:01 -04:00
Behdad Esfahbod 61eb60c129 Don't link to libstdc++
New try.
2012-06-05 21:22:36 -04:00
Behdad Esfahbod 81a4b9fd4e Remove unused hb_static_mutex_t 2012-06-05 20:53:00 -04:00
Behdad Esfahbod 4a3a9897b3 Disable Intel atomic ops on mingw32
Apparently the configure test is not enough...
2012-06-05 20:39:07 -04:00
Behdad Esfahbod 0594a24484 Cleanup TRUE/FALSE vs true/false 2012-06-05 20:35:40 -04:00
Behdad Esfahbod e1ac38f8dd Fix inert buffer set_length() with zero
Oops!
2012-06-05 20:31:49 -04:00
Behdad Esfahbod 04bc1eebe7 Add configure tests for Intel atomic intrinsics 2012-06-05 20:16:56 -04:00
Behdad Esfahbod f64b2ebf82 Remove last static initializer
We're free!  Lazy or immediate...
2012-06-05 20:15:27 -04:00
Behdad Esfahbod 04aed572f1 Make hb-ft static-initializer free 2012-06-05 18:45:36 -04:00
Behdad Esfahbod be4560a3b5 Undo default unicode-funcs to avoid static initializer again 2012-06-05 18:43:57 -04:00
Behdad Esfahbod 093171ccec Implement lock-free hb_language_t
Another static-initialization down.  One more to go.
2012-06-05 18:00:45 -04:00
Behdad Esfahbod 6843ce01be Add atomic-pointer functions
Going to use these for lock-free linked-lists, to be used for
hb_language_t among other things.
2012-06-05 17:27:20 -04:00
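A prepend-only lock-free linked list of the kind this commit alludes to can be sketched with a single compare-and-swap on the head pointer. The code below uses C++11 `std::atomic` in place of the project's own atomic-pointer wrappers, and `LangItem` and `intern` are hypothetical names, so treat it as an illustration of the technique rather than the actual hb_language_t code.

```cpp
#include <atomic>
#include <string>

struct LangItem {       // hypothetical list node
  std::string name;
  LangItem *next;
};

// Return the node for `name`, inserting it at the head if it is missing.
// Nodes are never removed, so readers can walk the list without locks.
inline const LangItem *intern(std::atomic<LangItem *> &head,
                              const std::string &name)
{
  for (;;) {
    LangItem *first = head.load(std::memory_order_acquire);
    for (LangItem *p = first; p; p = p->next)
      if (p->name == name) return p;

    LangItem *node = new LangItem{name, first};
    // Publish the node only if the head is still the one we scanned from.
    if (head.compare_exchange_strong(first, node,
                                     std::memory_order_release,
                                     std::memory_order_relaxed))
      return node;
    delete node;  // another thread changed the head; rescan and retry
  }
}
```

Because items are only ever added at the head and never freed, the usual ABA and reclamation problems of lock-free lists do not arise in this sketch; the price is that the list lives until process exit.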
Behdad Esfahbod cdafe3a7d8 Add gcc intrinsics implementations for atomic and mutex 2012-06-05 16:40:23 -04:00
Behdad Esfahbod d970d2899b Add gcc implementation for atomic ops 2012-06-05 16:06:28 -04:00
Behdad Esfahbod 0e253e97af Add a mutex to object header
Removes one more static-initialization.  A few more to go.
2012-06-05 15:54:43 -04:00
Behdad Esfahbod a2b471df82 Remove static initializers from indic 2012-06-05 15:17:44 -04:00
Behdad Esfahbod f06ab8a426 Better hide nil objects and make them const 2012-06-05 14:49:14 -04:00
Behdad Esfahbod bf93b636c4 Remove constructor from hb_prealloced_array_t
This was causing all object types to be non-POD and have static
initializers.  We don't need that!

Now, most nil objects just moved from .bss to .data.  Fixing for that
coming soon.
2012-06-05 14:17:32 -04:00
Behdad Esfahbod f1971a2174 Fix warnings 2012-06-05 14:06:04 -04:00
Behdad Esfahbod 9fc7a11469 Remove comma at the end of enum
As reported by Jonathan Kew on the list.
2012-06-04 08:28:19 -04:00
Behdad Esfahbod 3b8fd9c48f Remove const from ref_count.ref_count
According to Tom Hacohen this was breaking build with some compilers.

In file included from hb-buffer-private.hh:35:0,
                 from hb-ot-map-private.hh:32,
                 from hb-ot-shape-private.hh:32,
                 from hb-ot-shape.cc:29:
hb-object-private.hh: In constructor '_hb_object_header_t::_hb_object_header_t()':
hb-object-private.hh:97:8: error: uninitialized const member in 'struct hb_reference_count_t'
hb-object-private.hh:51:25: note: 'hb_reference_count_t::ref_count' should be initialized
In file included from hb-ot-shape.cc:33:0:
hb-set-private.hh: In constructor '_hb_set_t::_hb_set_t()':
hb-set-private.hh:37:8: note: synthesized method '_hb_object_header_t::_hb_object_header_t()' first required here
hb-ot-shape.cc: In function 'void hb_ot_shape_glyphs_closure(hb_font_t*, hb_buffer_t*, const hb_feature_t*, unsigned int, hb_set_t*)':
hb-ot-shape.cc:521:12: note: synthesized method '_hb_set_t::_hb_set_t()' first required here
2012-06-03 15:54:19 -04:00
Behdad Esfahbod 70600dbf62 Minor 2012-06-03 15:52:51 -04:00
Behdad Esfahbod 96a9ef0c9f Remove tab character like other "zero-width" characters
Uniscribe does that; this makes comparing results to Uniscribe
easier.
2012-06-01 13:46:26 -04:00
Behdad Esfahbod 0558d55bac Remove hb_atomic_int_set/get()
We never use them in fact...

I'm just adjusting these as I better understand the requirements of
the code and the guarantees of each operation.
2012-05-28 10:46:47 -04:00
Behdad Esfahbod bce095524b Add hb_font_get_glyph_name() and hb_font_get_glyph_from_name() 2012-05-28 10:45:50 -04:00
Behdad Esfahbod bc145658bd Warn if no Unicode functions implementation is found 2012-05-28 10:45:50 -04:00
Behdad Esfahbod a3547330fa Cleanup atomic ops on OS X 2012-05-27 10:20:47 -04:00
Behdad Esfahbod e4b6d503c5 Don't use atomic ops in hb_cache_t
We don't care about linearizability, so unprotected int read/write
are enough, no need for expensive memory barriers.  It's a cache,
that's all.
2012-05-27 10:11:13 -04:00
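The reasoning here is that a cache entry may be overwritten or read stale without harm, as long as each entry is read and written as one word. A minimal sketch of such an int-to-int cache follows; the template parameters, the `IntCache` name, and the all-ones "empty" marker are assumptions. Note one deliberate swap: instead of the plain int reads and writes the commit describes, the sketch uses `std::atomic` with relaxed ordering, which adds no memory barriers but keeps the code well-defined under the C++11 memory model.

```cpp
#include <atomic>
#include <cstdint>

template <unsigned KeyBits, unsigned ValueBits, unsigned CacheBits>
struct IntCache {
  static_assert(KeyBits >= CacheBits, "key must cover the slot index");
  static_assert(KeyBits - CacheBits + ValueBits < 32,
                "tag plus value must fit in one 32-bit entry");

  IntCache() { clear(); }

  void clear() {
    for (auto &v : values) v.store(kInvalid, std::memory_order_relaxed);
  }

  // Look up `key`; on a hit, write the cached value and return true.
  bool get(std::uint32_t key, std::uint32_t *value) const {
    std::uint32_t v = values[key & kMask].load(std::memory_order_relaxed);
    if (v == kInvalid || (v >> ValueBits) != (key >> CacheBits))
      return false;  // empty slot, or slot holds a different key
    *value = v & ((1u << ValueBits) - 1);
    return true;
  }

  // Store `value` for `key`, evicting whatever shared the slot.
  bool set(std::uint32_t key, std::uint32_t value) {
    if ((key >> KeyBits) || (value >> ValueBits))
      return false;  // does not fit in the configured bit widths
    values[key & kMask].store(((key >> CacheBits) << ValueBits) | value,
                              std::memory_order_relaxed);
    return true;
  }

private:
  static constexpr std::uint32_t kInvalid = 0xFFFFFFFFu;
  static constexpr std::uint32_t kMask = (1u << CacheBits) - 1;
  std::atomic<std::uint32_t> values[1u << CacheBits];
};
```

Each entry packs the upper key bits and the value into one word, so a reader sees either a complete old entry or a complete new one, never a mix; a miss simply falls back to the slow path.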
Behdad Esfahbod 819faa0530 Minor 2012-05-27 10:09:18 -04:00
Behdad Esfahbod 303d5850ec Fix Windows atomic get/set
According to:
http://msdn.microsoft.com/en-us/library/65tt87y8.aspx

MemoryBarrier() is the right macro to protect these, not _ReadBarrier()
and/or _WriteBarrier().
2012-05-27 10:01:13 -04:00
Behdad Esfahbod 29ce446d31 Add set iterator 2012-05-25 14:17:54 -04:00
Behdad Esfahbod 62c3e111fc Add set symmetric difference 2012-05-25 13:48:00 -04:00
Behdad Esfahbod 27aba594c9 Minor 2012-05-24 15:00:01 -04:00
Behdad Esfahbod cde1c0114b Fix hb_atomic_int_set() implementation for HB_NO_MT
As pointed out by Jonathan Kew.
2012-05-24 10:46:39 -04:00
Behdad Esfahbod ed2f1363a3 Fix substitution glyph class propagation
The old code was doing nothing.

Still got to find an example font+string that makes this matter, but
need this for fixing synthetic GDEF anyway.
2012-05-22 22:12:22 -04:00
Behdad Esfahbod 20fdb0f41d Add a lock-free cache type for int->int functions
To be used for cmap and advance caching if desired.
2012-05-17 22:04:45 -04:00
Behdad Esfahbod bd908b4f10 Implement hb_atomic_int_set() for OS X 2012-05-17 22:02:08 -04:00
Behdad Esfahbod 022a05ae90 Minor 2012-05-17 21:53:24 -04:00
Behdad Esfahbod 22afd66a30 Add hb_atomic_int_set() again 2012-05-17 21:23:49 -04:00
Behdad Esfahbod 4aa7258cb1 Fix type conflicts on Windows without glib 2012-05-17 21:01:04 -04:00
Behdad Esfahbod f039e79d54 Don't use min/max as function names
They can be macros on some systems.  Eg. mingw32.
2012-05-17 20:55:12 -04:00
Behdad Esfahbod 34961e3198 Prefer native atomic/mutex ops to glib's 2012-05-17 20:50:38 -04:00
Behdad Esfahbod ec3ba4b96f Move atomic ops into their own header 2012-05-17 20:30:46 -04:00
Behdad Esfahbod 1d6846db9e [Indic] Apply vatu feature after cjct
Testing with old Deva spec this reduces failures.
Test sequence: U+0915,U+094D,U+0930.
2012-05-13 18:09:29 +02:00
Behdad Esfahbod 617f4ac46f Refactor 2012-05-13 16:48:03 +02:00
Behdad Esfahbod 5e4e21fce4 Revert "[Indic] Refactoring"
This reverts commit 0831061efb.
2012-05-13 16:46:08 +02:00
Behdad Esfahbod 3f18236a03 Fix more warnings 2012-05-13 16:20:10 +02:00
Behdad Esfahbod 9f377ed321 Fix more unused-var warnings 2012-05-13 16:13:44 +02:00
Behdad Esfahbod d993e72331 Fix hb_face_set_index() 2012-05-13 16:04:36 +02:00
Behdad Esfahbod 93345edcbe Fix warnings 2012-05-13 16:01:08 +02:00
Behdad Esfahbod eace47b173 Minor 2012-05-13 15:54:43 +02:00
Behdad Esfahbod 99c2695759 Add accessors to buffer for current info, current pos, and prev info 2012-05-13 15:45:18 +02:00
Behdad Esfahbod 6736f3c5b0 Minor 2012-05-13 15:21:06 +02:00
Behdad Esfahbod 5df809b655 [GSUB/GPOS] Remove context_length
The spec doesn't say contextual matching should be done this way,
and AOTS doesn't do it either.  It was inherited from old HarfBuzz.
Remove it.
2012-05-13 15:17:51 +02:00
Behdad Esfahbod 28b9d502bb Minor 2012-05-13 15:04:00 +02:00
Behdad Esfahbod 737dded2e0 Fix compiler warnings 2012-05-12 15:40:11 +02:00
Behdad Esfahbod 7f852b644b Fix compiler warnings 2012-05-11 23:10:31 +02:00
Behdad Esfahbod f7e8dcfd4f [Indic] Unbreak Devanagari
And this concludes the HarfBuzz Massala Hackfest.

I like to specially thank Jonathan Kew for doing all the decription and
letting me get commit points.
2012-05-11 22:01:33 +02:00
Behdad Esfahbod 6a091df9b4 [Indic] Disambiguate sub vs post vs above matras
Bengali is at *just* above 5% now.
2012-05-11 21:42:27 +02:00
Behdad Esfahbod 9d0d319a4a [Indic] Position Bengali Reph before matras 2012-05-11 21:36:32 +02:00
Behdad Esfahbod f893672511 [Indic] Start categorizing Reph per script 2012-05-11 21:10:03 +02:00
Behdad Esfahbod a913b024d8 [Indic] Apply 'init' feature for Bengali
Error down from 20% to 7%.
2012-05-11 20:59:26 +02:00
Behdad Esfahbod eed903b164 [Indic] Refactor for the arrival of 'init' feature
Yep, on Bengali now!
2012-05-11 20:50:53 +02:00
Behdad Esfahbod 18c06e189b [Indic] Add Uniscribe bug feature for dotted circle
For dotted-circle independent clusters, Uniscribe does no Reph shaping
for the exact sequence Ra+Halant+25CC.  Which also is the only possible
sequence with 25CC at the end.
2012-05-11 20:02:14 +02:00
Behdad Esfahbod 0831061efb [Indic] Refactoring 2012-05-11 19:07:58 +02:00
Behdad Esfahbod 7ea58db311 Minor 2012-05-11 18:58:57 +02:00
Behdad Esfahbod 9c09928989 [Indic] Allow multiple Consonants in Vowel/NBSP syllables
Uniscribe allows multiple Halant+Consonant after a Vowel.
Tests:
        * U+0905,U+094D,U+092B,U+094D,U+0930,U+094D,U+0930
2012-05-11 18:46:35 +02:00
Behdad Esfahbod 8c0aa486f3 [Indic] Allow two Nuktas per consonant
Uniscribe allows up to two nuktas per consonant and one per matra. It does so
independent of whether the consonant already has a nukta in it.  Tests:

        * U+0916,U+093C,U+0941
        * U+0959,U+093C,U+0941
        * U+0916,U+093C,U+093C,U+0941
        * U+0959,U+093C,U+093C,U+0941
        * U+0916,U+093C,U+093C,U+093C,U+0941
        * U+0959,U+093C,U+093C,U+093C,U+0941
        * 915,93c,93c,,94d,U+0916,U+093C,U+093C,U+093e,93c,93c
2012-05-11 18:13:42 +02:00
Behdad Esfahbod 3399a06e70 [Indic] Fix U+0952 and similar classification to match Uniscribe
See comments.
2012-05-11 17:54:26 +02:00
Behdad Esfahbod 11aa3ef18d [Indic] Treat U+0951..U+0954 all similar to U+0952 2012-05-11 17:30:48 +02:00
Behdad Esfahbod 5f131d3226 [GSUB/GPOS/Indic] Apply GSUB/GPOS within syllables only
This does not apply to the context matchings.

This regresses tests right now.  And we are not sure whether this is
the right thing to do for GPOS.  But we'll figure out.
2012-05-11 17:29:40 +02:00
Behdad Esfahbod 8fd83aaf6e [GSUB/GPOS] Fix wrong buffer access in backward skippy mask matching 2012-05-11 17:18:37 +02:00
Behdad Esfahbod ff24d1081a [Indic] Don't use syllable serial value 0 2012-05-11 17:07:08 +02:00
Behdad Esfahbod 892eb78782 [Indic] Implement Uniscribe Reph+Matra+Halant bug feature 2012-05-11 16:54:40 +02:00
Behdad Esfahbod 67ea29af49 [Indic] Add example of different Uniscribe behavior 2012-05-11 16:51:23 +02:00
Behdad Esfahbod ebe29733d4 [Indic] Add runtime Uniscribe bug compatibility mode!
Enable by setting envvar:

  HB_OT_INDIC_OPTIONS=uniscribe-bug-compatible

Plus, LeftMatra+Halant "feature".
2012-05-11 16:43:12 +02:00
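
Note on ebe29733d4: a runtime switch like this can be read from the environment once and then consulted during shaping. The sketch below only illustrates that pattern. The environment variable name and option string come from the commit message; the function names are hypothetical and the substring match is an assumption, not necessarily how the commit parses the value.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: true if the option string requests the mode.
// A substring match is assumed here so several options could share one value.
static bool
parse_uniscribe_bug_compatible (const char *options)
{
  return options && std::strstr (options, "uniscribe-bug-compatible") != nullptr;
}

// Hypothetical helper: reads the environment variable named in the commit.
static bool
uniscribe_bug_compatible_from_env ()
{
  return parse_uniscribe_bug_compatible (std::getenv ("HB_OT_INDIC_OPTIONS"));
}
```

Keeping the string parsing separate from the getenv() call makes the option logic testable without touching the process environment.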
Behdad Esfahbod 616e692e29 [Indic] Add #define UNISCRIBE_BUG_COMPATIBLE 1 2012-05-11 16:25:02 +02:00
Behdad Esfahbod 6782bdae3b [Indic] Fix Left Matra + Halant reordering
As can be seen in: U+092B,U+093F,U+094D
2012-05-11 16:23:43 +02:00
Behdad Esfahbod 3c2ea9481b Minor 2012-05-11 16:23:38 +02:00
Behdad Esfahbod 203d71069c [GSUB/GPOS] Check all glyph masks when matching input 2012-05-11 16:01:44 +02:00
Behdad Esfahbod 668c6046c1 [Indic] Apply Reph mask to all POS_REPH glyphs
Needed for upcoming changes to GSUB/GPOS mask matching.
2012-05-11 15:34:13 +02:00
Behdad Esfahbod 4be46bade2 [Indic] Fix state machine to backtrack 2012-05-11 14:39:01 +02:00
Behdad Esfahbod cee7187447 [Indic] Move syllable tracking from Indic to generic layer
This is to incorporate it into GSUB/GPOS processing.
2012-05-11 11:41:39 +02:00
Behdad Esfahbod 3bf27a9f0e [Indic] Disable conjuncts when a ZWJ happens
Not that the code makes any difference since the presence of ZWJ itself
causes the ligature to fail to match anyway.
2012-05-11 11:17:23 +02:00
Behdad Esfahbod c6d904d67d [Indic] Fix bitops typo!
Another 1000 down!
2012-05-11 11:07:40 +02:00
Behdad Esfahbod 55fe2cf79b Make APPLY debug output print current index and codepoint
Yay!
2012-05-11 03:56:33 +02:00
Behdad Esfahbod 7bd2b04fea Minor 2012-05-11 03:40:58 +02:00
Behdad Esfahbod cf26510dbb Some more...
Done.  I promise.
2012-05-11 03:35:08 +02:00
Behdad Esfahbod 9659523ca3 More beauty in debug output! 2012-05-11 03:33:36 +02:00
Behdad Esfahbod cf26e88a5a Finish off debug output beautification 2012-05-11 03:16:57 +02:00
Behdad Esfahbod d7bba01a35 Only print class name in debug output if there's one available 2012-05-11 02:46:26 +02:00
Behdad Esfahbod 85f73fa8da Only printout class name in tracing, if one is available
Makes debug output much more pleasant.
2012-05-11 02:40:42 +02:00
Behdad Esfahbod 98619ce4fa Minor 2012-05-11 02:34:06 +02:00
Behdad Esfahbod acea183e98 Add return annotation for APPLY 2012-05-11 02:33:11 +02:00
Behdad Esfahbod 5ccfe8e215 /Minor/ 2012-05-11 02:19:41 +02:00
Behdad Esfahbod 0ab8c86217 Annotate SANITIZE return values
More to come, for APPLY, CLOSURE, etc.
2012-05-11 02:11:52 +02:00
Behdad Esfahbod 829e814ff3 Minor 2012-05-11 00:52:16 +02:00
Behdad Esfahbod 6eec6f406d Code reshuffling 2012-05-11 00:50:38 +02:00
Behdad Esfahbod 1e08830b4f Beautify debug output 2012-05-11 00:43:57 +02:00
Behdad Esfahbod 6f45538017 More massaging trace messaging 2012-05-10 23:24:43 +02:00
Behdad Esfahbod b5fa37cb69 Minor 2012-05-10 23:09:48 +02:00
Behdad Esfahbod 208109703c Better trace message support infrastructure
We have varargs in the trace interface now.  To be used soon...
2012-05-10 23:06:58 +02:00
Behdad Esfahbod 02b2922fbf [Indic] Towards better Reph positioning
Fixed for Deva cases with two full-form consonants.  Failures **way** down.
Not much left to go :-).
2012-05-10 21:44:50 +02:00
Behdad Esfahbod 74e54cf446 [Indic] Add Ra back for scripts without Reph
We now check that the 'rphp' table exists before forming Reph, so
we don't need to comment out Ra for those scripts.
2012-05-10 21:22:58 +02:00
Behdad Esfahbod 2b70df5cc0 [Indic] Add note re Uniscribe clusters 2012-05-10 18:38:22 +02:00
Behdad Esfahbod 21d2803133 [Indic] Do clustering like Uniscribe does
Hindi Wikipedia failures down to 6639 (0.938381%)!
2012-05-10 18:34:34 +02:00
Behdad Esfahbod 8df5636968 [Indic] Reorder Reph to before the Halant after Matras
Uniscribe doesn't do it, but we want to do as it gives the Reph the
opportunity to interact with the Matras.  Test with mangal for example.
Sequence: <0930,094d,0915,094b,094d>
In test suite already.
2012-05-10 15:41:04 +02:00
Behdad Esfahbod daf3234bdc [Indic] Don't clear the mask for Reph
This was removing the mandatory global 1 bit in the mask and hence
disabling GPOS for Reph!
2012-05-10 15:28:27 +02:00
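
Note on daf3234bdc: the failure mode described above can be shown with plain bit masks. This is a sketch under stated assumptions: the bit positions and all names are hypothetical, and the only facts taken from the commit are that a global bit must stay set in every glyph's mask and that clearing it disabled GPOS for the Reph.

```cpp
#include <cassert>
#include <cstdint>

typedef uint32_t mask_t;

// Hypothetical bit assignments, for illustration only.
static const mask_t GLOBAL_BIT = 1u << 0; // set on every glyph; global lookups test it
static const mask_t REPH_BIT   = 1u << 3; // marks glyphs that should get Reph shaping

// A lookup applies to a glyph only if their masks share a bit.
static bool
lookup_applies (mask_t glyph_mask, mask_t lookup_mask)
{
  return (glyph_mask & lookup_mask) != 0;
}

// Buggy pattern: overwriting the mask drops the global bit.
static mask_t
tag_reph_overwrite (mask_t /*glyph_mask*/)
{
  return REPH_BIT;
}

// Fixed pattern: OR the feature bit in, keeping the global bit.
static mask_t
tag_reph_preserve (mask_t glyph_mask)
{
  return glyph_mask | REPH_BIT;
}
```

After tag_reph_overwrite(), a lookup that tests only GLOBAL_BIT no longer matches the glyph; after tag_reph_preserve() it still does.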
Behdad Esfahbod 7708ee23cb [Indic] Improve Left Matra repositioning
Move its dependents too.
2012-05-10 14:48:25 +02:00
Behdad Esfahbod dbb105883c [Indic] Do Reph repositioning in final reordering like the spec says
This introduced a failure, which we tracked down to a test case like this:

  U+092E,U+094B,U+094D,U+0930

The final character is a Ra that should be put in a syllable of its
own.  And we do.  But it will interact with the Halant before it.  So
now we finally are convinced that we have to limit features to syllable
boundaries.  That's coming after lunch!
2012-05-10 13:45:52 +02:00
Behdad Esfahbod 4705a70269 Minor 2012-05-10 13:09:08 +02:00
Behdad Esfahbod 4ac9e98d9d [Indic] Reorder left matras to be closer to base 2012-05-10 12:53:53 +02:00
Behdad Esfahbod 1a1fa8c655 [Indic] Treat the standalone cluster case reusing the consonant logic 2012-05-10 12:21:30 +02:00
Behdad Esfahbod 190eb31a16 [Indic] Minor 2012-05-10 12:21:30 +02:00
Behdad Esfahbod c5306b6861 [Indic] Handle Vowel syllables
Reusing the consonant logic!
2012-05-10 12:21:30 +02:00
Behdad Esfahbod 6d8e0cb74c [Indic] Simplify Reph logic 2012-05-10 11:41:51 +02:00
Behdad Esfahbod 3d25079f8d [Indic] Don't form Reph if Ra is the only consonant in the syllable 2012-05-10 11:37:42 +02:00
Behdad Esfahbod b99d63ae11 [Indic] Increase max syllable length
20 was way too low, one could hit a syllable with 7ish consonants with it.
2012-05-10 11:32:52 +02:00
Behdad Esfahbod a391ff50b9 [Indic] Adjust base after sorting 2012-05-10 11:31:20 +02:00
Behdad Esfahbod d3637edb24 [Indic] Don't return for long syllables. Just not sort. 2012-05-10 10:51:38 +02:00