Behdad Esfahbod
6411e74caf
[Indic] Reposition Gurmukhi top matras to after post
...
The font is forming a post-base consonant in some samples, and Uniscribe
positions top matra on the post-base. Do the same.
Gurmukhi failures down from 59 to 41 (0.0674242%).
2012-07-24 13:48:49 -04:00
Behdad Esfahbod
65c43accdc
[Indic] Better position left-matra in Malayalam
...
Just put it before base, which is what's expected.
Malayalam failures down from 1559 to 1197 (0.114172%).
BENGALI: 353988 out of 354285 tests passed. 297 failed (0.0838308%)
DEVANAGARI: 693571 out of 693628 tests passed. 57 failed (0.00821766%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
KANNADA: 950956 out of 951913 tests passed. 957 failed (0.100534%)
KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%)
MALAYALAM: 1047219 out of 1048416 tests passed. 1197 failed (0.114172%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271699 out of 271847 tests passed. 148 failed (0.0544424%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970524 out of 970573 tests passed. 49 failed (0.00504856%)
2012-07-24 03:36:47 -04:00
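The percentages in the status lines above are failed tests divided by total tests. A minimal sketch of that arithmetic, assuming a hypothetical helper named `failure_percent` (it is not part of the test harness):

```cpp
#include <cassert>

// Hypothetical helper: percentage of failed tests out of the total.
// Returns 0 when there are no tests at all.
static double failure_percent(long failed, long total)
{
  assert(failed >= 0 && failed <= total);
  return total > 0 ? 100.0 * failed / total : 0.0;
}
```

For the Malayalam line, `failure_percent(1197, 1048416)` gives roughly 0.114172.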
Behdad Esfahbod
88f413b56f
[Indic] Implement Reph+Ya-Phalaa interaction
...
The sequence Ra,H,Ya in Bengali is ambiguous, and Unicode specifies that to
get Ya-Phalaa, one places ZWJ before Halant. I.e. a ZWJ,H sequence
requests subjoining, while H,ZWJ requests the half form. Implement that.
Bengali failures go down from 377 to 297 (0.0838308%).
Gujarati is down by 4 to 17 (0.0046384%).
Kannada is down by 226 to 957 (0.100534%).
Current status:
BENGALI: 353988 out of 354285 tests passed. 297 failed (0.0838308%)
DEVANAGARI: 693571 out of 693628 tests passed. 57 failed (0.00821766%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
KANNADA: 950956 out of 951913 tests passed. 957 failed (0.100534%)
KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%)
MALAYALAM: 1046857 out of 1048416 tests passed. 1559 failed (0.148701%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271699 out of 271847 tests passed. 148 failed (0.0544424%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970524 out of 970573 tests passed. 49 failed (0.00504856%)
2012-07-24 03:04:36 -04:00
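The ZWJ,H versus H,ZWJ distinction described in the entry above could be sketched as follows. This is an illustration only, not the shaper's code: the `HalantRequest` enum and `classify` function are hypothetical names, and the sketch does not handle joiners on both sides of the Halant.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type for this sketch.
enum class HalantRequest { None, Subjoined, HalfForm };

static const uint32_t ZWJ = 0x200D;            // ZERO WIDTH JOINER
static const uint32_t BENGALI_HALANT = 0x09CD; // BENGALI SIGN VIRAMA

// Look at the first Halant in the sequence and report what the
// neighbouring joiner, if any, requests.
static HalantRequest classify(const std::vector<uint32_t> &s)
{
  for (size_t i = 0; i < s.size(); i++) {
    if (s[i] != BENGALI_HALANT)
      continue;
    if (i > 0 && s[i - 1] == ZWJ)
      return HalantRequest::Subjoined; // ZWJ,H
    if (i + 1 < s.size() && s[i + 1] == ZWJ)
      return HalantRequest::HalfForm;  // H,ZWJ
    return HalantRequest::None;        // plain H
  }
  return HalantRequest::None;
}
```

With Ra (U+09B0) and Ya (U+09AF), the sequence Ra,ZWJ,H,Ya classifies as a subjoining request and Ra,H,ZWJ,Ya as a half-form request.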
Behdad Esfahbod
dff0ece11d
[Indic] Limit matras to 4 per syllable
...
Also limit joiners.
This limits our syllable length to a constant, and is
closer to what Uniscribe does anyway.
Two Devanagari tests regressed, but who cares about tests with 20
joiners in a row?! Devanagari at 57 (0.00821766%) now.
2012-07-24 02:37:42 -04:00
Behdad Esfahbod
330b329c89
[Indic] Unmark U+17D1 KHMER SIGN VIRIAM to NOT be a Virama
...
Fixes another 1 Khmer failure. Down to 30 (0.0100293%) now.
2012-07-24 02:25:26 -04:00
Behdad Esfahbod
6824a7194e
[Indic] Recategorize Khmer various signs as top matras
...
Khmer failures down from 39 to 31 (0.0103636%).
2012-07-24 02:22:18 -04:00
Behdad Esfahbod
d90b8e841e
[Indic] Reposition Khmer prebase-reordering Ra around split matras
...
In Khmer coeng model, a V,Ra can go *after* matras. If it goes after a
split matra, it should be reordered to *before* the left part of such matra.
Khmer failures down from 136 to 39 (0.0130381%).
2012-07-24 02:11:18 -04:00
Behdad Esfahbod
0afb84c125
[Indic] Fix minor bug in pre-base Ra positioning
2012-07-24 01:44:47 -04:00
Behdad Esfahbod
7573799126
[Indic] Position Khmer U+17CE
...
Fixes another 6 Khmer failures. Now at 136 (0.0454661%).
2012-07-24 01:32:07 -04:00
Behdad Esfahbod
8d00e8d0e7
[Indic] Don't reposition Khmer Bindu
...
Khmer Bindu doesn't like to move to syllable end. Leave it where it
was.
Brings down Khmer failures from 510 to 142 (0.047572%).
2012-07-24 01:15:34 -04:00
Behdad Esfahbod
2278eefcdb
[Indic] In Sinhala, form forced Reph even if no other consonant found
...
Fixes another 10 Sinhala failures. Down to 148 (0.0544424%).
2012-07-24 00:31:10 -04:00
Behdad Esfahbod
71fd5e80ad
[Indic] Further adjust base algorithm for Sinhala
...
Apparently if there is C,V,ZWJ,C, the first C will be base, but if
it's C,ZWJ,V,C, the second one will be.
Note that Uniscribe implements this differently, by breaking syllable in
the case of C,ZWJ,V,C and putting the first consonant in one syllable
and the rest in the next syllable.
Sinhala failures down from 208 to 158 (0.0581209%). No changes to
Khmer.
2012-07-24 00:21:16 -04:00
Behdad Esfahbod
73d71cc527
[Indic] End Vowel-based syllable at ZWJ
...
One Devanagari test regressed, plus 10 Malayalam (at 1545 now).
Fixed 120 Sinhala failures. Now at 208 (0.0765136%).
2012-07-24 00:09:12 -04:00
Behdad Esfahbod
34c215036f
[Indic] Improve Sinhala base algorithm and reph positioning
...
Sinhala does not have half forms. And most (all?) consonants can be
base, except when preceded by ZWJ, which would request a subjoined form.
Hence switch the base algorithm to categorize with Khmer, start search
at start, and stop at a ZWJ.
Also, mark all pos=base consonants after base to be subjoined. Mark
base itself to have pos=base.
Finally, adjust Sinhala's reph position to after-main.
Brings down Sinhala failures from 455 to 328 (0.120656%).
2012-07-23 23:51:29 -04:00
Behdad Esfahbod
2ec934c6c2
[Indic] Change "unknown" position to end of syllable
2012-07-23 23:49:04 -04:00
Behdad Esfahbod
b70021f7c8
When removing zero-width marks, don't remove ligatures
...
If a mark ligated, it probably should NOT be removed.
2012-07-23 20:18:17 -04:00
Behdad Esfahbod
49c5ec5144
Minor refactoring
2012-07-23 20:14:13 -04:00
Behdad Esfahbod
c3e6fdc379
[Indic] Improve check on ligatures
...
Only skip actual ligatures, not marks in-between ligature components.
2012-07-23 20:11:42 -04:00
Behdad Esfahbod
771a8f5028
[Indic] exclude ligatures when matching on Indic category
...
If, say, a H,ZWJ,C ligature was formed, we don't want the code to detect
that as a Halant. So, ignore ligatures when matching category in
final_reordering.
Sinhala failures down from 514 to 455 (0.167374%).
2012-07-23 20:09:30 -04:00
Behdad Esfahbod
d1af9e82e5
[GSUB/GPOS] Const correctness
2012-07-23 19:55:35 -04:00
Behdad Esfahbod
baacd090df
[Indic] Minor refactoring
2012-07-23 19:51:48 -04:00
Behdad Esfahbod
c7c4de2fb9
[Indic] Remove syllable length check before sorting
...
We now limit syllable lengths in the machine. No need to match here.
2012-07-23 18:25:02 -04:00
Behdad Esfahbod
9fa052733e
[Indic] Limit syllables to at most five consonants
...
Seems to be about what Uniscribe does. Not exactly. But close enough.
More consonants will start a new cluster.
A few scripts went way down in failures. In particular:
- Devanagari failures went down from 490 to 56.
- Telugu went down from 113 to 49.
Other scripts went down slightly or didn't change. New numbers:
BENGALI: 353908 out of 354285 tests passed. 377 failed (0.106412%)
DEVANAGARI: 693572 out of 693628 tests passed. 56 failed (0.00807349%)
GUJARATI: 366485 out of 366506 tests passed. 21 failed (0.00572978%)
GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
KANNADA: 950730 out of 951913 tests passed. 1183 failed (0.124276%)
KHMER: 298613 out of 299124 tests passed. 511 failed (0.170832%)
MALAYALAM: 1046881 out of 1048416 tests passed. 1535 failed (0.146411%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271333 out of 271847 tests passed. 514 failed (0.189077%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970524 out of 970573 tests passed. 49 failed (0.00504856%)
Some of the remaining Telugu and Devanagari issues seem to be Uniscribe
eating Anusvara when placed before a non-joiner. Ouch!
2012-07-23 18:19:17 -04:00
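The five-consonant limit in the entry above can be illustrated with a much simplified model. The sketch below is an assumption-laden toy, not the syllable machine: it treats the input as one long conjunct, uses `'C'` for a consonant and any other character for everything else, and assumes the break falls directly before the sixth consonant. The names `split_clusters` and `kMaxConsonants` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

static const unsigned kMaxConsonants = 5; // limit named in the commit title

// Returns the start index of each cluster. A new cluster begins
// whenever one more consonant would exceed the limit.
static std::vector<size_t> split_clusters(const std::string &cats)
{
  std::vector<size_t> starts;
  unsigned consonants = 0;
  for (size_t i = 0; i < cats.size(); i++) {
    if (i == 0)
      starts.push_back(0);
    if (cats[i] != 'C')
      continue;
    if (consonants == kMaxConsonants) {
      starts.push_back(i); // sixth consonant opens a new cluster
      consonants = 0;
    }
    consonants++;
  }
  return starts;
}
```

For seven consonants joined by Halants, written `"CHCHCHCHCHCHC"`, the sketch starts a second cluster at index 10, the sixth consonant.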
Behdad Esfahbod
093cd58326
[Thai] Fix SARA AM handling
...
Oops, thinko.
2012-07-23 14:04:42 -04:00
Behdad Esfahbod
42848453bf
[Thai] Reorder U+0E3A THAI VOWEL SIGN PHINTHU
...
Uniscribe reorders U+0E3A to be after U+0E38 and U+0E39. We do that by
modifying the ccc for U+0E3A.
Fixes the two remaining Thai failures (see previous commit).
2012-07-23 13:52:07 -04:00
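The ccc trick in the entry above works because marks are reordered by a stable sort on combining class. In Unicode, U+0E38 and U+0E39 have ccc 103 and U+0E3A has ccc 9, so U+0E3A normally sorts first. A minimal sketch of overriding that, where the override value 104 and the names `modified_ccc` and `reorder_marks` are assumptions for illustration and not taken from the commit:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Combining class with an override for U+0E3A. Only the three Thai
// marks are covered; everything else is treated as a starter (ccc 0).
static unsigned modified_ccc(uint32_t cp)
{
  switch (cp) {
    case 0x0E38: case 0x0E39: return 103; // Unicode ccc
    case 0x0E3A: return 104;              // Unicode ccc is 9; 104 is an assumed override
    default: return 0;
  }
}

// Stable-sort each run of non-starters by the (modified) ccc.
static void reorder_marks(std::vector<uint32_t> &text)
{
  size_t i = 0;
  while (i < text.size()) {
    if (modified_ccc(text[i]) == 0) { i++; continue; }
    size_t j = i;
    while (j < text.size() && modified_ccc(text[j]) != 0)
      j++;
    std::stable_sort(text.begin() + i, text.begin() + j,
                     [](uint32_t a, uint32_t b) {
                       return modified_ccc(a) < modified_ccc(b);
                     });
    i = j;
  }
}
```

With the override in place, the input U+0E01,U+0E3A,U+0E38 is reordered so that U+0E3A follows U+0E38.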
Behdad Esfahbod
4a7f4f3e56
[Thai] Adjust SARA AM reordering to match Uniscribe
...
Adjust the list of marks before SARA AM that get the reordering
treatment. Also adjust cluster formation to match Uniscribe.
With Wikipedia test data, now I see:
- For Thai, with the Angsana New font from Win7, I see 54 failures out
of over 4M tests (0.00129107%). Of the 54, two are legitimate
reordering issues (fix coming soon), and the other 52 are simply
Uniscribe using a zero-width space char instead of an unknown
character for missing glyphs. No idea why. The missing-glyph
sequences include one that is a Thai character followed by an Arabic
Sokun. Someone confused it with Nikhahit I assume!
- For Lao, with the Dokchampa font from Win7, 33 tests fail out of
54k (0.0615167%). All seem to be insignificant mark positioning
with two marks on a base. Have to investigate.
2012-07-23 13:15:33 -04:00
Behdad Esfahbod
2cc933aff9
[Indic] Fix cluster formation with left-matras and conjunct forms
...
Test case was: <U+0D15,U+0D4D,U+0D15,U+0D4A>.
2012-07-23 08:23:44 -04:00
Behdad Esfahbod
e6b01a878c
[Indic] Further streamline cluster formation
...
This should address all possible cluster misformations that I had in
mind.
2012-07-23 00:11:26 -04:00
Behdad Esfahbod
7b2a7dadd6
[Indic] Merge clusters before sorting
...
This should fix any instabilities in cluster formation that we were
speculating may happen with surrounding syllables. Or most of it
perhaps.
2012-07-22 23:58:55 -04:00
Behdad Esfahbod
abb3239ef9
[Indic] Update clusters for left-matra even if matra didn't move
...
Fixes crashes reported with left matra under
non-uniscribe-bug-compatibility mode.
2012-07-22 23:55:19 -04:00
Behdad Esfahbod
92a1ad7bef
[Indic] Stop searching for base if a post form is found before below form
...
Improves Bengali and Gurmukhi. Malayalam regressed a bit. We will deal
with that later.
2012-07-20 18:55:15 -04:00
Behdad Esfahbod
4c450c703f
[Indic] Recompose Bengali Ya,Nukta
...
This is a bunch of hacks for now.
Improves Bengali a bit.
2012-07-20 18:13:04 -04:00
Behdad Esfahbod
e9c0f152a3
[Uniscribe] Fix script fallback
...
Gurmukhi failures halved now. Others changed slightly.
2012-07-20 17:37:48 -04:00
Behdad Esfahbod
5791f32915
[Indic] Allow a ZWNJ after SM's
...
Malayalam failures go way down. Other scripts benefitted slightly too.
Sinhala had one or two test regressions, but...
2012-07-20 16:26:55 -04:00
Behdad Esfahbod
34ae336f3f
[Indic] Improve Reph AfterMain positioning
...
Fixes 20 out of 48 failing Oriya tests. Failure rate down to 0.066% now.
2012-07-20 16:17:28 -04:00
Behdad Esfahbod
bdd080431a
[Indic] Reposition Oriya Candrabindu
...
Oriya failures down from 0.65% to 0.20%.
2012-07-20 16:03:09 -04:00
Behdad Esfahbod
5f0eaaad12
[Indic] Fix base search in final_reordering
...
Fixes most Malayalam failures. Down from 1.6% to 0.38% now. Fixes a
few more in other scripts too.
2012-07-20 15:47:24 -04:00
Behdad Esfahbod
81202bd860
[Indic] Don't attach SM/VD to other characters
2012-07-20 15:14:51 -04:00
Behdad Esfahbod
efb4ad7356
Fix compiler warnings
...
If x is not constant, we cannot ASSERT_STATIC on it.
2012-07-20 14:27:38 -04:00
Behdad Esfahbod
f31d97e44e
[Indic] Form Telugu Reph out of Ra,Virama,ZWJ
...
Apparently this was approved in Feb 2012. No font yet.
2012-07-20 14:13:35 -04:00
Behdad Esfahbod
2e193b240e
[Indic] Don't split U+0AC9
...
Although IndicMatraCategory.txt classifies it as Top_And_Right matra,
it does not have Unicode decomposition, and Uniscribe does not do
anything special about it either.
Gujarati failures down from 0.672% to 0.0130966%.
2012-07-20 14:02:35 -04:00
Behdad Esfahbod
30c3d5e9fc
[Indic] Simplify Uniscribe cluster emulation
...
Now that we break syllables on Halant,ZWNJ, this code can be simplified.
2012-07-20 13:56:32 -04:00
Behdad Esfahbod
decf6ffca4
[Indic] Minor!
2012-07-20 13:51:31 -04:00
Behdad Esfahbod
9e4f94a72c
[Indic] Break syllables at Halant,ZWNJ
...
That's really what Uniscribe does, and explains a lot of peculiarities of
Halant,ZWNJ before the base.
Sent Telugu from 1% failures to 0.03%. Improved Kannada and Malayalam
slightly. Fixed half of Bengali, and did NOT break anything!
2012-07-20 13:48:03 -04:00
Behdad Esfahbod
2c372b80f6
[Indic] Better check for applying 'init'
...
Specifically, don't apply 'init' if previous char is a joiner.
Fixes some more of Bengali.
2012-07-20 13:37:48 -04:00
Behdad Esfahbod
34a7440b7c
[GPOS] Don't zero mark advances
...
Fixes more of Telugu, Kannada, and Oriya.
May break things (outside Indic...), but we cannot think of any font relying
on this immediately.
2012-07-20 12:40:39 -04:00
Behdad Esfahbod
8ed248de77
[Indic] Minor
2012-07-20 11:42:24 -04:00
Behdad Esfahbod
d0e68dbd0b
[Indic] Implement reph positioning step 5
...
Not tuned, just copied from step 2. Fixes another 0.5% of Kannada
failures. 1% to go.
2012-07-20 11:25:41 -04:00
Behdad Esfahbod
a9e45c32e4
[Indic] Don't let ZWNJ at the end of syllable affect base search
...
Fixes a few Devanagari, half of remaining Kannada failures, quarter for
Telugu, and others slightly improved or unchanged.
2012-07-20 11:04:15 -04:00
Behdad Esfahbod
20b68e699f
[Indic] Apply 'cjct' globally
...
Fixes 5 Devanagari failures, and no regressions.
2012-07-20 10:47:46 -04:00
Behdad Esfahbod
51e764de44
[Indic] Unbreak old scriptures
...
Brings down failures with Lohit-Telugu from 57% to 1.40%.
2012-07-20 10:30:24 -04:00
Behdad Esfahbod
900cf3d449
Minor
2012-07-20 10:18:23 -04:00
Behdad Esfahbod
87cd63266e
[Indic] Recategorize some Kannada right matras
...
Kannada failures down from 3.5% to 2.93%.
2012-07-19 21:25:46 -04:00
Behdad Esfahbod
3604d64ced
[Indic] Recategorize GURMUKHI ADDAK
...
It's not in IndicSyllabicCategory.txt. Fixes most of Gurmukhi failures.
Failures down from 7.7% to 0.222%!
2012-07-19 21:13:04 -04:00
Behdad Esfahbod
8932858123
Minor
2012-07-19 21:02:38 -04:00
Behdad Esfahbod
47ef931f13
[buffer] Make sure out_info = info during GPOS
2012-07-19 20:52:44 -04:00
Behdad Esfahbod
ae63cf2062
Print line number during return when tracing
2012-07-19 20:45:41 -04:00
Behdad Esfahbod
5249f3aee1
[Indic] Unbreak Khmer
...
For Khmer, all consonants are subjoining. No need to look in the font.
We were looking in the wrong order anyway.
2012-07-19 20:30:22 -04:00
Behdad Esfahbod
e0475345d5
[Indic] Apply 'akhn' globally
...
Fixes 1.5% more failures for Telugu, 2% for Kannada.
Breaks one test in Devanagari.
2012-07-19 20:24:14 -04:00
Behdad Esfahbod
fa247ebe52
[Indic] Better position U+0CD5
...
Fixes another 5% of Kannada failures.
2012-07-19 19:52:19 -04:00
Behdad Esfahbod
f055442716
[Indic] Lookup consonant position in the font
...
Fixes most failures of Oriya, and improves others a bit.
2012-07-19 16:20:21 -04:00
Behdad Esfahbod
74d1d88781
[GSUB] Fix would_apply() for LigatureSubst
2012-07-19 16:14:23 -04:00
Behdad Esfahbod
be73a5f936
Add src/test-would-substitute tool
2012-07-19 15:12:18 -04:00
Behdad Esfahbod
e72b360ac6
Refactor / finish would_apply() operation
...
Untested.
2012-07-19 14:44:46 -04:00
Behdad Esfahbod
8c973ebf0f
[Indic] Implement per-script matra positioning
...
Following what the spec says.
Brings down Telugu failures from 40% to 3.75%, and Kannada failures from
44% to 10%. Does NOT affect other scripts' test results.
2012-07-19 13:25:08 -04:00
Behdad Esfahbod
8bb32458f9
[Indic] More refactoring
2012-07-19 13:04:44 -04:00
Behdad Esfahbod
9ccc6382ba
[Indic] Minor refactoring
2012-07-19 12:45:31 -04:00
Behdad Esfahbod
f83aaa3133
[Indic] Minor
2012-07-19 12:23:23 -04:00
Behdad Esfahbod
be8b9f5f71
[Indic] Start refactoring different matra positions per script
2012-07-19 12:11:12 -04:00
Behdad Esfahbod
b01d9b3d90
[Indic] Disallow decomposition of a couple characters
...
This is a hack for now. Will be fixed when we do complex-shaper-driven
normalization properly.
The results with or without decomposition are the same, but Uniscribe
does not normalize, so this matches better.
2012-07-19 11:25:49 -04:00
Behdad Esfahbod
422ecd2d3c
[Indic] Accept a forced Rakar sequence at the end of syllable
...
In Sinhala, Rakar is formed by Al-Lakuna,ZWJ,Ra. If you put that at the
end of a Consonant,Matra syllable, you get a dotted-circle from
Uniscribe. Apparently adding a ZWJ before the Al-Lakuna "fixes" that.
And people have been encoding that sequence... So, allow a forced
"ZWJ,Virama,ZWJ,Ra" sequence at the end of syllables.
Fixes some 100 or more of Sinhala failures. Now at 622 only (0.23%).
2012-07-18 23:25:58 -04:00
Behdad Esfahbod
6fc1732003
[Indic] Allow joiners on both sides of Halant at the same time
...
The sequence <ZWJ,Al-Lakuna,ZWJ> is used in Sinhala to explicitly ask
for Rakar. Fixes two-thousand Sinhala tests. Not many left.
2012-07-18 17:49:19 -04:00
Behdad Esfahbod
10cdc94eee
[Indic] In final reordering, find base, even if it disappeared
...
POS_BASE can disappear if base ligated backward. Define base as last
with position not after base.
Fixes a few hundred of Sinhala failures with Iskoola Pota.
2012-07-18 17:43:23 -04:00
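The fallback rule in the entry above, "base is the last item whose position is not after base", could be sketched like this. The three-value `Position` enum is a simplification invented for the example (only `POS_BASE` appears in the commit message), and `find_base` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified position categories, ordered so that comparison works.
enum Position { POS_PRE = 0, POS_BASE = 1, POS_POST = 2 };

// Index of the last item positioned at or before base. Falls back to
// index 0 if every item is positioned after base.
static size_t find_base(const std::vector<Position> &pos)
{
  size_t base = 0;
  for (size_t i = 0; i < pos.size(); i++)
    if (pos[i] <= POS_BASE)
      base = i;
  return base;
}
```

If the base ligated backward and no item carries `POS_BASE` any more, as in `{POS_PRE, POS_PRE, POS_POST}`, the sketch still picks index 1.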
Behdad Esfahbod
9c4d24a3a6
[Indic] Minor
2012-07-18 17:29:10 -04:00
Behdad Esfahbod
3285e107c9
[Indic] Implement Sinhala "Al Lakuna" Reph behavior
...
In Sinhala, Reph is formed only explicitly, by the presence of a ZWJ.
2012-07-18 17:22:14 -04:00
Behdad Esfahbod
91cade7555
[Indic/Unicode] Decompose Sinhala split matras the way Uniscribe likes
...
Makes no visual difference.
Fixes most of the failures. Down from 15% to 1.3%!
2012-07-18 16:50:41 -04:00
Behdad Esfahbod
d8942dcbb4
Apply Tibetan (global) features.
...
Fixes all Tibetan failures. All 180k of them!
Merges back Hangul into the default shaper.
2012-07-18 16:34:10 -04:00
Behdad Esfahbod
552d19b7a1
[Indic] Treat Register Shifters like Nukta
...
Really this time.
Fixes another 18 Khmer tests.
2012-07-18 16:02:33 -04:00
Behdad Esfahbod
e8cd81f76d
[Indic] Minor
2012-07-18 16:00:20 -04:00
Behdad Esfahbod
69f26bf39c
[Indic] Fix Matra reordering when base is at end of syllable
...
For example: U+915,U+200c,U+93f
Fixes last Tamil failure!
2012-07-18 15:47:51 -04:00
Behdad Esfahbod
d16ccc4ae7
Leave one extra item at the end of buffer allocation
...
Just in case, for the times we do out-of-bounds access.
jk
2012-07-18 15:43:55 -04:00
Behdad Esfahbod
075d671f10
[Indic] Fix out-of-bounds array access
2012-07-18 15:41:53 -04:00
Behdad Esfahbod
dcb527242b
[Indic] Allow joiners before matras
...
Fixes 1 more Devanagari test!
2012-07-18 15:32:26 -04:00
Behdad Esfahbod
391cc03317
[Indic] Allow halant group in Vowel and placeholder syllables
...
Fixes 2 out of 560 Devanagari failures. AND:
Fixes 1 out of 2 Tamil failures.
2012-07-18 15:12:49 -04:00
Behdad Esfahbod
ca4e3d3eab
[Indic] Streamline halant/joiner in grammar
2012-07-18 15:05:40 -04:00
Behdad Esfahbod
418d00dffd
[Indic] Minor
2012-07-18 14:57:28 -04:00
Behdad Esfahbod
4c3691d2a3
[Indic] Hopefully minor!
...
Refactoring Indic machine. No semantic change.
2012-07-18 14:23:55 -04:00
Behdad Esfahbod
e092c556fb
[Indic] Minor
2012-07-18 14:09:25 -04:00
Behdad Esfahbod
14dbdd9e39
[Indic] Unbreak Tamil
...
Tamil has only about 150 failures now!
2012-07-18 13:13:03 -04:00
Behdad Esfahbod
db8981f1e0
[Indic] Position Khmer Robat
...
It's a visual Repha.
Still not positioning logical Repha as occurs in Malayalam.
Another 200 Khmer failures fixed. 547 to go. That's better than
Devanagari!
2012-07-17 23:42:04 -04:00
Behdad Esfahbod
25bc489498
[Indic] Better categorize Register Shifters and Khmer Various signs
...
Down another 500 or so Khmer failures!
2012-07-17 17:53:03 -04:00
Behdad Esfahbod
39b17837b4
Add hb_buffer_normalize_glyphs() and hb-shape --normalize-glyphs
...
This reorders glyphs within the cluster to a nominal order. This should
have no visible effect on the output, but helps with testing, for
getting the same hb-shape output for visually-equal glyphs for each
cluster.
2012-07-17 17:09:29 -04:00
Behdad Esfahbod
25e302da9a
[Indic] Minor
2012-07-17 14:25:14 -04:00
Behdad Esfahbod
5d32690a34
[Indic] For scripts without Half forms, always choose first consonant as base
...
In such scripts (ie. Khmer), a ZWJ/ZWNJ shouldn't stop the search for
base. So, instead just choose the first consonant as base directly.
Test sequence:
U+1798,200c,U+17C9,U+17D2,U+179B,U+17C1,U+17C7
2012-07-17 14:23:28 -04:00
Behdad Esfahbod
34b5714906
[Indic] Treat Khmer Register Shifters more like Nuktas
...
Except that there may be a ZWNJ before a Register Shifter.
2012-07-17 14:09:32 -04:00
Behdad Esfahbod
11e2a601b1
[Indic] Minor
2012-07-17 14:02:28 -04:00
Behdad Esfahbod
0201e0a464
[Indic] Apply 'cfar' for Khmer
...
Mark stuff after a pre-base reordering Ro 'cfar'. Used in Khmer.
This allows distinguishing the following cases with MS Khmer fonts:
U+1784,U+17D2,U+179A,U+17D2,U+1782
U+1784,U+17D2,U+1782,U+17D2,U+179A
2012-07-17 13:56:24 -04:00
Behdad Esfahbod
55f70ebfb9
[Indic] Position final subjoined consonants (and vowels) after matras
...
In Khmer, a final subjoined consonant or independent vowel can occur
after matras. This final subjoined thing should NOT be reordered to
before the matra even though it's subjoined.
Fixes another 1k of the Khmer failures. Not much left really.
2012-07-17 12:50:13 -04:00
Behdad Esfahbod
c50ed71e9a
[Indic] Recategorize Khmer coeng sign as a separate category OT_Coeng
...
Amend the syllable structure to allow a final subscripted consonant
(Coeng+C) and a final subscripted independent vowel (Coeng+V).
Fixes another 2k of Khmer failures.
2012-07-17 11:54:28 -04:00
Behdad Esfahbod
deb521dee4
[Indic] Add a separate Coeng class
...
No characters recategorized yet. No semantic change.
2012-07-17 11:37:32 -04:00
Behdad Esfahbod
74ccc6a132
[Indic] Move Halant with after-base consonants
...
Normally, we attach the Halant to the previous character and move it
with it. For after-base consonants however, the Halant "belongs" to the
consonant after, so attach it so.
This fixes Bengali sequences involving post-base consonant Ya, which
should ligate with the Halant to form Ya Phala, but previously a
reordered matra was blocking the ligation.
2012-07-17 11:16:19 -04:00
Behdad Esfahbod
d5c4edcdd6
[Indic] Apply presentation-forms features all at once
...
Seems like this is what Uniscribe is doing, and does not break any fonts
we tested (with Devanagari, Malayalam, Khmer, and Bengali), while fixing
some Ra Phala sequences for Bengali with Vrinda. Fixes another 2% of
Bengali failures (a couple more to go).
2012-07-17 10:40:59 -04:00
Behdad Esfahbod
559f706678
Fix MarkAttachmentType matching
...
Fixes issue reported by Khaled Hosny with his Hussaini Nastaleeq font
and sequences like those added in the previous commit.
2012-07-16 22:46:52 -04:00
Behdad Esfahbod
ad4494759f
Minor
2012-07-16 22:40:21 -04:00
Behdad Esfahbod
af92b4cc90
[Indic] Disable 'kern' in Uniscribe bug compatibility mode
...
Uniscribe does not apply 'kern' in the Indic module. Some of the Khmer
fonts they ship have small adjustments in the 'kern' table. Disable
'kern' in the Indic module under Uniscribe bug compatibility mode.
Fixes some 10% of the Khmer failures. Remains under 3% (excluding
dotted-circle ones).
2012-07-16 20:31:24 -04:00
Behdad Esfahbod
d96838ef95
Allow complex shapers overriding common features
...
In a new callback... Currently unused by all complex shapers.
2012-07-16 20:26:57 -04:00
Behdad Esfahbod
df50b84740
[Indic] Categorize other Khmer marks
...
Mark them the same as the Register Shifters for now. Need to rename
that category to something more sensible after all is settled.
Fixes another percent of Khmer failures. Down to under 3%!
2012-07-16 20:14:50 -04:00
Behdad Esfahbod
8e7b5882fb
[Indic] Recognize pre-base reordering Ra anywhere in the syllable
...
We were doing that only immediately after base.
Fixes another percent in the Khmer failures. About three more to go...
2012-07-16 17:04:46 -04:00
Behdad Esfahbod
7d09c98a1f
[Indic] Recognize Register Shifter marks
...
Fixes another 6% of the Khmer failures.
2012-07-16 16:45:22 -04:00
Behdad Esfahbod
60da763dfa
[GSUB/GDEF] Guess glyph classes after substitution only if no GDEF
...
Brings down Khmer failures with Daun Penh font from 36% to 20%.
2012-07-16 16:14:40 -04:00
Behdad Esfahbod
fcdc5f1c88
[Indic] Categorize Khmer Ro
...
Khmer failures down from 58% to 36%.
2012-07-16 15:52:54 -04:00
Behdad Esfahbod
78818124b1
[Indic] Reorder pre-base reordering Ra
...
Brings down Malayalam failures from 14% down to 3%.
2012-07-16 15:49:08 -04:00
Behdad Esfahbod
1a1dbe9a27
[Indic] Rename
2012-07-16 15:41:33 -04:00
Behdad Esfahbod
46e645ec4b
[Indic] Start implementing pre-base reordering
2012-07-16 15:30:05 -04:00
Behdad Esfahbod
921ce5b17d
[Indic] Rename
...
No semantic change.
2012-07-16 15:26:56 -04:00
Behdad Esfahbod
b504e060f0
[Indic] Implement After-Main Reph positioning
...
Almost...
2012-07-16 15:21:12 -04:00
Behdad Esfahbod
17d7de91d7
[Indic] Apply 'pref' to pre-base reordering Ra
...
No reordering yet.
2012-07-16 15:20:15 -04:00
Behdad Esfahbod
362d3db8d3
[Indic] Minor
...
Should not be any semantic change. In preparation for implementing
pre-base reordering Ra.
2012-07-16 15:15:28 -04:00
Behdad Esfahbod
70fe77bb9a
Minor
2012-07-16 14:52:18 -04:00
Behdad Esfahbod
2f903215c5
Minor
2012-07-16 13:54:43 -04:00
Behdad Esfahbod
a3e04bee2c
[Indic] Reorder virama only for old Indic spec
2012-07-16 13:47:19 -04:00
Behdad Esfahbod
0de771b72d
[Indic] Categorize Khmer consonants
2012-07-16 13:39:36 -04:00
Behdad Esfahbod
d487fff266
Split matras without a Unicode decomposition
...
This is a hack for now, to get us going with Khmer. This will be
refactored properly later to move the complex logic into complex
shapers.
2012-07-16 13:25:57 -04:00
Behdad Esfahbod
8aa801a6fd
[Indic] Adjust position for split matras
...
We are going to split matras without a Unicode decomposition in a way
that the second half takes the codepoint of the whole matra. So,
position them where the second half is supposed to end up.
2012-07-16 13:24:26 -04:00
Behdad Esfahbod
1feb8345a5
[GSUB] Allow 1-to-1 ligature substitutions!
...
Apparently Uniscribe allows these, and they are used in some Khmer fonts
shipped with Windows, namely, Daun Penh.
2012-07-16 13:23:40 -04:00
Behdad Esfahbod
29f106d7fb
[Indic] Apply Above Forms
2012-07-16 12:05:35 -04:00
Behdad Esfahbod
fa2bd9fb63
Further simplify atomic ops on Visual Studio
2012-07-14 12:15:54 -04:00
Behdad Esfahbod
0a49235701
Minor
2012-07-13 13:20:49 -04:00
Behdad Esfahbod
11c4ad439e
Add -Wcast-align
2012-07-13 11:29:31 -04:00
Behdad Esfahbod
a98d0ab186
Make sure HB_BEGIN_DECLS / HB_END_DECLS is only used in public headers
...
So we can use them to switch default visibility to internal if desired,
and use these to make only declared symbols public.
2012-07-13 10:19:10 -04:00
Behdad Esfahbod
5c5bc96216
Allow overriding HB_BEGIN_DECLS / HB_END_DECLS
2012-07-13 10:15:37 -04:00
Behdad Esfahbod
50a4e78b53
Check for exported weak symbols
...
Ouch, all our C++ inline functions are being exported (weakly) already.
Fix coming.
2012-07-13 09:48:39 -04:00
Behdad Esfahbod
b5aeb95afe
Make hb_in_range() static
2012-07-13 09:45:54 -04:00
Behdad Esfahbod
271c8f8907
Minor
2012-07-13 09:32:30 -04:00
Behdad Esfahbod
391f1ff5d8
Fix _InterlockedCompareExchangePointer on x86
2012-07-13 09:04:07 -04:00
Behdad Esfahbod
2023e2b54d
[ft] Disable ppem setting
...
The calculations were wrong.
FreeType makes it really hard to set size and ppem independently.
For now, disable it. Need to come up with a fix later.
2012-07-11 19:01:26 -04:00
Behdad Esfahbod
cdf7444505
[ft] Use unfitted kerning if x_ppem is zero
2012-07-11 18:52:39 -04:00
Behdad Esfahbod
6d08c7f1b3
Revert "Towards templatizing common Lookup types"
...
This reverts commit 727135f3a9.
This is work-in-progress. Didn't mean to push it out just yet.
2012-07-11 18:01:27 -04:00
Behdad Esfahbod
552bf3a9f9
Bump WINNT version requested from 500 to 600
...
Since we use the OpenType versions of Uniscribe functions, we are
relying on that version of the WINNT API. Otherwise, usp10.h will hide
those symbols.
2012-07-11 18:00:28 -04:00
Behdad Esfahbod
9a5b421a64
Fix build with no Unicode funcs implementations provided
2012-07-11 18:00:28 -04:00
Behdad Esfahbod
727135f3a9
Towards templatizing common Lookup types
2012-07-11 18:00:28 -04:00
Behdad Esfahbod
12f5c0a222
Fix check for Intel atomic ops
2012-06-26 11:16:13 -04:00
Behdad Esfahbod
6932a41fb6
Use octal-escaped UTF-8 characters instead of plain text
...
https://bugs.freedesktop.org/show_bug.cgi?id=50970
2012-06-26 10:46:31 -04:00
Behdad Esfahbod
8c0ea7bcb4
Disable introspection again
...
Until I figure out the build issues. Sigh...
2012-06-24 13:20:56 -04:00
Behdad Esfahbod
49f8e0cd9a
GStaticMutex is deprecated
2012-06-16 15:40:03 -04:00
Behdad Esfahbod
1bc1cb3603
Make source more digestible for gobject-introspection
2012-06-16 15:21:55 -04:00
Behdad Esfahbod
84d781e54c
Flesh out gobject-introspection stuff a bit
2012-06-16 15:21:41 -04:00
Behdad Esfahbod
2cf301968c
Add hb_object_lock/unlock()
2012-06-09 14:58:01 -04:00
Behdad Esfahbod
f211d5c291
More Oops! Fix fast-path with sub-type==0
2012-06-09 03:11:22 -04:00
Behdad Esfahbod
b1de6aa1f3
Oops!
2012-06-09 03:07:59 -04:00
Behdad Esfahbod
b12e2549cb
Minor
2012-06-09 03:05:20 -04:00
Behdad Esfahbod
faf0f20253
Add sanitize() logic for fast-paths
2012-06-09 03:02:36 -04:00
Behdad Esfahbod
4e766ff28d
Add fast-path for GPOS too
...
Shaves another 3% for DejaVu Sans long Latin strings.
2012-06-09 02:53:57 -04:00
Behdad Esfahbod
993c51915f
Add fast-path to GSUB to check coverage
...
Shaves a good 10% off DejaVu Sans with simple Latin text for me.
Now, DejaVu is very ChainContext-intensive, but it's also a very
popular font!
2012-06-09 02:48:16 -04:00
Behdad Esfahbod
f19e0b0099
Match input before backtrack
...
Makes more sense, optimization-wise.
2012-06-09 02:26:57 -04:00
Behdad Esfahbod
67bb9e8cea
Add set add_coverage() to Coverage()
2012-06-09 02:02:46 -04:00
Behdad Esfahbod
4952f0aa5b
Minor
2012-06-09 01:43:20 -04:00
Behdad Esfahbod
ad6a6f2240
Minor
2012-06-09 01:43:20 -04:00
Behdad Esfahbod
46617a4213
Fix cache implementation
2012-06-09 01:43:20 -04:00
Behdad Esfahbod
ce47613889
Micro-optimize
...
I know...
2012-06-09 01:43:15 -04:00
Behdad Esfahbod
70416de298
Minor
2012-06-09 00:56:41 -04:00
Behdad Esfahbod
99159e52a3
Use linear search for small counts
...
I see about 8% speedup with long strings with DejaVu Sans.
2012-06-09 00:50:40 -04:00
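The idea in the entry above, scanning short sorted arrays linearly and falling back to binary search for longer ones, could look roughly like this. The cutoff of 8 and the names `find_sorted` and `kLinearThreshold` are assumptions for illustration; the commit does not state the threshold it uses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

static const size_t kLinearThreshold = 8; // assumed cutoff, not from the commit

// Returns the index of key in the sorted array a, or -1 if absent.
static long find_sorted(const std::vector<unsigned> &a, unsigned key)
{
  assert(std::is_sorted(a.begin(), a.end()));
  if (a.size() <= kLinearThreshold) {
    // Short array: a plain scan avoids the branching of binary search.
    for (size_t i = 0; i < a.size(); i++) {
      if (a[i] == key) return (long) i;
      if (a[i] > key) break; // sorted, so key cannot appear later
    }
    return -1;
  }
  // Longer array: binary search.
  auto it = std::lower_bound(a.begin(), a.end(), key);
  return (it != a.end() && *it == key) ? (long) (it - a.begin()) : -1;
}
```

Whether the linear path is actually faster depends on the data and hardware, so any threshold would need measuring.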
Behdad Esfahbod
caf0412690
Minor
2012-06-09 00:26:32 -04:00
Behdad Esfahbod
0f8fea71a6
Minor. Hide _hb_ot_layout_get_glyph_property()
2012-06-09 00:24:38 -04:00
Behdad Esfahbod
44b8ee0c90
Minor
2012-06-09 00:23:24 -04:00
Behdad Esfahbod
7b84c536c1
In MarkBase attachment, only attach to first of a MultipleSubst sequence
...
This is apparently what Uniscribe does. Test case is:
SEEN FATHA TEH ALEF
with Arabic Typesetting. Originally reported by Khaled Hosny.
2012-06-08 22:04:23 -04:00
Behdad Esfahbod
ec57e0c565
Set lig_comp for MultipleSubst components
...
To be used for correct mark attachment to first component of a
MultipleSubst output. That's what Uniscribe does.
2012-06-08 21:47:23 -04:00
Behdad Esfahbod
e085fcf7ca
Remove unused buffer->replace_glyphs_be16
2012-06-08 21:45:00 -04:00
Behdad Esfahbod
3ec77d6ae0
Don't use replace_glyphs_be for MultipleSubst
2012-06-08 21:44:06 -04:00
Behdad Esfahbod
4b7192125f
Minor
2012-06-08 21:41:46 -04:00
Behdad Esfahbod
4508789f4b
Add test for static initializers and other C++ stuff
2012-06-08 21:32:43 -04:00
Behdad Esfahbod
56bd259b9a
Minor
2012-06-08 21:29:18 -04:00
Behdad Esfahbod
bc8357ea7b
Merge clusters during normalization
2012-06-08 21:01:20 -04:00
Behdad Esfahbod
fe3dabc08d
Minor
2012-06-08 20:56:05 -04:00
Behdad Esfahbod
e88e14421a
Use merge_clusters instead of open-coding
2012-06-08 20:55:21 -04:00
Behdad Esfahbod
330a2af3ff
Use merge_clusters when forming Unicode clusters
2012-06-08 20:40:02 -04:00
Behdad Esfahbod
bd300df9ad
Minor
2012-06-08 20:36:37 -04:00
Behdad Esfahbod
e51d2b6ed1
Extend into main buffer if extension hit end of out-buffer merging clusters
2012-06-08 20:36:33 -04:00
Behdad Esfahbod
5ced012d9f
Extend end when merging clusters in out-buffer
2012-06-08 20:31:32 -04:00
Behdad Esfahbod
72c0a18783
Extend clusters backward in out-buffer
2012-06-08 20:30:03 -04:00
Behdad Esfahbod
cd5891493d
Extend clusters backwards, into the out-buffer too
2012-06-08 20:28:59 -04:00
Behdad Esfahbod
77471e0371
Clear output buffer before calling GSUB pause functions
2012-06-08 20:21:02 -04:00
Behdad Esfahbod
cafa6f3727
When merging clusters, extend the end
2012-06-08 20:17:10 -04:00
Behdad Esfahbod
28ce5fa454
Merge clusters when ligating
2012-06-08 20:17:06 -04:00
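The cluster-merging commits above can be pictured with the following sketch. It assumes cluster values are stored in a plain vector and ignores the separate out-buffer that several of the commits deal with; `merge_clusters` here is an illustrative stand-in, not the buffer method itself.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Merge the cluster values of glyphs in [start, end) into one cluster.
// Glyphs just outside the range that share a cluster value with the edge
// glyphs are absorbed too, so no existing cluster is split in half.
static void merge_clusters(std::vector<unsigned> &clusters,
                           std::size_t start, std::size_t end)
{
  if (end > clusters.size()) end = clusters.size();
  if (end < start + 2) return;  // nothing to merge

  // The merged cluster takes the smallest value in the range.
  unsigned cluster = *std::min_element(clusters.begin() + start,
                                       clusters.begin() + end);

  // Extend the end forward over glyphs in the same cluster as the last one.
  while (end < clusters.size() && clusters[end - 1] == clusters[end])
    end++;

  // Extend the start backward over glyphs in the same cluster as the first.
  while (start > 0 && clusters[start - 1] == clusters[start])
    start--;

  for (std::size_t i = start; i < end; i++)
    clusters[i] = cluster;
}
```

Merging positions 2..3 of `{0, 1, 1, 2, 2, 3}` therefore rewrites positions 1..4, because the range cuts through clusters 1 and 2.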
Behdad Esfahbod
2bb1761ccb
Minor, use next_glyph()
2012-06-08 19:29:44 -04:00
Behdad Esfahbod
5f68f8675e
Minor
2012-06-08 19:23:43 -04:00
Behdad Esfahbod
8729691267
Increase Uniscribe MAX_ITEMS
2012-06-08 14:39:31 -04:00
Behdad Esfahbod
dbffa4c83d
Fix Uniscribe charset matching
...
Previously was failing to match fonts that didn't support CHARSET_ANSI.
There still remains a problem with the Uniscribe backend, in that if a
font with the same family name is installed, and is newer, the native
one is preferred over the font we provide. Fixing it requires rewriting
the name table with a unique family name...
2012-06-08 14:39:31 -04:00
Behdad Esfahbod
82e8bd8628
Remove unused code
2012-06-08 14:39:31 -04:00
Behdad Esfahbod
6da9dbff21
Remove zero-width chars in the fallback shaper too
2012-06-08 10:53:35 -04:00
Behdad Esfahbod
68b76121f8
Fix regressions introduced by sed. Ouch!
...
Introduced in 99c2695759.
Broken mark-mark and mark-ligature stuff.
2012-06-08 10:47:00 -04:00
Behdad Esfahbod
0dd86f9f68
Whitespace
2012-06-08 10:23:03 -04:00
Behdad Esfahbod
8e7beba7c3
Fix Uniscribe clusters with direction-overridden Arabic
2012-06-08 10:22:06 -04:00
Behdad Esfahbod
b069c3c31b
Really fix override-direction in Uniscribe
2012-06-08 10:10:29 -04:00
Behdad Esfahbod
fcd6f53261
Unbreak Uniscribe
...
Oops. hb_tag_t and OPENTYPE_TAG have different endianness. Perhaps
something to add API for in hb-uniscribe.h
2012-06-08 09:59:43 -04:00
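To illustrate the endianness mismatch in "Unbreak Uniscribe": a four-character tag packed with the first character in the most significant byte (the layout HarfBuzz's `HB_TAG` macro produces) has to be byte-swapped before being handed to an API that expects the opposite order. The helper names below are hypothetical, and the sketch assumes the other side stores the same four bytes reversed, as the commit message implies.

```cpp
#include <cstdint>

// Pack four characters with the first one in the most significant byte.
static std::uint32_t tag_first_char_high(char a, char b, char c, char d)
{
  return ((std::uint32_t) (unsigned char) a << 24) |
         ((std::uint32_t) (unsigned char) b << 16) |
         ((std::uint32_t) (unsigned char) c << 8) |
         (std::uint32_t) (unsigned char) d;
}

// Reverse the byte order of a 32-bit tag.
static std::uint32_t swap_tag_bytes(std::uint32_t tag)
{
  return ((tag & 0x000000FFu) << 24) |
         ((tag & 0x0000FF00u) << 8) |
         ((tag & 0x00FF0000u) >> 8) |
         ((tag & 0xFF000000u) >> 24);
}
```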
Behdad Esfahbod
29eac8f591
Override direction in Uniscribe backend
...
Matches OT backend now.
2012-06-08 09:26:17 -04:00
Behdad Esfahbod
1c1233e576
Make Uniscribe backend respect selected script
2012-06-08 09:20:53 -04:00
Behdad Esfahbod
0bb0f5d419
Add note re _NullPool
2012-06-07 17:42:48 -04:00
Behdad Esfahbod
2a3d911fe0
Fix alignment-requirement mismatch
...
Detected by clang and lots of cmdline options.
2012-06-07 17:31:46 -04:00
Behdad Esfahbod
6095de1635
Fix clang warning with NO_MT path
2012-06-07 15:48:18 -04:00
Behdad Esfahbod
a18280a8ce
Fix warnings produced by clang analyzer
2012-06-07 15:44:12 -04:00
Behdad Esfahbod
73cb02de2d
Minor
2012-06-06 11:29:25 -04:00
Behdad Esfahbod
79e2b4791f
Fix ASSERT_POD on clang
...
As reported by bashi. Not tested.
2012-06-06 11:27:17 -04:00
Behdad Esfahbod
6220e5fc0d
Add ASSERT_POD for most objects
2012-06-06 03:30:09 -04:00
Behdad Esfahbod
a00a63b5ef
Add macros to check that types are POD
2012-06-06 03:07:01 -04:00
Behdad Esfahbod
61eb60c129
Don't link to libstdc++
...
New try.
2012-06-05 21:22:36 -04:00
Behdad Esfahbod
81a4b9fd4e
Remove unused hb_static_mutex_t
2012-06-05 20:53:00 -04:00
Behdad Esfahbod
4a3a9897b3
Disable Intel atomic ops on mingw32
...
Apparently the configure test is not enough...
2012-06-05 20:39:07 -04:00
Behdad Esfahbod
0594a24484
Cleanup TRUE/FALSE vs true/false
2012-06-05 20:35:40 -04:00
Behdad Esfahbod
e1ac38f8dd
Fix inert buffer set_length() with zero
...
Oops!
2012-06-05 20:31:49 -04:00
Behdad Esfahbod
04bc1eebe7
Add configure tests for Intel atomic intrinsics
2012-06-05 20:16:56 -04:00
Behdad Esfahbod
f64b2ebf82
Remove last static initializer
...
We're free! Lazy or immediate...
2012-06-05 20:15:27 -04:00
Behdad Esfahbod
04aed572f1
Make hb-ft static-initializer free
2012-06-05 18:45:36 -04:00
Behdad Esfahbod
be4560a3b5
Undo default unicode-funcs to avoid static initializer again
2012-06-05 18:43:57 -04:00
Behdad Esfahbod
093171ccec
Implement lock-free hb_language_t
...
Another static-initialization down. One more to go.
2012-06-05 18:00:45 -04:00
Behdad Esfahbod
6843ce01be
Add atomic-pointer functions
...
Going to use these for lock-free linked-lists, to be used for
hb_language_t among other things.
2012-06-05 17:27:20 -04:00
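A sketch of the kind of lock-free, prepend-only linked list the "Add atomic-pointer functions" commit has in mind for interning language tags. It uses C++11 `std::atomic` rather than the project's own atomic-pointer functions, whose signatures are not shown in this log; `lang_item_t` and `intern_language` are hypothetical names.

```cpp
#include <atomic>
#include <string>

// One interned language tag; items are only ever prepended, never removed.
struct lang_item_t {
  lang_item_t *next;
  std::string tag;
};

static std::atomic<lang_item_t *> lang_list(nullptr);

// Returns the unique item for `tag`, creating it if needed.
static const lang_item_t *intern_language(const char *tag)
{
  for (;;) {
    lang_item_t *head = lang_list.load();

    // Look for an existing entry first.
    for (lang_item_t *it = head; it; it = it->next)
      if (it->tag == tag)
        return it;

    // Not found: try to publish a new head with compare-and-swap.
    lang_item_t *item = new lang_item_t{head, tag};
    if (lang_list.compare_exchange_strong(head, item))
      return item;

    // Another thread changed the list in the meantime; discard and retry.
    delete item;
  }
}
```

Because items are never unlinked, readers can walk the list without locks, and no static mutex (and hence no static initializer) is required.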
Behdad Esfahbod
cdafe3a7d8
Add gcc intrinsics implementations for atomic and mutex
2012-06-05 16:40:23 -04:00
Behdad Esfahbod
d970d2899b
Add gcc implementation for atomic ops
2012-06-05 16:06:28 -04:00
Behdad Esfahbod
0e253e97af
Add a mutex to object header
...
Removes one more static-initialization. A few more to go.
2012-06-05 15:54:43 -04:00
Behdad Esfahbod
a2b471df82
Remove static initializers from indic
2012-06-05 15:17:44 -04:00
Behdad Esfahbod
f06ab8a426
Better hide nil objects and make them const
2012-06-05 14:49:14 -04:00
Behdad Esfahbod
bf93b636c4
Remove constructor from hb_prealloced_array_t
...
This was causing all object types to be non-POD and have static
initializers. We don't need that!
Now, most nil objects just moved from .bss to .data. Fixing for that
coming soon.
2012-06-05 14:17:32 -04:00
Behdad Esfahbod
f1971a2174
Fix warnings
2012-06-05 14:06:04 -04:00
Behdad Esfahbod
9fc7a11469
Remove comma at the end of enum
...
As reported by Jonathan Kew on the list.
2012-06-04 08:28:19 -04:00
Behdad Esfahbod
3b8fd9c48f
Remove const from ref_count.ref_count
...
According to Tom Hacohen this was breaking build with some compilers.
In file included from hb-buffer-private.hh:35:0,
from hb-ot-map-private.hh:32,
from hb-ot-shape-private.hh:32,
from hb-ot-shape.cc:29:
hb-object-private.hh: In constructor '_hb_object_header_t::_hb_object_header_t()':
hb-object-private.hh:97:8: error: uninitialized const member in 'struct hb_reference_count_t'
hb-object-private.hh:51:25: note: 'hb_reference_count_t::ref_count' should be initialized
In file included from hb-ot-shape.cc:33:0:
hb-set-private.hh: In constructor '_hb_set_t::_hb_set_t()':
hb-set-private.hh:37:8: note: synthesized method '_hb_object_header_t::_hb_object_header_t()' first required here
hb-ot-shape.cc: In function 'void hb_ot_shape_glyphs_closure(hb_font_t*, hb_buffer_t*, const hb_feature_t*, unsigned int, hb_set_t*)':
hb-ot-shape.cc:521:12: note: synthesized method '_hb_set_t::_hb_set_t()' first required here
2012-06-03 15:54:19 -04:00
Behdad Esfahbod
70600dbf62
Minor
2012-06-03 15:52:51 -04:00
Behdad Esfahbod
96a9ef0c9f
Remove tab character like other "zero-width" characters
...
Uniscribe does that; this makes comparing results to Uniscribe
easier.
2012-06-01 13:46:26 -04:00
Behdad Esfahbod
0558d55bac
Remove hb_atomic_int_set/get()
...
We never use them in fact...
I'm just adjusting these as I better understand the requirements of
the code and the guarantees of each operation.
2012-05-28 10:46:47 -04:00
Behdad Esfahbod
bce095524b
Add hb_font_get_glyph_name() and hb_font_get_glyph_from_name()
2012-05-28 10:45:50 -04:00
Behdad Esfahbod
bc145658bd
Warn if no Unicode functions implementation is found
2012-05-28 10:45:50 -04:00
Behdad Esfahbod
a3547330fa
Cleanup atomic ops on OS X
2012-05-27 10:20:47 -04:00
Behdad Esfahbod
e4b6d503c5
Don't use atomic ops in hb_cache_t
...
We don't care about linearizability, so unprotected int read/write
are enough, no need for expensive memory barriers. It's a cache,
that's all.
2012-05-27 10:11:13 -04:00
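The reasoning in "Don't use atomic ops in hb_cache_t" can be illustrated with a direct-mapped int-to-int cache in which each slot packs the key's upper bits and the value into a single 32-bit word, so one plain read or write carries a whole entry. The bit widths and the name `int_cache_t` are assumptions for illustration. Note that in strict C++ terms unsynchronized concurrent access to a plain integer is still a data race; the sketch only demonstrates the single-threaded logic.

```cpp
#include <cstdint>
#include <cstring>

struct int_cache_t {
  // Hypothetical widths: 21-bit keys, 256 slots, 19-bit values.
  // (KEY_BITS - CACHE_BITS) + VALUE_BITS must fit in 32 bits.
  enum { KEY_BITS = 21, CACHE_BITS = 8, VALUE_BITS = 19 };

  std::uint32_t slots[1u << CACHE_BITS];

  // Mark every slot empty (all bits set).
  void clear() { std::memset(slots, 0xFF, sizeof slots); }

  bool get(std::uint32_t key, std::uint32_t *value) const
  {
    std::uint32_t v = slots[key & ((1u << CACHE_BITS) - 1)];  // one word read
    // An all-ones word is treated as empty; a real entry that happens to
    // encode to all ones just looks like a miss, which is harmless.
    if (v == 0xFFFFFFFFu || (v >> VALUE_BITS) != (key >> CACHE_BITS))
      return false;
    *value = v & ((1u << VALUE_BITS) - 1);
    return true;
  }

  bool set(std::uint32_t key, std::uint32_t value)
  {
    if ((key >> KEY_BITS) || (value >> VALUE_BITS))
      return false;  // out of range, do not cache
    slots[key & ((1u << CACHE_BITS) - 1)] =
        ((key >> CACHE_BITS) << VALUE_BITS) | value;  // one word write
    return true;
  }
};
```

A stale or overwritten slot only causes a cache miss and a recomputation, which is why the commit argues that no memory barriers are needed.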
Behdad Esfahbod
819faa0530
Minor
2012-05-27 10:09:18 -04:00
Behdad Esfahbod
303d5850ec
Fix Windows atomic get/set
...
According to:
http://msdn.microsoft.com/en-us/library/65tt87y8.aspx
MemoryBarrier() is the right macro to protect these, not _ReadBarrier()
and/or _WriteBarrier().
2012-05-27 10:01:13 -04:00
Behdad Esfahbod
29ce446d31
Add set iterator
2012-05-25 14:17:54 -04:00
Behdad Esfahbod
62c3e111fc
Add set symmetric difference
2012-05-25 13:48:00 -04:00
Behdad Esfahbod
27aba594c9
Minor
2012-05-24 15:00:01 -04:00
Behdad Esfahbod
cde1c0114b
Fix hb_atomic_int_set() implementation for HB_NO_MT
...
As pointed out by Jonathan Kew.
2012-05-24 10:46:39 -04:00
Behdad Esfahbod
ed2f1363a3
Fix substitution glyph class propagation
...
The old code was doing nothing.
Still got to find an example font+string that makes this matter, but
need this for fixing synthetic GDEF anyway.
2012-05-22 22:12:22 -04:00
Behdad Esfahbod
20fdb0f41d
Add a lock-free cache type for int->int functions
...
To be used for cmap and advance caching if desired.
2012-05-17 22:04:45 -04:00
Behdad Esfahbod
bd908b4f10
Implement hb_atomic_int_set() for OS X
2012-05-17 22:02:08 -04:00
Behdad Esfahbod
022a05ae90
Minor
2012-05-17 21:53:24 -04:00
Behdad Esfahbod
22afd66a30
Add hb_atomic_int_set() again
2012-05-17 21:23:49 -04:00
Behdad Esfahbod
4aa7258cb1
Fix type conflicts on Windows without glib
2012-05-17 21:01:04 -04:00
Behdad Esfahbod
f039e79d54
Don't use min/max as function names
...
They can be macros on some systems. Eg. mingw32.
2012-05-17 20:55:12 -04:00
Behdad Esfahbod
34961e3198
Prefer native atomic/mutex ops to glib's
2012-05-17 20:50:38 -04:00
Behdad Esfahbod
ec3ba4b96f
Move atomic ops into their own header
2012-05-17 20:30:46 -04:00
Behdad Esfahbod
1d6846db9e
[Indic] Apply vatu feature after cjct
...
Testing with old Deva spec this reduces failures.
Test sequence: U+0915,U+094D,U+0930.
2012-05-13 18:09:29 +02:00
Behdad Esfahbod
617f4ac46f
Refactor
2012-05-13 16:48:03 +02:00
Behdad Esfahbod
5e4e21fce4
Revert "[Indic] Refactoring"
...
This reverts commit 0831061efb.
2012-05-13 16:46:08 +02:00
Behdad Esfahbod
3f18236a03
Fix more warnings
2012-05-13 16:20:10 +02:00
Behdad Esfahbod
9f377ed321
Fix more unused-var warnings
2012-05-13 16:13:44 +02:00
Behdad Esfahbod
d993e72331
Fix hb_face_set_index()
2012-05-13 16:04:36 +02:00
Behdad Esfahbod
93345edcbe
Fix warnings
2012-05-13 16:01:08 +02:00
Behdad Esfahbod
eace47b173
Minor
2012-05-13 15:54:43 +02:00
Behdad Esfahbod
99c2695759
Add accessors to buffer for current info, current pos, and prev info
2012-05-13 15:45:18 +02:00
Behdad Esfahbod
6736f3c5b0
Minor
2012-05-13 15:21:06 +02:00
Behdad Esfahbod
5df809b655
[GSUB/GPOS] Remove context_length
...
The spec doesn't say contextual matching should be done this way,
and AOTS doesn't do it either. It was inherited from old HarfBuzz.
Remove it.
2012-05-13 15:17:51 +02:00
Behdad Esfahbod
28b9d502bb
Minor
2012-05-13 15:04:00 +02:00
Behdad Esfahbod
737dded2e0
Fix compiler warnings
2012-05-12 15:40:11 +02:00
Behdad Esfahbod
7f852b644b
Fix compiler warnings
2012-05-11 23:10:31 +02:00
Behdad Esfahbod
f7e8dcfd4f
[Indic] Unbreak Devanagari
...
And this concludes the HarfBuzz Massala Hackfest.
I'd like to especially thank Jonathan Kew for doing all the decryption and
letting me get commit points.
2012-05-11 22:01:33 +02:00
Behdad Esfahbod
6a091df9b4
[Indic] Disambiguate sub vs post vs above matras
...
Bengali is at *just* above 5% now.
2012-05-11 21:42:27 +02:00
Behdad Esfahbod
9d0d319a4a
[Indic] Position Bengali Reph before matras
2012-05-11 21:36:32 +02:00
Behdad Esfahbod
f893672511
[Indic] Start categorizing Reph per script
2012-05-11 21:10:03 +02:00
Behdad Esfahbod
a913b024d8
[Indic] Apply 'init' feature for Bengali
...
Error down from 20% to 7%.
2012-05-11 20:59:26 +02:00
Behdad Esfahbod
eed903b164
[Indic] Refactor for the arrival of 'init' feature
...
Yep, on Bengali now!
2012-05-11 20:50:53 +02:00
Behdad Esfahbod
18c06e189b
[Indic] Add Uniscribe bug feature for dotted circle
...
For dotted-circle independent clusters, Uniscribe does no Reph shaping
for the exact sequence Ra+Halant+25CC. Which also is the only possible
sequence with 25CC at the end.
2012-05-11 20:02:14 +02:00
Behdad Esfahbod
0831061efb
[Indic] Refactoring
2012-05-11 19:07:58 +02:00
Behdad Esfahbod
7ea58db311
Minor
2012-05-11 18:58:57 +02:00
Behdad Esfahbod
9c09928989
[Indic] Allow multiple Consonants in Vowel/NBSP syllables
...
Uniscribe allows multiple Halant+Consonant after a Vowel.
Tests:
* U+0905,U+094D,U+092B,U+094D,U+0930,U+094D,U+0930
2012-05-11 18:46:35 +02:00
Behdad Esfahbod
8c0aa486f3
[Indic] Allow two Nuktas per consonant
...
Uniscribe allows up to two nuktas per consonant and one per matra. It does so
independent of whether the consonant already has a nukta in it. Tests:
* U+0916,U+093C,U+0941
* U+0959,U+093C,U+0941
* U+0916,U+093C,U+093C,U+0941
* U+0959,U+093C,U+093C,U+0941
* U+0916,U+093C,U+093C,U+093C,U+0941
* U+0959,U+093C,U+093C,U+093C,U+0941
* 915,93c,93c,,94d,U+0916,U+093C,U+093C,U+093e,93c,93c
2012-05-11 18:13:42 +02:00
Behdad Esfahbod
3399a06e70
[Indic] Fix U+0952 and similar classification to match Uniscribe
...
See comments.
2012-05-11 17:54:26 +02:00
Behdad Esfahbod
11aa3ef18d
[Indic] Treat U+0951..U+0954 all similar to U+0952
2012-05-11 17:30:48 +02:00
Behdad Esfahbod
5f131d3226
[GSUB/GPOS/Indic] Apply GSUB/GPOS within syllables only
...
This does not apply to the context matchings.
This regresses tests right now. And we are not sure whether this is
the right thing to do for GPOS. But we'll figure it out.
2012-05-11 17:29:40 +02:00
Behdad Esfahbod
8fd83aaf6e
[GSUB/GPOS] Fix wrong buffer access in backward skippy mask matching
2012-05-11 17:18:37 +02:00
Behdad Esfahbod
ff24d1081a
[Indic] Don't use syllable serial value 0
2012-05-11 17:07:08 +02:00
Behdad Esfahbod
892eb78782
[Indic] Implement Uniscribe Reph+Matra+Halant bug feature
2012-05-11 16:54:40 +02:00
Behdad Esfahbod
67ea29af49
[Indic] Add example of different Uniscribe behavior
2012-05-11 16:51:23 +02:00
Behdad Esfahbod
ebe29733d4
[Indic] Add runtime Uniscribe bug compatibility mode!
...
Enable by setting envvar:
HB_OT_INDIC_OPTIONS=uniscribe-bug-compatible
Plus, LeftMatra+Halant "feature".
2012-05-11 16:43:12 +02:00
Behdad Esfahbod
616e692e29
[Indic] Add #define UNISCRIBE_BUG_COMPATIBLE 1
2012-05-11 16:25:02 +02:00
Behdad Esfahbod
6782bdae3b
[Indic] Fix Left Matra + Halant reordering
...
As can be seen in: U+092B,U+093F,U+094D
2012-05-11 16:23:43 +02:00
Behdad Esfahbod
3c2ea9481b
Minor
2012-05-11 16:23:38 +02:00
Behdad Esfahbod
203d71069c
[GSUB/GPOS] Check all glyph masks when matching input
2012-05-11 16:01:44 +02:00
Behdad Esfahbod
668c6046c1
[Indic] Apply Reph mask to all POS_REPH glyphs
...
Needed for upcoming changes to GSUB/GPOS mask matching.
2012-05-11 15:34:13 +02:00
Behdad Esfahbod
4be46bade2
[Indic] Fix state machine to backtrack
2012-05-11 14:39:01 +02:00
Behdad Esfahbod
cee7187447
[Indic] Move syllable tracking from Indic to generic layer
...
This is to incorporate it into GSUB/GPOS processing.
2012-05-11 11:41:39 +02:00
Behdad Esfahbod
3bf27a9f0e
[Indic] Disable conjuncts when a ZWJ happens
...
Not that the code makes any difference since the presence of ZWJ itself
causes the ligature to fail to match anyway.
2012-05-11 11:17:23 +02:00
Behdad Esfahbod
c6d904d67d
[Indic] Fix bitops typo!
...
Another 1000 down!
2012-05-11 11:07:40 +02:00
Behdad Esfahbod
55fe2cf79b
Make APPLY debug output print current index and codepoint
...
Yay!
2012-05-11 03:56:33 +02:00
Behdad Esfahbod
7bd2b04fea
Minor
2012-05-11 03:40:58 +02:00
Behdad Esfahbod
cf26510dbb
Some more...
...
Done. I promise.
2012-05-11 03:35:08 +02:00
Behdad Esfahbod
9659523ca3
More beauty in debug output!
2012-05-11 03:33:36 +02:00
Behdad Esfahbod
cf26e88a5a
Finish off debug output beautification
2012-05-11 03:16:57 +02:00
Behdad Esfahbod
d7bba01a35
Only print class name in debug output if there's one available
2012-05-11 02:46:26 +02:00
Behdad Esfahbod
85f73fa8da
Only print out class name in tracing, if one is available
...
Makes debug output much more pleasant.
2012-05-11 02:40:42 +02:00
Behdad Esfahbod
98619ce4fa
Minor
2012-05-11 02:34:06 +02:00
Behdad Esfahbod
acea183e98
Add return annotation for APPLY
2012-05-11 02:33:11 +02:00
Behdad Esfahbod
5ccfe8e215
/Minor/
2012-05-11 02:19:41 +02:00
Behdad Esfahbod
0ab8c86217
Annotate SANITIZE return values
...
More to come, for APPLY, CLOSURE, etc.
2012-05-11 02:11:52 +02:00
Behdad Esfahbod
829e814ff3
Minor
2012-05-11 00:52:16 +02:00
Behdad Esfahbod
6eec6f406d
Code reshuffling
2012-05-11 00:50:38 +02:00
Behdad Esfahbod
1e08830b4f
Beautify debug output
2012-05-11 00:43:57 +02:00
Behdad Esfahbod
6f45538017
More massaging trace messaging
2012-05-10 23:24:43 +02:00
Behdad Esfahbod
b5fa37cb69
Minor
2012-05-10 23:09:48 +02:00
Behdad Esfahbod
208109703c
Better trace message support infrastructure
...
We have varargs in the trace interface now. To be used soon...
2012-05-10 23:06:58 +02:00
Behdad Esfahbod
02b2922fbf
[Indic] Towards better Reph positioning
...
Fixed for Deva cases with two full-form consonants. Failures **way** down.
Not much left to go :-).
2012-05-10 21:44:50 +02:00
Behdad Esfahbod
74e54cf446
[Indic] Add Ra back for scripts without Reph
...
We now check that the 'rphp' table exists before forming Reph, so
we don't need to comment out Ra for those scripts.
2012-05-10 21:22:58 +02:00
Behdad Esfahbod
2b70df5cc0
[Indic] Add note re Uniscribe clusters
2012-05-10 18:38:22 +02:00
Behdad Esfahbod
21d2803133
[Indic] Do clustering like Uniscribe does
...
Hindi Wikipedia failures down to 6639 (0.938381%)!
2012-05-10 18:34:34 +02:00
Behdad Esfahbod
8df5636968
[Indic] Reorder Reph to before the Halant after Matras
...
Uniscribe doesn't do it, but we want to do as it gives the Reph the
opportunity to interact with the Matras. Test with mangal for example.
Sequence: <0930,094d,0915,094b,094d>
In test suite already.
2012-05-10 15:41:04 +02:00
Behdad Esfahbod
daf3234bdc
[Indic] Don't clear the mask for Reph
...
This was removing the mandatory global 1 bit in the mask and hence
disabling GPOS for Reph!
2012-05-10 15:28:27 +02:00
Behdad Esfahbod
7708ee23cb
[Indic] Improve Left Matra repositioning
...
Move its dependents too.
2012-05-10 14:48:25 +02:00
Behdad Esfahbod
dbb105883c
[Indic] Do Reph repositioning in final reordering like the spec says
...
This introduced a failure, which we tracked down to a test case like this:
U+092E,U+094B,U+094D,U+0930
The final character is a Ra that should be put in a syllable of its
own. And we do. But it will interact with the Halant before it. So
now we finally are convinced that we have to limit features to syllable
boundaries. That's coming after lunch!
2012-05-10 13:45:52 +02:00
Behdad Esfahbod
4705a70269
Minor
2012-05-10 13:09:08 +02:00
Behdad Esfahbod
4ac9e98d9d
[Indic] Reorder left matras to be closer to base
2012-05-10 12:53:53 +02:00
Behdad Esfahbod
1a1fa8c655
[Indic] Treat the standalone cluster case reusing the consonant logic
2012-05-10 12:21:30 +02:00
Behdad Esfahbod
190eb31a16
[Indic] Minor
2012-05-10 12:21:30 +02:00
Behdad Esfahbod
c5306b6861
[Indic] Handle Vowel syllables
...
Reusing the consonant logic!
2012-05-10 12:21:30 +02:00
Behdad Esfahbod
6d8e0cb74c
[Indic] Simplify Reph logic
2012-05-10 11:41:51 +02:00
Behdad Esfahbod
3d25079f8d
[Indic] Don't form Reph if Ra is the only consonant in the syllable
2012-05-10 11:37:42 +02:00
Behdad Esfahbod
b99d63ae11
[Indic] Increase max syllable length
...
20 was way too low, one could hit a syllable with 7ish consonants with it.
2012-05-10 11:32:52 +02:00
Behdad Esfahbod
a391ff50b9
[Indic] Adjust base after sorting
2012-05-10 11:31:20 +02:00
Behdad Esfahbod
d3637edb24
[Indic] Don't return for long syllables. Just not sort.
2012-05-10 10:51:38 +02:00
Behdad Esfahbod
dfa0cade7f
Fix Uniscribe clusters with multiple items
2012-05-09 19:10:07 +02:00
Behdad Esfahbod
86e5dd386a
[Indic] Don't give up syllable parsing upon junk
2012-05-09 18:57:37 +02:00
Behdad Esfahbod
ef24cc8c8e
[Indic] Towards multi-cluster syllables and final reordering
2012-05-09 18:10:20 +02:00
Behdad Esfahbod
a9844d41c6
Combine lig_id and lig_comp into one byte, to free up one for Indic
2012-05-09 17:53:13 +02:00
Behdad Esfahbod
92332e5116
Minor
2012-05-09 17:40:00 +02:00
Behdad Esfahbod
dbccf87eef
[Indic] Make room for more reordering positions
2012-05-09 17:24:39 +02:00
Behdad Esfahbod
d4480ace7f
[Indic] Improve matra vs consonant ordering
...
Another 1.5% down.
2012-05-09 15:59:47 +02:00
Behdad Esfahbod
33c92e7695
[Indic] Categorize Anudatta
2012-05-09 15:41:51 +02:00
Behdad Esfahbod
19d984edaa
[Indic] Make sure Reph jumps over all matras to the right
...
Another 12 thousand failures gone! (78 to go)
2012-05-09 15:21:13 +02:00
Behdad Esfahbod
9034641333
[Indic] Keep Vedic signs at the right too
2012-05-09 15:04:58 +02:00
Behdad Esfahbod
d1deaa2f5b
Replace zerowidth invisible chars with a zero-advance space glyph
...
Like Uniscribe does.
2012-05-09 15:04:13 +02:00
Behdad Esfahbod
49e5da1591
[indic] Keep the syllable modifier marks to the right
...
Shaping failures on Hindi Wikipedia go down from 25% to 14%!
2012-05-09 13:23:27 +02:00
Behdad Esfahbod
5b12609093
Minor
2012-05-09 12:37:27 +02:00
Behdad Esfahbod
9ce939232b
Minor
2012-05-09 12:03:09 +02:00
Behdad Esfahbod
76b3409de6
[indic] Better Reph matching
2012-05-09 11:52:32 +02:00
Behdad Esfahbod
df6d45c693
Minor
2012-05-09 11:38:31 +02:00
Behdad Esfahbod
412b91889d
[indic] Apply Indic features in order
2012-05-09 11:07:18 +02:00
Behdad Esfahbod
1ac075b227
[indic] Apply rakaar forms
...
Fixes 10% of the failures against all of Hindi Wikipedia!
2012-05-09 11:06:47 +02:00
Behdad Esfahbod
1a2a4a0078
Fix warning and build issues
...
As reported by Jonathan Kew on the list.
2012-05-05 22:38:20 +02:00
Behdad Esfahbod
a5e39fed85
Minor
2012-04-25 00:14:46 -04:00
Behdad Esfahbod
1827dc208c
Add hb_ot_shape_glyphs_closure()
...
Experimental API for now.
2012-04-24 16:56:37 -04:00
Behdad Esfahbod
bb09f0ec10
Minor
2012-04-24 16:02:12 -04:00
Behdad Esfahbod
29a7e306e3
Minor
2012-04-24 16:01:30 -04:00
Behdad Esfahbod
6c6ccaf575
Add a few more set operations
...
TODO: Tests for hb_set_t.
2012-04-24 14:23:01 -04:00
Behdad Esfahbod
5caece67ab
Make closure() return void
2012-04-23 23:03:12 -04:00
Behdad Esfahbod
0b08adb353
Add hb_set_t
2012-04-23 22:44:59 -04:00