Behdad Esfahbod
b1914b8bd0
Fix warnings
2012-08-07 16:57:48 -04:00
Behdad Esfahbod
0f8881d6bb
More refactoring
2012-08-07 16:57:02 -04:00
Behdad Esfahbod
428dfcab66
Minor refactoring
2012-08-07 16:51:48 -04:00
Behdad Esfahbod
61f41849af
Add Hebrew presentation forms shaping
...
Lifted from https://bugzilla.mozilla.org/show_bug.cgi?id=728866
2012-08-07 16:45:27 -04:00
Behdad Esfahbod
32d71dc133
[Graphite] Minor
2012-08-07 14:21:12 -04:00
Behdad Esfahbod
030ac5022e
Remove enum trailing comma
...
...again.
2012-08-07 13:01:12 -04:00
Behdad Esfahbod
368b4e7649
Minor
2012-08-06 23:06:04 -04:00
Behdad Esfahbod
ade7459ea7
[util] Fix leaks
2012-08-06 19:49:42 -07:00
Behdad Esfahbod
2fef993460
[Graphite] Fix graphite2 backend with RTL text
...
Patch from Martin Hosken.
2012-08-06 19:35:04 -07:00
Behdad Esfahbod
e4992e13e1
[Graphite] Port graphite2 backend to new shaper infrastructure
2012-08-06 19:29:53 -07:00
Behdad Esfahbod
66591ececf
Remove unnecessary lifecycle bits
...
We already set refcount to INVALID when destroying.
This block was not necessary.
2012-08-06 17:07:19 -07:00
Behdad Esfahbod
167b625d98
[Indic] Minor, move 'blwf' after 'half'
...
We don't apply them together anyway. Should not make any difference
right now.
2012-08-05 21:16:26 -07:00
Behdad Esfahbod
048e3b596f
Speed up hb_set_digest_lowest_bits_t calcs
2012-08-04 20:46:45 -07:00
Behdad Esfahbod
3d1b66a35e
Speed up hb_set_digest_common_bits_t calcs
2012-08-04 17:42:28 -07:00
Behdad Esfahbod
25326c2359
Rewrite ARRAY_LENGTH as a template function
...
So that it wouldn't apply to pointers accidentally.
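A minimal sketch of the idea, not the actual HarfBuzz definition: the names array_length, ARRAY_LENGTH_MACRO and sample_table are made up for illustration, and constexpr assumes a C++11 compiler. Taking the argument as a reference to an array lets the compiler deduce the element count, and a pointer argument simply fails to match.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an ARRAY_LENGTH helper (illustrative name only).
// The parameter is a reference to an array of N elements, so N is deduced
// from the array type. A plain pointer does not match this signature and is
// rejected at compile time.
template <typename T, std::size_t N>
constexpr std::size_t array_length (const T (&)[N])
{
  return N;
}

// The classic macro form, for comparison. It also compiles for pointers and
// then silently yields sizeof (pointer) / sizeof (element).
#define ARRAY_LENGTH_MACRO(a) (sizeof (a) / sizeof ((a)[0]))

// Illustrative data only.
static const int sample_table[] = {1, 2, 3, 4, 5};
static_assert (array_length (sample_table) == 5,
               "length is deduced from the array type");
```

With this form, passing a `const int *` to array_length is a compile error, whereas the macro would accept it and return a meaningless value.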
2012-08-04 16:43:18 -07:00
Behdad Esfahbod
8ba8042821
[Indic] Fix consonant position font lookup logic
...
Oops. I broke this badly and the test suite did not notice. That
worries me. Have to investigate.
2012-08-03 18:54:54 -07:00
Behdad Esfahbod
abd0c05f1f
Minor
2012-08-03 18:45:05 -07:00
Behdad Esfahbod
46ee108ef8
Fix leak
2012-08-03 18:21:13 -07:00
Behdad Esfahbod
71baea0062
[OT] Use general-category, not GDEF class, to decide to zero mark advances
...
At this point, the GDEF glyph synthesis looks pointless. Not that I
have many fonts without GDEF lying around.
As for mark advance zeroing when GPOS not available, that also is being
replaced by proper fallback mark positioning soon.
2012-08-03 17:40:07 -07:00
Behdad Esfahbod
3a7e137a68
Don't use gint
2012-08-03 17:23:40 -07:00
Behdad Esfahbod
11b0e20ba4
[Indic] Add per-script configuration tables
...
This concludes the Indic shape_plan work. May do for Arabic also...
2012-08-02 14:21:40 -04:00
Behdad Esfahbod
85fc6c483f
[Indic] Move more stuff to the shape_plan
...
Almost done. Need to add per-script static tables.
2012-08-02 12:21:44 -04:00
Behdad Esfahbod
914ffaa40f
[Indic] Move more repeated work into shape_plan
2012-08-02 11:05:32 -04:00
Behdad Esfahbod
a8c6da90f4
[OT] Add per-complex-shaper shape_plan data
...
Hook up some Indic data to it. More to come.
2012-08-02 10:46:34 -04:00
Behdad Esfahbod
8bb5deba96
[OT] Pipe shape_plan down to pause_callbacks
2012-08-02 10:07:58 -04:00
Behdad Esfahbod
3e38c0f288
More massaging
2012-08-02 09:44:18 -04:00
Behdad Esfahbod
16c6a27b4b
[OT] Port complex_shaper to planner/plan
2012-08-02 09:38:28 -04:00
Behdad Esfahbod
5393e3a62b
[OT] Minor refactoring
2012-08-02 09:24:35 -04:00
Behdad Esfahbod
24eacf17c8
[Indic] Move consonant-position-setting into initial_reordering()
2012-08-02 08:42:51 -04:00
Behdad Esfahbod
afbcc24be0
[GSUB] Wire the font, not just the face, down to substitute()
...
We need the font for glyph lookup during GSUB pauses in Indic shaper.
Could perhaps be avoided, but at this point, we don't mean to support
separate substitute()/position() entry points (anymore), so there is
no point in not providing the font to GSUB.
2012-08-02 08:36:40 -04:00
Behdad Esfahbod
b0e6a26a10
[OT] Hide some API
...
It was impossible to meaningfully use them from the outside these days.
2012-08-02 08:11:14 -04:00
Behdad Esfahbod
305246744e
Minor
2012-08-02 08:08:04 -04:00
Behdad Esfahbod
8ef3d53255
[Indic] More refactoring of consonant position peeking in the font
...
To be moved to initial_reordering next...
2012-08-02 07:59:19 -04:00
Behdad Esfahbod
3eb6f81fd3
[Indic] Refactor
...
Move all the logic that needs to eventually move into the indic table
into hb-ot-shape-complex-indic-private.hh.
2012-08-02 07:38:39 -04:00
Behdad Esfahbod
3614ba242f
[Indic] Rename
2012-08-02 07:23:42 -04:00
Behdad Esfahbod
610e5e8f71
[Indic] Streamline feature would_apply()
...
Comes with some 10% speedup for Devanagari even!
2012-08-02 05:41:18 -04:00
Behdad Esfahbod
1d002048d5
[Indic] Minor
2012-08-02 05:02:53 -04:00
Behdad Esfahbod
6f76113755
[GSUB/GPOS] Check array size before accessing digests
2012-08-02 04:00:31 -04:00
Behdad Esfahbod
22148b8c4a
Use Coverage digests in would_apply
2012-08-02 03:51:51 -04:00
Behdad Esfahbod
6c459c8fef
Minor
2012-08-02 03:45:53 -04:00
Behdad Esfahbod
e2b8d75fa6
Use wider set digests on 64-bit archs
2012-08-01 22:17:48 -04:00
Behdad Esfahbod
0120ce9679
[GSUB/GPOS] Remove unused get_coverage() methods
2012-08-01 21:56:35 -04:00
Behdad Esfahbod
1336ecdf8e
[GSUB/GPOS] Use Coverage digests as gatekeeper
...
Gives me a good 10% speedup for the Devanagari test case. Less so
for less lookup-intensive tests.
For the Devanagari test case, the false positive rate of the GSUB digest
is 4%.
2012-08-01 21:46:36 -04:00
Behdad Esfahbod
a878c58a8f
[GSUB/GPOS] Add add_coverage()
2012-08-01 21:46:19 -04:00
Behdad Esfahbod
60a3035ac5
Add hb_set_digest_t
...
Implement two set digests, and one that combines the two.
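A rough sketch of what two set digests and a combiner could look like. The struct names, the 32-bit width, and the choice of the low 5 bits are assumptions made for illustration and are not taken from the HarfBuzz source. A digest may report false positives but never false negatives, which is what makes it usable as a cheap pre-check before a real Coverage lookup.

```cpp
#include <cassert>
#include <cstdint>

// Sketch only: names and layout are assumptions, not HarfBuzz's definitions.

// Digest A: one bit for each possible value of the low 5 bits of a glyph id.
struct lowest_bits_digest_t
{
  uint32_t mask = 0;
  void add (uint32_t g) { mask |= uint32_t (1) << (g & 31); }
  bool may_have (uint32_t g) const { return (mask >> (g & 31)) & 1; }
};

// Digest B: remembers the bit positions on which all added glyph ids agree.
struct common_bits_digest_t
{
  uint32_t care = ~uint32_t (0); // positions where every member agrees so far
  uint32_t value = 0;            // the agreed-upon bits at those positions
  bool empty = true;

  void add (uint32_t g)
  {
    if (empty) { value = g; empty = false; return; }
    care &= ~(g ^ value); // drop positions where g disagrees
    value &= care;
  }
  bool may_have (uint32_t g) const
  { return !empty && (g & care) == value; }
};

// Combiner: a glyph passes only if both digests say "maybe".
struct combined_digest_t
{
  lowest_bits_digest_t a;
  common_bits_digest_t b;
  void add (uint32_t g) { a.add (g); b.add (g); }
  bool may_have (uint32_t g) const { return a.may_have (g) && b.may_have (g); }
};
```

A caller would test may_have() first and only fall through to the full lookup when it returns true. Combining the two digests lowers the false positive rate because a glyph has to fool both.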
2012-08-01 21:46:19 -04:00
Behdad Esfahbod
c8accf1dd2
[OT] Templatize Coverage::add_coverage()
2012-08-01 21:05:57 -04:00
Behdad Esfahbod
8fbfda920e
Inline font getters
2012-08-01 19:03:46 -04:00
Behdad Esfahbod
6adf417bc1
Use a lookup table for modified_combining_class
2012-08-01 18:07:42 -04:00
Behdad Esfahbod
208f70f055
Inline Unicode callbacks internally
2012-08-01 17:13:10 -04:00
Behdad Esfahbod
7470315a3e
Move unicode accessors around
2012-08-01 17:01:59 -04:00
Behdad Esfahbod
21fdcee001
Add hb_unicode_combining_class_t
2012-08-01 16:28:50 -04:00
Behdad Esfahbod
84186a6400
Add commentary on the compatibility decomposition in the normalizer
2012-08-01 13:32:39 -04:00
Behdad Esfahbod
0834d95201
[hb-old] Adjust mark positioning parameters
...
Fallback mark positioning works now... With hb-ft and hb-view /
hb-shape at least.
2012-08-01 00:21:09 -04:00
Behdad Esfahbod
4ca743dfb8
[old] Implement fontMetrics
2012-08-01 00:03:41 -04:00
Behdad Esfahbod
1e7d860613
[GPOS] Adjust mark advance-width zeroing logic
...
If there is no GPOS, zero mark advances.
If there *is* GPOS and the shaper requests so, zero mark advances for
attached marks.
Fixes regression with Tibetan, where the font has GPOS, and marks a
glyph as mark where it shouldn't get zero advance.
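The two rules above can be written down as a small decision function. This is a sketch under assumptions: glyph_t, its fields, and both function names are hypothetical and do not mirror the real buffer or GPOS code.

```cpp
#include <cassert>

// Hypothetical glyph record; field names are illustrative only.
struct glyph_t
{
  bool is_mark;     // glyph is classified as a mark
  bool is_attached; // GPOS attached this mark to a base or another mark
  int  x_advance;
};

// Returns true if the advance should be zeroed, following the two rules
// in the commit message.
inline bool
should_zero_advance (const glyph_t &g, bool font_has_gpos, bool shaper_wants_zeroing)
{
  if (!g.is_mark) return false;
  if (!font_has_gpos) return true;              // rule 1: no GPOS at all
  return shaper_wants_zeroing && g.is_attached; // rule 2: attached marks only
}

inline void
zero_mark_advances (glyph_t *glyphs, unsigned count,
                    bool font_has_gpos, bool shaper_wants_zeroing)
{
  for (unsigned i = 0; i < count; i++)
    if (should_zero_advance (glyphs[i], font_has_gpos, shaper_wants_zeroing))
      glyphs[i].x_advance = 0;
}
```

Under this sketch, a glyph that the font classifies as a mark but never attaches keeps its advance when GPOS is present, which is the Tibetan case described above.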
2012-07-31 23:41:06 -04:00
Behdad Esfahbod
a8842e4a44
Remove some TODO items
2012-07-31 23:17:23 -04:00
Behdad Esfahbod
2bc3b9a616
[OT] Zero mark advances if the shaper desires so
...
Enabled for all shapers except for Indic.
2012-07-31 23:17:22 -04:00
Behdad Esfahbod
5fecd8b035
[OT] Synthesize glyph classes
2012-07-31 23:17:22 -04:00
Behdad Esfahbod
03b09214c0
[GSUB] Minor
2012-07-31 22:43:58 -04:00
Behdad Esfahbod
f0fc1df8fc
[hb-old] Implement getGlyphMetrics()
...
Still working on it.
2012-07-31 22:43:32 -04:00
Behdad Esfahbod
378d279bbf
Implement Unicode compatibility decompositions
...
Based on patch from Philip Withnall.
https://bugs.freedesktop.org/show_bug.cgi?id=41095
2012-07-31 21:36:16 -04:00
Behdad Esfahbod
321ec29cc2
Remove unused function
2012-07-31 21:10:16 -04:00
Behdad Esfahbod
69cc492dc1
[buffer] Minor
2012-07-31 14:51:36 -04:00
Behdad Esfahbod
693918ef85
[OT] Streamline complex shaper enumeration
...
Add a shaper class struct.
2012-07-30 21:08:51 -04:00
Behdad Esfahbod
c2e42c3db6
Minor
2012-07-30 19:54:50 -04:00
Behdad Esfahbod
03f67bc012
More refactoring glyph class access
2012-07-30 19:47:53 -04:00
Behdad Esfahbod
300c7307eb
[OT] Don't crash if no GDEF available
2012-07-30 19:37:44 -04:00
Behdad Esfahbod
3dcbdc2125
Minor
2012-07-30 19:32:42 -04:00
Behdad Esfahbod
05bd1b6342
[GSUB/GPOS] Move glyph props matching around
2012-07-30 19:30:01 -04:00
Behdad Esfahbod
2fca1426ca
[GSUB] Don't erase glyph classes if GDEF does not have glyph classes
2012-07-30 18:46:41 -04:00
Behdad Esfahbod
fd42257f8c
Minor
2012-07-30 18:44:10 -04:00
Behdad Esfahbod
7fbbf86efe
[GSUB] Minor
2012-07-30 18:36:42 -04:00
Behdad Esfahbod
713914d320
[Uniscribe] Clean up a bit
2012-07-30 17:54:38 -04:00
Behdad Esfahbod
301168dae7
[CoreText] Port to shape_plan infrastructure
2012-07-30 17:48:04 -04:00
Behdad Esfahbod
6cdfd14bb1
Fix build on Mac
2012-07-30 17:22:17 -04:00
Behdad Esfahbod
7e34601ded
Unbreak Hangul jamo composition
...
When we removed the separate Hangul shaper, the specific normalization
preference of Hangul was lost. Fix that. Also, the Thai shaper was
copied from Hangul, so had the fully-composed normalization behavior,
which was unnecessary. So, fix that too.
2012-07-30 14:53:41 -04:00
Behdad Esfahbod
7afb14407e
[Indic] Recategorize Telugu length marks
...
Fixes 8 more Telugu tests. Failures at 15 (0.00154548%).
2012-07-30 13:54:46 -04:00
Behdad Esfahbod
f2377155e3
[hb-old] Fix misc leaks
...
Backport (forward-port?!) from upstream:
commit 3ab7b37bdebf0f8773493a1fee910b151c4de30f
Author: Behdad Esfahbod <behdad@behdad.org>
Date: Mon Jul 30 10:50:22 2012 -0400
Fix misc leaks
https://bugs.freedesktop.org/show_bug.cgi?id=31992
https://bugs.freedesktop.org/show_bug.cgi?id=31993
https://bugs.freedesktop.org/show_bug.cgi?id=31994
https://bugs.freedesktop.org/show_bug.cgi?id=31995
2012-07-30 10:50:57 -04:00
Behdad Esfahbod
3f4764bb56
Don't lock user_data set during destruction if empty
2012-07-30 10:06:42 -04:00
Behdad Esfahbod
4ba647eecf
Fix leak
2012-07-30 09:53:06 -04:00
Behdad Esfahbod
f860366456
[OT] Gain back some lost speed
2012-07-30 03:16:38 -04:00
Behdad Esfahbod
11f4c87d01
[OT] Remove hb_ot_layout_ensure()
...
I didn't like it from the beginning.
2012-07-30 02:36:46 -04:00
Behdad Esfahbod
578e42182b
Minor
2012-07-30 02:35:07 -04:00
Behdad Esfahbod
a973b5ce86
[GSUB] Further adjustments to mark-attachment vs ligation interaction
...
The d1d69ec52e change broke Kannada badly, since it was ligating
consonants, pushing matra out, and then ligating with the matra.
Adjust for that. See comments.
2012-07-30 01:47:46 -04:00
Behdad Esfahbod
0aef425e25
[GSUB] Minor
2012-07-30 00:55:15 -04:00
Behdad Esfahbod
d1d69ec52e
[GSUB] Don't ligate glyphs attached to different components of ligatures
...
This concludes the mark-attachment vs ligating interaction fixes (for now).
2012-07-30 00:51:47 -04:00
Behdad Esfahbod
4751dec8be
Minor
2012-07-30 00:42:07 -04:00
Behdad Esfahbod
f24bcfbed1
Minor
2012-07-30 00:39:00 -04:00
Behdad Esfahbod
fe20c0f84f
[GSUB] Fix mark component stuff when ligatures form ligatures!
...
See comments.
Fixes https://bugzilla.gnome.org/show_bug.cgi?id=437633
2012-07-30 00:00:59 -04:00
Behdad Esfahbod
2ec3ba46a3
[GSUB/GPOS] Minor
...
Start squeezing more out of lig_id/lig_comp.
2012-07-29 22:16:15 -04:00
Behdad Esfahbod
ef6e9cec33
Fixup bb0e4ba3e9
2012-07-29 21:35:22 -04:00
Behdad Esfahbod
cb3d340631
[GSUB] Don't set new lig_id on mark ligatures
...
If two marks form a ligature, retain their previous lig_id, such that
the mark ligature can attach to ligature components...
Fixes https://bugzilla.gnome.org/show_bug.cgi?id=676343
In fact, I noticed that we should not let ligatures form between glyphs
coming from different components of a previous ligature. For example,
if the sequence is: LAM,SHADDA,LAM,FATHA,HEH, the LAM,LAM,HEH form a
ligature, putting SHADDA and FATHA next to each other. However, it would
be wrong to ligate them. Uniscribe has this bug also.
2012-07-29 20:37:38 -04:00
Behdad Esfahbod
a15b70a81a
[hb-old] Fix cluster formation in RTL
...
Unlike Uniscribe, hb-old returns glyphs in logical order, so the logic
does not need to be duplicated for RTL.
2012-07-29 20:09:22 -04:00
Behdad Esfahbod
8a7e70ef65
[Minor]
2012-07-29 19:56:54 -04:00
Behdad Esfahbod
bb0e4ba3e9
Minor
2012-07-29 17:34:14 -04:00
Behdad Esfahbod
a00ad60bc0
[Uniscribe] Remove hb_uniscribe_font_ensure()
...
Wasn't a huge fan of putting the burden on the user. Just remove it and
do what we've got to do transparently.
2012-07-28 21:16:08 -04:00
Behdad Esfahbod
5d874d566f
[GPOS] Fix mark-to-mark positioning when one of the marks is a ligature
...
This commit: a3313e5400 broke MarkMarkPos when one of the marks itself
is a ligature. That regressed 26 Tibetan tests (up from zero!). Fix
that. Tibetan back to zero.
2012-07-28 21:05:25 -04:00
Behdad Esfahbod
338fe662b5
[GSUB] Minor
2012-07-28 18:53:01 -04:00
Behdad Esfahbod
e6f7479fe3
[GSUB] Simplify would-apply
2012-07-28 18:34:58 -04:00
Behdad Esfahbod
dadede012e
Minor
2012-07-28 18:13:09 -04:00
Behdad Esfahbod
0b99429ead
[GSUB/GPOS] Add get_coverage() and use it to speed up main loop
...
And use it to speed up the hotspot by checking coverage directly in
the main loop, not 10 functions deep in.
Gives me a solid 20% boost with Indic test suite. Less so for less
lookup-intensive scenarios.
Remove the "fast_path" hack from before.
2012-07-28 17:46:35 -04:00
Behdad Esfahbod
30ec9002d8
Reject lookups with no subTable
2012-07-28 17:25:20 -04:00
Behdad Esfahbod
0981068b75
[GSUB/GPOS] Reject Context/ChainContext lookups with zero input
2012-07-28 17:01:59 -04:00
Behdad Esfahbod
2f87cebe10
Implement shape_plan caching
...
Should give us some performance boost.
2012-07-27 04:20:39 -04:00
Behdad Esfahbod
e9eb9503e9
Add default_shaper_list to shape_plan
2012-07-27 03:16:22 -04:00
Behdad Esfahbod
3b7c4e2706
Don't fail choosing shaper on planning failure
...
Shapers have a chance to reject a font in face shaper_data creation.
No need to allow failing during planning.
2012-07-27 03:12:23 -04:00
Behdad Esfahbod
cfe9882610
Add hb_ot_layout_ensure() and hb_uniscribe_font_ensure()
2012-07-27 03:06:30 -04:00
Behdad Esfahbod
c5b668fb92
Choose one shaper per plan
2012-07-27 02:49:39 -04:00
Behdad Esfahbod
e82061e8db
Move ot shaper completely to shape_plan
2012-07-27 02:29:32 -04:00
Behdad Esfahbod
ea278d3895
Partially switch ot shaper to shape_plan
2012-07-27 02:12:28 -04:00
Behdad Esfahbod
b6b7ba1313
Switch old and uniscribe backends to shape_plan
2012-07-27 01:37:18 -04:00
Behdad Esfahbod
c32c096a42
Switch to shape_plan
...
Not optimized yet. Eats babies. And no shaper uses the shape_plan.
2012-07-27 01:13:53 -04:00
Behdad Esfahbod
5b95c148cc
Start implementing shape_plan
2012-07-27 01:02:24 -04:00
Behdad Esfahbod
bd26b4d21f
Minor
2012-07-26 22:18:24 -04:00
Behdad Esfahbod
027857d041
Start adding a unified shaper access infrastructure
...
Add global shape_plan. Unused so far.
2012-07-26 21:14:02 -04:00
Behdad Esfahbod
fa2dfcd560
Fix visibility warnings with MinGW32
2012-07-26 16:06:16 -04:00
Jonathan Kew
ac2085d4b3
[CoreText] Ensure cluster indices in output buffer are non-decreasing.
...
Does not provide Uniscribe-compatible results, but should at least avoid
breaking hb-view due to out-of-order cluster values.
For RTL runs, ensure cluster values are non-increasing (instead of
non-decreasing).
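One possible way to enforce the monotonicity described above, shown as a sketch only. The commit does not say how the backend does it, and make_clusters_monotonic is a made-up helper. The idea is to pull each out-of-order cluster value down to the smaller neighbouring value, which merges the affected glyphs into one cluster.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: clamp cluster values so that they are non-decreasing
// for LTR runs and non-increasing for RTL runs. Values are only ever lowered,
// so out-of-order glyphs end up sharing the smaller cluster value.
inline void
make_clusters_monotonic (std::vector<unsigned> &clusters, bool rtl)
{
  if (clusters.empty ()) return;
  if (rtl)
  {
    // Forward sweep with a running minimum gives a non-increasing sequence.
    for (std::size_t i = 1; i < clusters.size (); i++)
      clusters[i] = std::min (clusters[i], clusters[i - 1]);
  }
  else
  {
    // Backward sweep with a running minimum gives a non-decreasing sequence.
    for (std::size_t i = clusters.size () - 1; i > 0; i--)
      clusters[i - 1] = std::min (clusters[i - 1], clusters[i]);
  }
}
```

For example, LTR cluster values 0,2,1,3 become 0,1,1,3, and RTL values 3,1,2,0 become 3,1,1,0.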
2012-07-26 15:58:45 -04:00
Behdad Esfahbod
441d3bb7de
Minor
2012-07-26 12:01:12 -04:00
Behdad Esfahbod
2e7f223054
[hb-old] Fix Arabic cursive positioning
...
Backporting from upstream:
commit b847f24ce855d24f6822bcd9c0006905e81b94d8
Author: Behdad Esfahbod <behdad@behdad.org>
Date: Wed Jul 25 19:29:16 2012 -0400
[arabic] Fix Arabic cursive positioning
This was clearly broken in testing. Who knows... Fixes for me.
Test with a Nastaleeq font, or with Arabic Typesetting.
Backporting from Chromium.
2012-07-25 19:30:15 -04:00
Behdad Esfahbod
9550a8c4e8
[hb-old] Fixup not-enough-space handling
2012-07-25 19:22:57 -04:00
Behdad Esfahbod
91e721ea86
[hb-old] Fix clusters
...
Unlike its "documentation", hb-old's log_clusters are, well, indeed
logical, not visual. Fixup. Adapted / copied from hb-uniscribe.
2012-07-25 19:20:34 -04:00
Behdad Esfahbod
a3313e5400
[GPOS] Fix MarkMarkPos applied to results of MultipleSubst
...
This was broken as a result of 7b84c536c1.
As Khaled reported, MarkMark positioning was broken with glyphs
resulting from a MultipleSubst. Fixed. Test with the ALLAH character
in Amiri.
2012-07-25 18:37:51 -04:00
Behdad Esfahbod
35bdab3cf1
Minor
2012-07-25 11:59:52 -04:00
Behdad Esfahbod
8fe4c7405b
[hb-old] Add HarfBuzz.old shaper
...
Choose using shaper name "old".
2012-07-25 11:11:22 -04:00
Behdad Esfahbod
5e1987005e
[hb-old] Define Unicode funcs in terms of new HarfBuzz
2012-07-25 11:11:22 -04:00
Behdad Esfahbod
4a31166b28
[hb-old] Shovel out the line-breaking / word-segmentation stuff
2012-07-25 11:11:22 -04:00
Behdad Esfahbod
0bcbe88cf3
[hb-old] Add visibility attributes
2012-07-25 11:11:22 -04:00
Behdad Esfahbod
6a9d43c317
[hb-old] Remove unused header file
2012-07-25 11:11:22 -04:00
Behdad Esfahbod
fb47209c5b
[hb-old] Rename hb_buffer_* to HB_Buffer_*
2012-07-25 11:11:22 -04:00
Behdad Esfahbod
1512a73575
[hb-old] Start adding HarfBuzz-old as a new backend
2012-07-25 11:11:16 -04:00
Behdad Esfahbod
478fd0529b
Minor
2012-07-24 17:09:01 -04:00
Behdad Esfahbod
8979a7f6f2
[Mongolian] Remove Mongolian Vowel Separator at the end of shaping
...
Results match Uniscribe now.
2012-07-24 17:03:55 -04:00
Jonathan Kew
aa6d849838
[CoreText] Add basic Core Text backend for comparison with our native shaping
...
Does not attempt to handle clusters in a Uniscribe- or HarfBuzz-compatible way;
just returns the original string indexes that CT maintains. These may even be
out-of-order in the case of reordrant glyphs.
2012-07-24 15:52:32 -04:00
Behdad Esfahbod
ec8d249469
Make data members of various OpenType structs protected instead of private
...
Should fix warnings generated when building with -Wunused-private-field.
Based on patch from Jonathan Kew.
2012-07-24 15:40:37 -04:00
Behdad Esfahbod
97aa0b738a
Minor const correctness shuffling
2012-07-24 15:02:34 -04:00
Behdad Esfahbod
6411e74caf
[Indic] Reposition Gurmukhi top matras to after post
...
The font is forming a post-base consonant in some samples, and Uniscribe
positions top matra on the post-base. Do the same.
Gurmukhi failures down from 59 to 41 (0.0674242%).
2012-07-24 13:48:49 -04:00
Behdad Esfahbod
65c43accdc
[Indic] Better position left-matra in Malayalam
...
Just put it before base, which is what's expected.
Malayalam failures down from 1559 to 1197 (0.114172%).
BENGALI: 353988 out of 354285 tests passed. 297 failed (0.0838308%)
DEVANAGARI: 693571 out of 693628 tests passed. 57 failed (0.00821766%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
KANNADA: 950956 out of 951913 tests passed. 957 failed (0.100534%)
KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%)
MALAYALAM: 1047219 out of 1048416 tests passed. 1197 failed (0.114172%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271699 out of 271847 tests passed. 148 failed (0.0544424%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970524 out of 970573 tests passed. 49 failed (0.00504856%)
2012-07-24 03:36:47 -04:00
Behdad Esfahbod
88f413b56f
[Indic] Implement Reph+Ya-Phalaa interaction
...
The sequence Ra,H,Ya in Bengali is ambiguous and Unicode encoded that to
get Ya-Phalaa, one would place ZWJ before Halant. I.e. a ZWJ,H sequence
requests subjoining, while a H,ZWJ requests Half form. Implement that.
Bengali failures go down from 377 to 297 (0.0838308%).
Gujarati is down by 4 to 17 (0.0046384%).
Kannada is down by 226 to 957 (0.100534%).
Current status:
BENGALI: 353988 out of 354285 tests passed. 297 failed (0.0838308%)
DEVANAGARI: 693571 out of 693628 tests passed. 57 failed (0.00821766%)
GUJARATI: 366489 out of 366506 tests passed. 17 failed (0.0046384%)
GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
KANNADA: 950956 out of 951913 tests passed. 957 failed (0.100534%)
KHMER: 299094 out of 299124 tests passed. 30 failed (0.0100293%)
MALAYALAM: 1046857 out of 1048416 tests passed. 1559 failed (0.148701%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271699 out of 271847 tests passed. 148 failed (0.0544424%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970524 out of 970573 tests passed. 49 failed (0.00504856%)
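The ZWJ/Halant ordering rule from this message can be sketched as a tiny classifier. This is illustration only: halant_request_t and classify_halant are hypothetical names, and the real shaper works on categorized glyph infos rather than raw code points. U+09CD is the Bengali virama (Halant) and U+200D is ZWJ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type for what a Halant at a given position requests.
enum class halant_request_t { none, subjoined, half_form };

const uint32_t ZWJ = 0x200D;            // ZERO WIDTH JOINER
const uint32_t BENGALI_VIRAMA = 0x09CD; // Halant ("H" in the message)

// Inspect the Halant at index i and its immediate neighbours:
//   ZWJ,H  requests the subjoined form (e.g. Ya-Phalaa),
//   H,ZWJ  requests the half form.
inline halant_request_t
classify_halant (const std::vector<uint32_t> &text, std::size_t i)
{
  if (i >= text.size () || text[i] != BENGALI_VIRAMA)
    return halant_request_t::none;
  if (i > 0 && text[i - 1] == ZWJ)
    return halant_request_t::subjoined;
  if (i + 1 < text.size () && text[i + 1] == ZWJ)
    return halant_request_t::half_form;
  return halant_request_t::none;
}
```

For Ra (U+09B0) and Ya (U+09AF), the sequence Ra,ZWJ,H,Ya classifies as subjoined and Ra,H,ZWJ,Ya as half form. Plain Ra,H,Ya returns none and is left to the default handling.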
2012-07-24 03:04:36 -04:00
Behdad Esfahbod
dff0ece11d
[Indic] Limit matras to 4 per syllable
...
Also limit joiners.
This limits our syllable length to a constant, and is
closer to what Uniscribe does anyway.
Two Devanagari tests regressed, but who cares about tests with 20
joiners in a row?! Devanagari at 57 (0.00821766%) now.
2012-07-24 02:37:42 -04:00
Behdad Esfahbod
330b329c89
[Indic] Unmark U+17D1 KHMER SIGN VIRIAM to NOT be a Virama
...
Fixes another 1 Khmer failure. Down to 30 (0.0100293%) now.
2012-07-24 02:25:26 -04:00
Behdad Esfahbod
6824a7194e
[Indic] Recategorize Khmer various signs as top matras
...
Khmer failures down from 39 to 31 (0.0103636%).
2012-07-24 02:22:18 -04:00
Behdad Esfahbod
d90b8e841e
[Indic] Reposition Khmer prebase-reordering Ra around split matras
...
In Khmer coeng model, a V,Ra can go *after* matras. If it goes after a
split matra, it should be reordered to *before* the left part of such matra.
Khmer failures down from 136 to 39 (0.0130381%).
2012-07-24 02:11:18 -04:00
Behdad Esfahbod
0afb84c125
[Indic] Fix minor bug in pre-base Ra positioning
2012-07-24 01:44:47 -04:00
Behdad Esfahbod
7573799126
[Indic] Position Khmer U+17CE
...
Fixes another 6 Khmer failures. Now at 136 (0.0454661%).
2012-07-24 01:32:07 -04:00
Behdad Esfahbod
8d00e8d0e7
[Indic] Don't reposition Khmer Bindu
...
Khmer Bindu doesn't like to move to syllable end. Leave it where it
was.
Brings down Khmer failures from 510 to 142 (0.047572%).
2012-07-24 01:15:34 -04:00
Behdad Esfahbod
2278eefcdb
[Indic] In Sinhala, form forced Reph even if no other consonant found
...
Fixes another 10 Sinhala failures. Down to 148 (0.0544424%).
2012-07-24 00:31:10 -04:00
Behdad Esfahbod
71fd5e80ad
[Indic] Further adjust base algorithm for Sinhala
...
Apparently if there is C,V,ZWJ,C, the first C will be base, but if
it's C,ZWJ,V,C, the second one will be.
Note that Uniscribe implements this differently, by breaking syllable in
the case of C,ZWJ,V,C and putting the first consonant in one syllable
and the rest in the next syllable.
Sinhala failures down from 208 to 158 (0.0581209%). No changes to
Khmer.
2012-07-24 00:21:16 -04:00
Behdad Esfahbod
73d71cc527
[Indic] End Vowel-based syllable at ZWJ
...
One Devanagari test regressed, plus 10 Malayalam (at 1545 now).
Fixed 120 Sinhala failures. Now at 208 (0.0765136%).
2012-07-24 00:09:12 -04:00
Behdad Esfahbod
34c215036f
[Indic] Improve Sinhala base algorithm and reph positioning
...
Sinhala does not have half forms. And most (all?) consonants can be
base, except when preceded by ZWJ, which would request a subjoined form.
Hence switch the base algorithm to categorize with Khmer, start search
at start, and stop at a ZWJ.
Also, mark all pos=base consonants after base to be subjoined. Mark
base itself to have pos=base.
Finally, adjust Sinhala's reph position to after-main.
Brings down Sinhala failures from 455 to 328 (0.120656%).
2012-07-23 23:51:29 -04:00
Behdad Esfahbod
2ec934c6c2
[Indic] Change "unknown" position to end of syllable
2012-07-23 23:49:04 -04:00
Behdad Esfahbod
b70021f7c8
When removing zero-width marks, don't remove ligatures
...
If a mark ligated, it probably should NOT be removed.
2012-07-23 20:18:17 -04:00
Behdad Esfahbod
49c5ec5144
Minor refactoring
2012-07-23 20:14:13 -04:00
Behdad Esfahbod
c3e6fdc379
[Indic] Improve check on ligatures
...
Only skip actual ligatures, not marks in-between ligature components.
2012-07-23 20:11:42 -04:00
Behdad Esfahbod
771a8f5028
[Indic] exclude ligatures when matching on Indic category
...
If, say, a H,ZWJ,C ligature was formed, we don't want the code to detect
that as a Halant. So, ignore ligatures when matching category in
final_reordering.
Sinhala failures down from 514 to 455 (0.167374%).
2012-07-23 20:09:30 -04:00
Behdad Esfahbod
d1af9e82e5
[GSUB/GPOS] Const correctness
2012-07-23 19:55:35 -04:00
Behdad Esfahbod
baacd090df
[Indic] Minor refactoring
2012-07-23 19:51:48 -04:00
Behdad Esfahbod
c7c4de2fb9
[Indic] Remove syllable length check before sorting
...
We now limit syllable lengths in the machine. No need to match here.
2012-07-23 18:25:02 -04:00
Behdad Esfahbod
9fa052733e
[Indic] Limit syllables to at most five consonants
...
Seems to be about what Uniscribe does. Not exactly. But close enough.
More consonants will start a new cluster.
A few scripts went way down in failures. In particular:
- Devanagari failures went down from 490 to 56.
- Telugu went down from 113 to 49.
Other scripts went down slightly or didn't change. New numbers:
BENGALI: 353908 out of 354285 tests passed. 377 failed (0.106412%)
DEVANAGARI: 693572 out of 693628 tests passed. 56 failed (0.00807349%)
GUJARATI: 366485 out of 366506 tests passed. 21 failed (0.00572978%)
GURMUKHI: 60750 out of 60809 tests passed. 59 failed (0.0970251%)
KANNADA: 950730 out of 951913 tests passed. 1183 failed (0.124276%)
KHMER: 298613 out of 299124 tests passed. 511 failed (0.170832%)
MALAYALAM: 1046881 out of 1048416 tests passed. 1535 failed (0.146411%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271333 out of 271847 tests passed. 514 failed (0.189077%)
TAMIL: 1091837 out of 1091837 tests passed. 0 failed (0%)
TELUGU: 970524 out of 970573 tests passed. 49 failed (0.00504856%)
Some of the remaining Telugu and Devanagari issues seem to be Uniscribe
eating Anusvara when placed before a non-joiner. Ouch!
2012-07-23 18:19:17 -04:00
Behdad Esfahbod
093cd58326
[Thai] Fix SARA AM handling
...
Oops, thinko.
2012-07-23 14:04:42 -04:00
Behdad Esfahbod
42848453bf
[Thai] Reorder U+0E3A THAI VOWEL SIGN PHINTHU
...
Uniscribe reorders U+0E3A to be after U+0E38 and U+0E39. We do that by
modifying the ccc for U+0E3A.
Fixes the two remaining Thai failures (see previous commit).
2012-07-23 13:52:07 -04:00
Behdad Esfahbod
4a7f4f3e56
[Thai] Adjust SARA AM reordering to match Uniscribe
...
Adjust the list of marks before SARA AM that get the reordering
treatment. Also adjust cluster formation to match Uniscribe.
With Wikipedia test data, now I see:
- For Thai, with the Angsana New font from Win7, I see 54 failures out
of over 4M tests (0.00129107%). Of the 54, two are legitimate
reordering issues (fix coming soon), and the other 52 are simply
Uniscribe using a zero-width space char instead of an unknown
character for missing glyphs. No idea why. The missing-glyph
sequences include one that is a Thai character followed by an Arabic
Sokun. Someone confused it with Nikhahit I assume!
- For Lao, with the Dokchampa font from Win7, 33 tests fail out of
54k (0.0615167%). All seem to be insignificant mark positioning
with two marks on a base. Have to investigate.
2012-07-23 13:15:33 -04:00
Behdad Esfahbod
2cc933aff9
[Indic] Fix cluster formation with left-matras and conjunct forms
...
Test case was: <U+0D15,U+0D4D,U+0D15,U+0D4A>.
2012-07-23 08:23:44 -04:00
Behdad Esfahbod
e6b01a878c
[Indic] Further streamline cluster formation
...
This should address all possible cluster misformations that I had in
mind.
2012-07-23 00:11:26 -04:00
Behdad Esfahbod
7b2a7dadd6
[Indic] Merge clusters before sorting
...
This should fix any instabilities in cluster formation that we were
speculating may happen with surrounding syllables. Or most of it
perhaps.
2012-07-22 23:58:55 -04:00
Behdad Esfahbod
abb3239ef9
[Indic] Update clusters for left-matra even if matra didn't move
...
Fixes crashes reported with left matra under
non-uniscribe-bug-compatibility mode.
2012-07-22 23:55:19 -04:00
Behdad Esfahbod
92a1ad7bef
[Indic] Stop searching for base if a post form is found before below form
...
Improves Bengali and Gurmukhi. Malayalam regressed a bit. We will deal
with that later.
2012-07-20 18:55:15 -04:00
Behdad Esfahbod
4c450c703f
[Indic] Recompose Bengali Ya,Nukta
...
This is a bunch of hacks for now.
Improves Bengali a bit.
2012-07-20 18:13:04 -04:00
Behdad Esfahbod
e9c0f152a3
[Uniscribe] Fix script fallback
...
Gurmukhi failures halved now. Others changed slightly.
2012-07-20 17:37:48 -04:00
Behdad Esfahbod
5791f32915
[Indic] Allow a ZWNJ after SM's
...
Malayalam failures go way down. Other scripts benefitted slightly too.
Sinhala had one or two test regressions, but...
2012-07-20 16:26:55 -04:00
Behdad Esfahbod
34ae336f3f
[Indic] Improve Reph AfterMain positioning
...
Fixes 20 out of 48 failing Oriya tests. Failure rate down to 0.066% now.
2012-07-20 16:17:28 -04:00
Behdad Esfahbod
bdd080431a
[Indic] Reposition Oriya Candrabindu
...
Oriya failures down from 0.65% to 0.20%.
2012-07-20 16:03:09 -04:00
Behdad Esfahbod
5f0eaaad12
[Indic] Fix base search in final_reordering
...
Fixes most Malayalam failures. Down from 1.6% to 0.38% now. Fixes a
few more in other scripts too.
2012-07-20 15:47:24 -04:00
Behdad Esfahbod
81202bd860
[Indic] Don't attach SM/VD to other characters
2012-07-20 15:14:51 -04:00
Behdad Esfahbod
efb4ad7356
Fix compiler warnings
...
If x is not constant, we cannot ASSERT_STATIC on it.
2012-07-20 14:27:38 -04:00
Behdad Esfahbod
f31d97e44e
[Indic] Form Telugu Reph out of Ra,Virama,ZWJ
...
Apparently this was approved in Feb 2012. No font yet.
2012-07-20 14:13:35 -04:00
Behdad Esfahbod
2e193b240e
[Indic] Don't split U+0AC9
...
Although IndicMatraCategory.txt classifies it as Top_And_Right matra,
it does not have Unicode decomposition, and Uniscribe does not do
anything special about it either.
Gujarati failures down from 0.672% to 0.0130966%.
2012-07-20 14:02:35 -04:00
Behdad Esfahbod
30c3d5e9fc
[Indic] Simplify Uniscribe cluster emulation
...
Now that we break syllables on Halant,ZWNJ, this code can be simplified.
2012-07-20 13:56:32 -04:00
Behdad Esfahbod
decf6ffca4
[Indic] Minor!
2012-07-20 13:51:31 -04:00
Behdad Esfahbod
9e4f94a72c
[Indic] Break syllables at Halant,ZWNJ
...
That's really what Uniscribe does, and explains a lot of peculiarities of
Halant,ZWNJ before the base.
Sent Telugu from 1% failures to 0.03%. Improved Kannada and Malayalam
slightly. Fixed half of Bengali, and did NOT break anything!
2012-07-20 13:48:03 -04:00
Behdad Esfahbod
2c372b80f6
[Indic] Better check for applying 'init'
...
Specifically, don't apply 'init' if previous char is a joiner.
Fixes some more of Bengali.
2012-07-20 13:37:48 -04:00
Behdad Esfahbod
34a7440b7c
[GPOS] Don't zero mark advances
...
Fixes more of Telugu, Kannada, and Oriya.
May break things (outside Indic...), but we cannot think of any font relying
on this immediately.
2012-07-20 12:40:39 -04:00
Behdad Esfahbod
8ed248de77
[Indic] Minor
2012-07-20 11:42:24 -04:00
Behdad Esfahbod
d0e68dbd0b
[Indic] Implement reph positioning step 5
...
Not tuned, just copied from step 2. Fixes another 0.5% of Kannada
failures. 1% to go.
2012-07-20 11:25:41 -04:00
Behdad Esfahbod
a9e45c32e4
[Indic] Don't let ZWNJ at the end of syllable affect base search
...
Fixes a few Devanagari, half of remaining Kannada failures, quarter for
Telugu, and others slightly improved or unchanged.
2012-07-20 11:04:15 -04:00
Behdad Esfahbod
20b68e699f
[Indic] Apply 'cjct' globally
...
Fixes 5 Devanagari failures, and no regressions.
2012-07-20 10:47:46 -04:00
Behdad Esfahbod
51e764de44
[Indic] Unbreak old scriptures
...
Brings down failures with Lohit-Telugu from 57% to 1.40%.
2012-07-20 10:30:24 -04:00
Behdad Esfahbod
900cf3d449
Minor
2012-07-20 10:18:23 -04:00
Behdad Esfahbod
87cd63266e
[Indic] Recategorize some Kannada right matras
...
Kannada failures down from 3.5% to 2.93%.
2012-07-19 21:25:46 -04:00
Behdad Esfahbod
3604d64ced
[Indic] Recategorize GURMUKHI ADDAK
...
It's not in IndicSyllabicCategory.txt. Fixes most of Gurmukhi failures.
Failures down from 7.7% to 0.222%!
2012-07-19 21:13:04 -04:00
Behdad Esfahbod
8932858123
Minor
2012-07-19 21:02:38 -04:00
Behdad Esfahbod
47ef931f13
[buffer] Make sure out_info = info during GPOS
2012-07-19 20:52:44 -04:00
Behdad Esfahbod
ae63cf2062
Print line number during return when tracing
2012-07-19 20:45:41 -04:00
Behdad Esfahbod
5249f3aee1
[Indic] Unbreak Khmer
...
For Khmer, all consonants are subjoining. No need to look in the font.
We were looking in the wrong order anyway.
2012-07-19 20:30:22 -04:00
Behdad Esfahbod
e0475345d5
[Indic] Apply 'akhn' globally
...
Fixes 1.5% more failures for Telugu, 2% for Kannada.
Breaks one test in Devanagari.
2012-07-19 20:24:14 -04:00
Behdad Esfahbod
fa247ebe52
[Indic] Better position U+0CD5
...
Fixes another 5% of Kannada failures.
2012-07-19 19:52:19 -04:00
Behdad Esfahbod
f055442716
[Indic] Lookup consonant position in the font
...
Fixes most failures of Oriya, and improves others a bit.
2012-07-19 16:20:21 -04:00
Behdad Esfahbod
74d1d88781
[GSUB] Fix would_apply() for LigatureSubst
2012-07-19 16:14:23 -04:00
Behdad Esfahbod
be73a5f936
Add src/test-would-substitute tool
2012-07-19 15:12:18 -04:00
Behdad Esfahbod
e72b360ac6
Refactor / finish would_apply() operation
...
Untested.
2012-07-19 14:44:46 -04:00
Behdad Esfahbod
8c973ebf0f
[Indic] Implement per-script matra positioning
...
Following what the spec says.
Brings down Telugu failures from 40% to 3.75%, and Kannada failures from
44% to 10%. Does NOT affect other scripts' test results.
2012-07-19 13:25:08 -04:00
Behdad Esfahbod
8bb32458f9
[Indic] More refactoring
2012-07-19 13:04:44 -04:00
Behdad Esfahbod
9ccc6382ba
[Indic] Minor refactoring
2012-07-19 12:45:31 -04:00
Behdad Esfahbod
f83aaa3133
[Indic] Minor
2012-07-19 12:23:23 -04:00
Behdad Esfahbod
be8b9f5f71
[Indic] Start refactoring different matra positions per script
2012-07-19 12:11:12 -04:00
Behdad Esfahbod
b01d9b3d90
[Indic] Disallow decomposition of a couple characters
...
This is a hack for now. Will be fixed when we do complex-shaper-driven
normalization properly.
The results with or without decomposition are the same, but Uniscribe
does not normalize, so this matches better.
2012-07-19 11:25:49 -04:00
Behdad Esfahbod
422ecd2d3c
[Indic] Accept a forced Rakar sequence at the end of syllable
...
In Sinhala, Rakar is formed by Al-Lakuna,ZWJ,Ra. If you put that at the
end of a Consonant,Matra syllable, you get a dotted-circle from
Uniscribe. Apparently adding a ZWJ before the Al-Lakuna "fixes" that.
And people have been encoding that sequence... So, allow a forced
"ZWJ,Virama,ZWJ,Ra" sequence at the end of syllables.
Fixes some 100 or more of Sinhala failures. Now at 622 only (0.23%).
2012-07-18 23:25:58 -04:00
Behdad Esfahbod
6fc1732003
[Indic] Allow joiners on both sides of Halant at the same time
...
The sequence <ZWJ,Al-Lakuna,ZWJ> is used in Sinhala to explicitly ask
for Rakar. Fixes two-thousand Sinhala tests. Not many left.
2012-07-18 17:49:19 -04:00
Behdad Esfahbod
10cdc94eee
[Indic] In final reordering, find base, even if it disappeared
...
POS_BASE can disappear if base ligated backward. Define base as last
with position not after base.
Fixes a few hundred of Sinhala failures with Iskoola Pota.
2012-07-18 17:43:23 -04:00
Behdad Esfahbod
9c4d24a3a6
[Indic] Minor
2012-07-18 17:29:10 -04:00
Behdad Esfahbod
3285e107c9
[Indic] Implement Sinhala "Al Lakuna" Reph behavior
...
In Sinhala, Reph is formed only explicitly, by the presence of a ZWJ.
2012-07-18 17:22:14 -04:00
Behdad Esfahbod
91cade7555
[Indic/Unicode] Decompose Sinhala split matras the way Uniscribe likes
...
Makes no visual difference.
Fixes most of the failures. Down from 15% to 1.3%!
2012-07-18 16:50:41 -04:00
Behdad Esfahbod
d8942dcbb4
Apply Tibetan (global) features.
...
Fixes all Tibetan failures. All 180k of them!
Merges back Hangul into the default shaper.
2012-07-18 16:34:10 -04:00
Behdad Esfahbod
552d19b7a1
[Indic] Treat Register Shifters like Nukta
...
Really this time.
Fixes another 18 Khmer tests.
2012-07-18 16:02:33 -04:00
Behdad Esfahbod
e8cd81f76d
[Indic] Minor
2012-07-18 16:00:20 -04:00
Behdad Esfahbod
69f26bf39c
[Indic] Fix Matra reordering when base is at end of syllable
...
For example: U+915,U+200c,U+93f
Fixes last Tamil failure!
2012-07-18 15:47:51 -04:00
Behdad Esfahbod
d16ccc4ae7
Leave one extra item at the end of buffer allocation
...
Just in case, for the times we do out-of-bounds access.
jk
2012-07-18 15:43:55 -04:00
Behdad Esfahbod
075d671f10
[Indic] Fix out-of-bounds array access
2012-07-18 15:41:53 -04:00
Behdad Esfahbod
dcb527242b
[Indic] Allow joiners before matras
...
Fixes 1 more Devanagari test!
2012-07-18 15:32:26 -04:00
Behdad Esfahbod
391cc03317
[Indic] Allow halant group in Vowel and placeholder syllables
...
Fixes 2 out of 560 Devanagari failures. AND:
Fixes 1 out of 2 Tamil failures.
2012-07-18 15:12:49 -04:00
Behdad Esfahbod
ca4e3d3eab
[Indic] Streamline halant/joiner in grammar
2012-07-18 15:05:40 -04:00
Behdad Esfahbod
418d00dffd
[Indic] Minor
2012-07-18 14:57:28 -04:00
Behdad Esfahbod
4c3691d2a3
[Indic] Hopefully minor!
...
Refactoring Indic machine. No semantic change.
2012-07-18 14:23:55 -04:00
Behdad Esfahbod
e092c556fb
[Indic] Minor
2012-07-18 14:09:25 -04:00
Behdad Esfahbod
14dbdd9e39
[Indic] Unbreak Tamil
...
Tamil has only about 150 failures now!
2012-07-18 13:13:03 -04:00
Behdad Esfahbod
db8981f1e0
[Indic] Position Khmer Robat
...
It's a visual Repha.
Still not positioning logical Repha as occurs in Malayalam.
Another 200 Khmer failures fixed. 547 to go. That's better than
Devanagari!
2012-07-17 23:42:04 -04:00
Behdad Esfahbod
25bc489498
[Indic] Better categorize Register Shifters and Khmer Various signs
...
Down another 500 or so Khmer failures!
2012-07-17 17:53:03 -04:00
Behdad Esfahbod
39b17837b4
Add hb_buffer_normalize_glyphs() and hb-shape --normalize-glyphs
...
This reorders glyphs within the cluster to a nominal order. This should
have no visible effect on the output, but helps with testing, for
getting the same hb-shape output for visually-equal glyphs for each
cluster.
2012-07-17 17:09:29 -04:00
Behdad Esfahbod
25e302da9a
[Indic] Minor
2012-07-17 14:25:14 -04:00
Behdad Esfahbod
5d32690a34
[Indic] For scripts without Half forms, always choose first consonant as base
...
In such scripts (ie. Khmer), a ZWJ/ZWNJ shouldn't stop the search for
base. So, instead just choose the first consonant as base directly.
Test sequence:
U+1798,200c,U+17C9,U+17D2,U+179B,U+17C1,U+17C7
2012-07-17 14:23:28 -04:00
Behdad Esfahbod
34b5714906
[Indic] Treat Khmer Register Shifters more like Nuktas
...
Except that there may be a ZWNJ before a Register Shifter.
2012-07-17 14:09:32 -04:00
Behdad Esfahbod
11e2a601b1
[Indic] Minor
2012-07-17 14:02:28 -04:00
Behdad Esfahbod
0201e0a464
[Indic] Apply 'cfar' for Khmer
...
Mark stuff after a pre-base reordering Ro 'cfar'. Used in Khmer.
This allows distinguishing the following cases with MS Khmer fonts:
U+1784,U+17D2,U+179A,U+17D2,U+1782
U+1784,U+17D2,U+1782,U+17D2,U+179A
2012-07-17 13:56:24 -04:00
Behdad Esfahbod
55f70ebfb9
[Indic] Position final subjoined consonants (and vowels) after matras
...
In Khmer, a final subjoined consonant or independent vowel can occur
after matras. This final subjoined thing should NOT be reordered to
before the matra even though it's subjoined.
Fixes another 1k of the Khmer failures. Not much left really.
2012-07-17 12:50:13 -04:00
Behdad Esfahbod
c50ed71e9a
[Indic] Recategorize Khmer coeng sign as a separate category OT_Coeng
...
Amend the syllable structure to allow a final subscripted consonant
(Coeng+C) and a final subscripted independent vowel (Coeng+V).
Fixes another 2k of Khmer failures.
2012-07-17 11:54:28 -04:00
Behdad Esfahbod
deb521dee4
[Indic] Add a separate Coeng class
...
No characters recategorized yet. No semantic change.
2012-07-17 11:37:32 -04:00
Behdad Esfahbod
74ccc6a132
[Indic] Move Halant with after-base consonants
...
Normally, we attach the Halant to the previous character and move it
with it. For after-base consonants however, the Halant "belongs" to the
consonant after, so attach it so.
This fixes Bengali sequences involving post-base consonant Ya, which
should ligate with the Halant to form Ya Phala, but previously a
reordered matra was blocking the ligation.
2012-07-17 11:16:19 -04:00
Behdad Esfahbod
d5c4edcdd6
[Indic] Apply presentation-forms features all at once
...
Seems like this is what Uniscribe is doing, and does not break any fonts
we tested (with Devanagari, Malayalam, Khmer, and Bengali), while fixing
some Ra Phala sequences for Bengali with Vrinda. Fixes another 2% of
Bengali failures (a couple more to go).
2012-07-17 10:40:59 -04:00
Behdad Esfahbod
559f706678
Fix MarkAttachmentType matching
...
Fixes issue reported by Khaled Hosny with his Hussaini Nastaleeq font
and sequences like those added in the previous commit.
2012-07-16 22:46:52 -04:00
Behdad Esfahbod
ad4494759f
Minor
2012-07-16 22:40:21 -04:00
Behdad Esfahbod
af92b4cc90
[Indic] Disable 'kern' in Uniscribe bug compatibility mode
...
Uniscribe does not apply 'kern' in the Indic module. Some of the Khmer
fonts they ship have small adjustments in the 'kern' table. Disable
'kern' in the Indic module under Uniscribe bug compatibility mode.
Fixes some 10% of the Khmer failures. Remains under 3% (excluding
dotted-circle ones).
2012-07-16 20:31:24 -04:00
Behdad Esfahbod
d96838ef95
Allow complex shapers overriding common features
...
In a new callback... Currently unused by all complex shapers.
2012-07-16 20:26:57 -04:00
Behdad Esfahbod
df50b84740
[Indic] Categorize other Khmer marks
...
Mark them the same as the Register Shifters for now. Need to rename
that category to something more sensible after all is settled.
Fixes another percent of Khmer failures. Down to under 3%!
2012-07-16 20:14:50 -04:00
Behdad Esfahbod
8e7b5882fb
[Indic] Recognize pre-base reordering Ra anywhere in the syllable
...
We were doing that only immediately after base.
Fixes another percent in the Khmer failures. About three more to go...
2012-07-16 17:04:46 -04:00
Behdad Esfahbod
7d09c98a1f
[Indic] Recognize Register Shifter marks
...
Fixes another 6% of the Khmer failures.
2012-07-16 16:45:22 -04:00
Behdad Esfahbod
60da763dfa
[GSUB/GDEF] Guess glyph classes after substitution only if no GDEF
...
Brings down Khmer failures with Daun Penh font from 36% to 20%.
2012-07-16 16:14:40 -04:00
Behdad Esfahbod
fcdc5f1c88
[Indic] Categorize Khmer Ro
...
Khmer failures down from 58% to 36%.
2012-07-16 15:52:54 -04:00
Behdad Esfahbod
78818124b1
[Indic] Reorder pre-base reordering Ra
...
Brings down Malayalam failures from 14% down to 3%.
2012-07-16 15:49:08 -04:00
Behdad Esfahbod
1a1dbe9a27
[Indic] Rename
2012-07-16 15:41:33 -04:00
Behdad Esfahbod
46e645ec4b
[Indic] Start implementing pre-base reordering
2012-07-16 15:30:05 -04:00
Behdad Esfahbod
921ce5b17d
[Indic] Rename
...
No semantic change.
2012-07-16 15:26:56 -04:00
Behdad Esfahbod
b504e060f0
[Indic] Implement After-Main Reph positioning
...
Almost...
2012-07-16 15:21:12 -04:00
Behdad Esfahbod
17d7de91d7
[Indic] Apply 'pref' to pre-base reordering Ra
...
No reordering yet.
2012-07-16 15:20:15 -04:00
Behdad Esfahbod
362d3db8d3
[Indic] Minor
...
Should not be any semantic change. In preparation for implementing
pre-base reordering Ra.
2012-07-16 15:15:28 -04:00
Behdad Esfahbod
70fe77bb9a
Minor
2012-07-16 14:52:18 -04:00
Behdad Esfahbod
2f903215c5
Minor
2012-07-16 13:54:43 -04:00
Behdad Esfahbod
a3e04bee2c
[Indic] Reorder virama only for old Indic spec
2012-07-16 13:47:19 -04:00
Behdad Esfahbod
0de771b72d
[Indic] Categorize Khmer consonants
2012-07-16 13:39:36 -04:00
Behdad Esfahbod
d487fff266
Split matras without a Unicode decomposition
...
This is a hack for now, to get us going with Khmer. This will be
refactored properly later to move the complex logic into complex
shapers.
2012-07-16 13:25:57 -04:00
Behdad Esfahbod
8aa801a6fd
[Indic] Adjust position for split matras
...
We are going to split matras without a Unicode decomposition in a way
that the second half takes the codepoint of the whole matra. So,
position them where the second half is supposed to end up.
2012-07-16 13:24:26 -04:00
Behdad Esfahbod
1feb8345a5
[GSUB] Allow 1-to-1 ligature substitutions!
...
Apparently Uniscribe allows these, and they are used in some Khmer fonts
shipped with Windows, namely, Daun Penh.
2012-07-16 13:23:40 -04:00
Behdad Esfahbod
29f106d7fb
[Indic] Apply Above Forms
2012-07-16 12:05:35 -04:00
Behdad Esfahbod
fa2bd9fb63
Further simplify atomic ops on Visual Studio
2012-07-14 12:15:54 -04:00
Behdad Esfahbod
0a49235701
Minor
2012-07-13 13:20:49 -04:00
Behdad Esfahbod
11c4ad439e
Add -Wcast-align
2012-07-13 11:29:31 -04:00
Behdad Esfahbod
a98d0ab186
Make sure HB_BEGIN_DECLS / HB_END_DECLS is only used in public headers
...
So we can use them to switch default visibility to internal if desired,
and use these to make only declared symbols public.
2012-07-13 10:19:10 -04:00
Behdad Esfahbod
5c5bc96216
Allow overriding HB_BEGIN_DECLS / HB_END_DECLS
2012-07-13 10:15:37 -04:00
Behdad Esfahbod
50a4e78b53
Check for exported weak symbols
...
Ouch, all our C++ inline functions are being exported (weakly) already.
Fix coming.
2012-07-13 09:48:39 -04:00
Behdad Esfahbod
b5aeb95afe
Make hb_in_range() static
2012-07-13 09:45:54 -04:00
Behdad Esfahbod
271c8f8907
Minor
2012-07-13 09:32:30 -04:00
Behdad Esfahbod
391f1ff5d8
Fix _InterlockedCompareExchangePointer on x86
2012-07-13 09:04:07 -04:00
Behdad Esfahbod
2023e2b54d
[ft] Disable ppem setting
...
The calculations were wrong.
FreeType makes it really hard to set size and ppem independently.
For now, disable it. Need to come up with a fix later.
2012-07-11 19:01:26 -04:00
Behdad Esfahbod
cdf7444505
[ft] Use unfitted kerning if x_ppem is zero
2012-07-11 18:52:39 -04:00
Behdad Esfahbod
6d08c7f1b3
Revert "Towards templatizing common Lookup types"
...
This reverts commit 727135f3a9.
This is work-in-progress. Didn't mean to push it out just yet.
2012-07-11 18:01:27 -04:00
Behdad Esfahbod
552bf3a9f9
Bump WINNT version requested from 500 to 600
...
Since we use the OpenType versions of Uniscribe functions, we are
relying on that version of the WINNT API. Otherwise, usp10.h will hide
those symbols.
2012-07-11 18:00:28 -04:00
Behdad Esfahbod
9a5b421a64
Fix build with no Unicode funcs implementations provided
2012-07-11 18:00:28 -04:00
Behdad Esfahbod
727135f3a9
Towards templatizing common Lookup types
2012-07-11 18:00:28 -04:00
Behdad Esfahbod
12f5c0a222
Fix check for Intel atomic ops
2012-06-26 11:16:13 -04:00
Behdad Esfahbod
6932a41fb6
Use octal-escaped UTF-8 characters instead of plain text
...
https://bugs.freedesktop.org/show_bug.cgi?id=50970
2012-06-26 10:46:31 -04:00
Behdad Esfahbod
8c0ea7bcb4
Disable introspection again
...
Until I figure out the build issues. Sigh...
2012-06-24 13:20:56 -04:00
Behdad Esfahbod
49f8e0cd9a
GStaticMutex is deprecated
2012-06-16 15:40:03 -04:00
Behdad Esfahbod
1bc1cb3603
Make source more digestible for gobject-introspection
2012-06-16 15:21:55 -04:00
Behdad Esfahbod
84d781e54c
Flesh out gobject-introspection stuff a bit
2012-06-16 15:21:41 -04:00
Behdad Esfahbod
2cf301968c
Add hb_object_lock/unlock()
2012-06-09 14:58:01 -04:00
Behdad Esfahbod
f211d5c291
More Oops! Fix fast-path with sub-type==0
2012-06-09 03:11:22 -04:00
Behdad Esfahbod
b1de6aa1f3
Oops!
2012-06-09 03:07:59 -04:00
Behdad Esfahbod
b12e2549cb
Minor
2012-06-09 03:05:20 -04:00
Behdad Esfahbod
faf0f20253
Add sanitize() logic for fast-paths
2012-06-09 03:02:36 -04:00
Behdad Esfahbod
4e766ff28d
Add fast-path for GPOS too
...
Shaves another 3% for DejaVu Sans long Latin strings.
2012-06-09 02:53:57 -04:00
Behdad Esfahbod
993c51915f
Add fast-path to GSUB to check coverage
...
Shaves a good 10% off DejaVu Sans with simple Latin text for me.
Now, DejaVu is very ChainContext-intensive, but it's also a very
popular font!
2012-06-09 02:48:16 -04:00
Behdad Esfahbod
f19e0b0099
Match input before backtrack
...
Makes more sense, optimization-wise.
2012-06-09 02:26:57 -04:00
Behdad Esfahbod
67bb9e8cea
Add set add_coverage() to Coverage()
2012-06-09 02:02:46 -04:00
Behdad Esfahbod
4952f0aa5b
Minor
2012-06-09 01:43:20 -04:00
Behdad Esfahbod
ad6a6f2240
Minor
2012-06-09 01:43:20 -04:00
Behdad Esfahbod
46617a4213
Fix cache implementation
2012-06-09 01:43:20 -04:00
Behdad Esfahbod
ce47613889
Micro-optimize
...
I know...
2012-06-09 01:43:15 -04:00
Behdad Esfahbod
70416de298
Minor
2012-06-09 00:56:41 -04:00
Behdad Esfahbod
99159e52a3
Use linear search for small counts
...
I see about 8% speedup with long strings with DejaVu Sans.
2012-06-09 00:50:40 -04:00
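The idea behind "Use linear search for small counts" can be sketched as follows. This is only an illustration, not the code from the commit: the function name `find_sorted` and the threshold `kLinearSearchLimit` are hypothetical, and the commit message does not say which cutoff it uses.

```cpp
#include <cstddef>

// Hypothetical cutoff; the commit message does not state the real value.
constexpr std::size_t kLinearSearchLimit = 8;

// Returns the index of `key` in the sorted array `items`, or -1 if absent.
inline int find_sorted(const unsigned *items, std::size_t count, unsigned key)
{
  if (count < kLinearSearchLimit) {
    // Short arrays: a plain scan, stopping early once we pass the key.
    for (std::size_t i = 0; i < count; i++) {
      if (items[i] == key) return static_cast<int>(i);
      if (items[i] > key) break;  // sorted, so the key cannot appear later
    }
    return -1;
  }
  // Longer arrays: binary search over the half-open range [lo, hi).
  std::size_t lo = 0, hi = count;
  while (lo < hi) {
    std::size_t mid = lo + (hi - lo) / 2;
    if (items[mid] < key) lo = mid + 1;
    else if (items[mid] > key) hi = mid;
    else return static_cast<int>(mid);
  }
  return -1;
}
```

Both paths return the same answer for the same input; only the access pattern differs, which is where any speedup on short tables would come from.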
Behdad Esfahbod
caf0412690
Minor
2012-06-09 00:26:32 -04:00
Behdad Esfahbod
0f8fea71a6
Minor. Hide _hb_ot_layout_get_glyph_property()
2012-06-09 00:24:38 -04:00
Behdad Esfahbod
44b8ee0c90
Minor
2012-06-09 00:23:24 -04:00
Behdad Esfahbod
7b84c536c1
In MarkBase attachment, only attach to first of a MultipleSubst sequence
...
This is apparently what Uniscribe does. Test case is:
SEEN FATHA TEH ALEF
with Arabic Typesetting. Originally reported by Khaled Hosny.
2012-06-08 22:04:23 -04:00
Behdad Esfahbod
ec57e0c565
Set lig_comp for MultipleSubst components
...
To be used for correct mark attachment to first component of a
MultipleSubst output. That's what Uniscribe does.
2012-06-08 21:47:23 -04:00
Behdad Esfahbod
e085fcf7ca
Remove unused buffer->replace_glyphs_be16
2012-06-08 21:45:00 -04:00
Behdad Esfahbod
3ec77d6ae0
Don't use replace_glyphs_be for MultipleSubst
2012-06-08 21:44:06 -04:00
Behdad Esfahbod
4b7192125f
Minor
2012-06-08 21:41:46 -04:00
Behdad Esfahbod
4508789f4b
Add test for static initializers and other C++ stuff
2012-06-08 21:32:43 -04:00
Behdad Esfahbod
56bd259b9a
Minor
2012-06-08 21:29:18 -04:00
Behdad Esfahbod
bc8357ea7b
Merge clusters during normalization
2012-06-08 21:01:20 -04:00
Behdad Esfahbod
fe3dabc08d
Minor
2012-06-08 20:56:05 -04:00
Behdad Esfahbod
e88e14421a
Use merge_clusters instead of open-coding
2012-06-08 20:55:21 -04:00
Behdad Esfahbod
330a2af3ff
Use merge_clusters when forming Unicode clusters
2012-06-08 20:40:02 -04:00
Behdad Esfahbod
bd300df9ad
Minor
2012-06-08 20:36:37 -04:00
Behdad Esfahbod
e51d2b6ed1
Extend into main buffer if extension hit end of out-buffer merging clusters
2012-06-08 20:36:33 -04:00
Behdad Esfahbod
5ced012d9f
Extend end when merging clusters in out-buffer
2012-06-08 20:31:32 -04:00
Behdad Esfahbod
72c0a18783
Extend clusters backward in out-buffer
2012-06-08 20:30:03 -04:00
Behdad Esfahbod
cd5891493d
Extend clusters backwards, into the out-buffer too
2012-06-08 20:28:59 -04:00
Behdad Esfahbod
77471e0371
Clear output buffer before calling GSUB pause functions
2012-06-08 20:21:02 -04:00
Behdad Esfahbod
cafa6f3727
When merging clusters, extend the end
2012-06-08 20:17:10 -04:00
Behdad Esfahbod
28ce5fa454
Merge clusters when ligating
2012-06-08 20:17:06 -04:00
Behdad Esfahbod
2bb1761ccb
Minor, use next_glyph()
2012-06-08 19:29:44 -04:00
Behdad Esfahbod
5f68f8675e
Minor
2012-06-08 19:23:43 -04:00
Behdad Esfahbod
8729691267
Increase Uniscribe MAX_ITEMS
2012-06-08 14:39:31 -04:00
Behdad Esfahbod
dbffa4c83d
Fix Uniscribe charset matching
...
Previously was failing to match fonts that didn't support CHARSET_ANSI.
There still remains a problem with the Uniscribe backend, in that if a
font with the same family name is installed, and is newer, the native
one is preferred over the font we provide. Fixing it requires rewriting
the name table with a unique family name...
2012-06-08 14:39:31 -04:00
Behdad Esfahbod
82e8bd8628
Remove unused code
2012-06-08 14:39:31 -04:00
Behdad Esfahbod
6da9dbff21
Remove zero-width chars in the fallback shaper too
2012-06-08 10:53:35 -04:00
Behdad Esfahbod
68b76121f8
Fix regressions introduced by sed. Ouch!
...
Introduced in 99c2695759.
Broken mark-mark and mark-ligature stuff.
2012-06-08 10:47:00 -04:00
Behdad Esfahbod
0dd86f9f68
Whitespace
2012-06-08 10:23:03 -04:00
Behdad Esfahbod
8e7beba7c3
Fix Uniscribe clusters with direction-overridden Arabic
2012-06-08 10:22:06 -04:00
Behdad Esfahbod
b069c3c31b
Really fix override-direction in Uniscribe
2012-06-08 10:10:29 -04:00
Behdad Esfahbod
fcd6f53261
Unbreak Uniscribe
...
Oops. hb_tag_t and OPENTYPE_TAG have different endianness. Perhaps
something to add API for in hb-uniscribe.h
2012-06-08 09:59:43 -04:00
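The endianness remark in "Unbreak Uniscribe" can be illustrated with a small sketch. It assumes one tag type packs the first character into the highest byte and the other into the lowest byte, which is what the commit message suggests; the helper names `make_tag_be` and `swap32` are hypothetical and not part of either API.

```cpp
#include <cstdint>

// Packs four characters with the first one in the most significant byte.
constexpr std::uint32_t make_tag_be(char a, char b, char c, char d)
{
  return (static_cast<std::uint32_t>(static_cast<std::uint8_t>(a)) << 24) |
         (static_cast<std::uint32_t>(static_cast<std::uint8_t>(b)) << 16) |
         (static_cast<std::uint32_t>(static_cast<std::uint8_t>(c)) << 8) |
          static_cast<std::uint32_t>(static_cast<std::uint8_t>(d));
}

// Reverses the byte order of a 32-bit value, converting between the two
// packings assumed above.
constexpr std::uint32_t swap32(std::uint32_t v)
{
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}
```

Under that assumption, passing a tag across the boundary without `swap32` would make every script and feature lookup miss, which matches the kind of breakage the commit describes.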
Behdad Esfahbod
29eac8f591
Override direction in Uniscribe backend
...
Matches OT backend now.
2012-06-08 09:26:17 -04:00
Behdad Esfahbod
1c1233e576
Make Uniscribe backend respect selected script
2012-06-08 09:20:53 -04:00
Behdad Esfahbod
0bb0f5d419
Add note re _NullPool
2012-06-07 17:42:48 -04:00
Behdad Esfahbod
2a3d911fe0
Fix alignment-requirement mismatch
...
Detected by clang and lots of cmdline options.
2012-06-07 17:31:46 -04:00
Behdad Esfahbod
6095de1635
Fix clang warning with NO_MT path
2012-06-07 15:48:18 -04:00
Behdad Esfahbod
a18280a8ce
Fix warnings produced by clang analyzer
2012-06-07 15:44:12 -04:00
Behdad Esfahbod
73cb02de2d
Minor
2012-06-06 11:29:25 -04:00
Behdad Esfahbod
79e2b4791f
Fix ASSERT_POD on clang
...
As reported by bashi. Not tested.
2012-06-06 11:27:17 -04:00
Behdad Esfahbod
6220e5fc0d
Add ASSERT_POD for most objects
2012-06-06 03:30:09 -04:00
Behdad Esfahbod
a00a63b5ef
Add macros to check that types are POD
2012-06-06 03:07:01 -04:00
Behdad Esfahbod
61eb60c129
Don't link to libstdc++
...
New try.
2012-06-05 21:22:36 -04:00
Behdad Esfahbod
81a4b9fd4e
Remove unused hb_static_mutex_t
2012-06-05 20:53:00 -04:00
Behdad Esfahbod
4a3a9897b3
Disable Intel atomic ops on mingw32
...
Apparently the configure test is not enough...
2012-06-05 20:39:07 -04:00
Behdad Esfahbod
0594a24484
Cleanup TRUE/FALSE vs true/false
2012-06-05 20:35:40 -04:00
Behdad Esfahbod
e1ac38f8dd
Fix inert buffer set_length() with zero
...
Oops!
2012-06-05 20:31:49 -04:00
Behdad Esfahbod
04bc1eebe7
Add configure tests for Intel atomic intrinsics
2012-06-05 20:16:56 -04:00
Behdad Esfahbod
f64b2ebf82
Remove last static initializer
...
We're free! Lazy or immediate...
2012-06-05 20:15:27 -04:00
Behdad Esfahbod
04aed572f1
Make hb-ft static-initializer free
2012-06-05 18:45:36 -04:00
Behdad Esfahbod
be4560a3b5
Undo default unicode-funcs to avoid static initializer again
2012-06-05 18:43:57 -04:00
Behdad Esfahbod
093171ccec
Implement lock-free hb_language_t
...
Another static-initialization down. One more to go.
2012-06-05 18:00:45 -04:00
Behdad Esfahbod
6843ce01be
Add atomic-pointer functions
...
Going to use these for lock-free linked-lists, to be used for
hb_language_t among other things.
2012-06-05 17:27:20 -04:00
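A lock-free linked list built on an atomic pointer compare-and-swap, as mentioned in "Add atomic-pointer functions", might look roughly like this. The sketch uses C++11 `std::atomic` rather than the project's own atomic wrappers, and the `Node` and `LockFreeList` types are hypothetical. It only ever prepends and never removes nodes, which sidesteps the ABA problem.

```cpp
#include <atomic>

// Hypothetical list node; a real list would carry whatever payload is needed.
struct Node {
  int key;
  Node *next;
};

struct LockFreeList {
  std::atomic<Node *> head{nullptr};

  // Pushes `node` at the front; safe to call from several threads at once.
  void prepend(Node *node)
  {
    Node *old_head = head.load(std::memory_order_relaxed);
    do {
      node->next = old_head;
      // On failure, compare_exchange_weak reloads old_head and we retry.
    } while (!head.compare_exchange_weak(old_head, node,
                                         std::memory_order_release,
                                         std::memory_order_relaxed));
  }

  // Walks the list; nodes are never freed, so readers need no lock.
  const Node *find(int key) const
  {
    for (const Node *n = head.load(std::memory_order_acquire); n; n = n->next)
      if (n->key == key) return n;
    return nullptr;
  }
};
```

Because entries are only added, a reader that sees a slightly stale head simply misses the newest nodes and can fall back to inserting its own.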
Behdad Esfahbod
cdafe3a7d8
Add gcc intrinsics implementations for atomic and mutex
2012-06-05 16:40:23 -04:00
Behdad Esfahbod
d970d2899b
Add gcc implementation for atomic ops
2012-06-05 16:06:28 -04:00
Behdad Esfahbod
0e253e97af
Add a mutex to object header
...
Removes one more static-initialization. A few more to go.
2012-06-05 15:54:43 -04:00
Behdad Esfahbod
a2b471df82
Remove static initializers from indic
2012-06-05 15:17:44 -04:00
Behdad Esfahbod
f06ab8a426
Better hide nil objects and make them const
2012-06-05 14:49:14 -04:00
Behdad Esfahbod
bf93b636c4
Remove constructor from hb_prealloced_array_t
...
This was causing all object types to be non-POD and have static
initializers. We don't need that!
Now, most nil objects just moved from .bss to .data. Fixing for that
coming soon.
2012-06-05 14:17:32 -04:00
Behdad Esfahbod
f1971a2174
Fix warnings
2012-06-05 14:06:04 -04:00
Behdad Esfahbod
9fc7a11469
Remove comma at the end of enum
...
As reported by Jonathan Kew on the list.
2012-06-04 08:28:19 -04:00
Behdad Esfahbod
3b8fd9c48f
Remove const from ref_count.ref_count
...
According to Tom Hacohen this was breaking build with some compilers.
In file included from hb-buffer-private.hh:35:0,
from hb-ot-map-private.hh:32,
from hb-ot-shape-private.hh:32,
from hb-ot-shape.cc:29:
hb-object-private.hh: In constructor '_hb_object_header_t::_hb_object_header_t()':
hb-object-private.hh:97:8: error: uninitialized const member in 'struct hb_reference_count_t'
hb-object-private.hh:51:25: note: 'hb_reference_count_t::ref_count' should be initialized
In file included from hb-ot-shape.cc:33:0:
hb-set-private.hh: In constructor '_hb_set_t::_hb_set_t()':
hb-set-private.hh:37:8: note: synthesized method '_hb_object_header_t::_hb_object_header_t()' first required here
hb-ot-shape.cc: In function 'void hb_ot_shape_glyphs_closure(hb_font_t*, hb_buffer_t*, const hb_feature_t*, unsigned int, hb_set_t*)':
hb-ot-shape.cc:521:12: note: synthesized method '_hb_set_t::_hb_set_t()' first required here
2012-06-03 15:54:19 -04:00
Behdad Esfahbod
70600dbf62
Minor
2012-06-03 15:52:51 -04:00
Behdad Esfahbod
96a9ef0c9f
Remove tab character like other "zero-width" characters
...
Uniscribe does that, this makes comparing results to Uniscribe
easier.
2012-06-01 13:46:26 -04:00
Behdad Esfahbod
0558d55bac
Remove hb_atomic_int_set/get()
...
We never use them in fact...
I'm just adjusting these as I better understand the requirements of
the code and the guarantees of each operation.
2012-05-28 10:46:47 -04:00
Behdad Esfahbod
bce095524b
Add hb_font_get_glyph_name() and hb_font_get_glyph_from_name()
2012-05-28 10:45:50 -04:00
Behdad Esfahbod
bc145658bd
Warn if no Unicode functions implementation is found
2012-05-28 10:45:50 -04:00
Behdad Esfahbod
a3547330fa
Cleanup atomic ops on OS X
2012-05-27 10:20:47 -04:00
Behdad Esfahbod
e4b6d503c5
Don't use atomic ops in hb_cache_t
...
We don't care about linearizability, so unprotected int read/write
are enough, no need for expensive memory barriers. It's a cache,
that's all.
2012-05-27 10:11:13 -04:00
Behdad Esfahbod
819faa0530
Minor
2012-05-27 10:09:18 -04:00
Behdad Esfahbod
303d5850ec
Fix Windows atomic get/set
...
According to:
http://msdn.microsoft.com/en-us/library/65tt87y8.aspx
MemoryBarrier() is the right macro to protect these, not _ReadBarrier()
and/or _WriteBarrier().
2012-05-27 10:01:13 -04:00
Behdad Esfahbod
29ce446d31
Add set iterator
2012-05-25 14:17:54 -04:00
Behdad Esfahbod
62c3e111fc
Add set symmetric difference
2012-05-25 13:48:00 -04:00
Behdad Esfahbod
27aba594c9
Minor
2012-05-24 15:00:01 -04:00
Behdad Esfahbod
cde1c0114b
Fix hb_atomic_int_set() implementation for HB_NO_MT
...
As pointed out by Jonathan Kew.
2012-05-24 10:46:39 -04:00
Behdad Esfahbod
ed2f1363a3
Fix substitution glyph class propagation
...
The old code was doing nothing.
Still got to find an example font+string that makes this matter, but
need this for fixing synthetic GDEF anyway.
2012-05-22 22:12:22 -04:00
Behdad Esfahbod
20fdb0f41d
Add a lock-free cache type for int->int functions
...
To be used for cmap and advance caching if desired.
2012-05-17 22:04:45 -04:00
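One way to build a lock-free int->int cache of the kind described above is to pack the key tag and the value into a single word, so a reader can never observe a key from one entry paired with a value from another. The sketch below is an assumption about the general technique, not the committed code: the `IntCache` name, the bit widths, and the use of relaxed `std::atomic` operations are all illustrative.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical direct-mapped cache; bit widths are illustrative only.
template <unsigned KeyBits = 21, unsigned ValueBits = 16, unsigned SlotBits = 8>
struct IntCache {
  static_assert(KeyBits - SlotBits + ValueBits < 32,
                "entry must leave room for the empty marker");
  static constexpr std::uint32_t kEmpty = 0xFFFFFFFFu;
  static constexpr std::uint32_t kSlotMask = (1u << SlotBits) - 1;
  static constexpr std::uint32_t kValueMask = (1u << ValueBits) - 1;

  std::atomic<std::uint32_t> slots[1u << SlotBits];

  IntCache() { clear(); }

  void clear()
  {
    for (auto &s : slots) s.store(kEmpty, std::memory_order_relaxed);
  }

  // Returns true and fills *value if `key` is cached.
  bool get(std::uint32_t key, std::uint32_t *value) const
  {
    std::uint32_t entry = slots[key & kSlotMask].load(std::memory_order_relaxed);
    if (entry == kEmpty || (entry >> ValueBits) != (key >> SlotBits))
      return false;
    *value = entry & kValueMask;
    return true;
  }

  // Stores the pair, overwriting whatever shared the slot; returns false
  // if the key or value does not fit in the configured bit widths.
  bool set(std::uint32_t key, std::uint32_t value)
  {
    if ((key >> KeyBits) || (value >> ValueBits)) return false;
    slots[key & kSlotMask].store(((key >> SlotBits) << ValueBits) | value,
                                 std::memory_order_relaxed);
    return true;
  }
};
```

A lost or overwritten entry only costs a recomputation, so no ordering stronger than relaxed is needed for correctness of the cached values themselves.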
Behdad Esfahbod
bd908b4f10
Implement hb_atomic_int_set() for OS X
2012-05-17 22:02:08 -04:00
Behdad Esfahbod
022a05ae90
Minor
2012-05-17 21:53:24 -04:00
Behdad Esfahbod
22afd66a30
Add hb_atomic_int_set() again
2012-05-17 21:23:49 -04:00
Behdad Esfahbod
4aa7258cb1
Fix type conflicts on Windows without glib
2012-05-17 21:01:04 -04:00
Behdad Esfahbod
f039e79d54
Don't use min/max as function names
...
They can be macros on some systems. Eg. mingw32.
2012-05-17 20:55:12 -04:00
Behdad Esfahbod
34961e3198
Prefer native atomic/mutex ops to glib's
2012-05-17 20:50:38 -04:00
Behdad Esfahbod
ec3ba4b96f
Move atomic ops into their own header
2012-05-17 20:30:46 -04:00
Behdad Esfahbod
1d6846db9e
[Indic] Apply vatu feature after cjct
...
Testing with old Deva spec this reduces failures.
Test sequence: U+0915,U+094D,U+0930.
2012-05-13 18:09:29 +02:00
Behdad Esfahbod
617f4ac46f
Refactor
2012-05-13 16:48:03 +02:00
Behdad Esfahbod
5e4e21fce4
Revert "[Indic] Refactoring"
...
This reverts commit 0831061efb.
2012-05-13 16:46:08 +02:00
Behdad Esfahbod
3f18236a03
Fix more warnings
2012-05-13 16:20:10 +02:00
Behdad Esfahbod
9f377ed321
Fix more unused-var warnings
2012-05-13 16:13:44 +02:00
Behdad Esfahbod
d993e72331
Fix hb_face_set_index()
2012-05-13 16:04:36 +02:00
Behdad Esfahbod
93345edcbe
Fix warnings
2012-05-13 16:01:08 +02:00
Behdad Esfahbod
eace47b173
Minor
2012-05-13 15:54:43 +02:00
Behdad Esfahbod
99c2695759
Add accessors to buffer for current info, current pos, and prev info
2012-05-13 15:45:18 +02:00
Behdad Esfahbod
6736f3c5b0
Minor
2012-05-13 15:21:06 +02:00
Behdad Esfahbod
5df809b655
[GSUB/GPOS] Remove context_length
...
The spec doesn't say contextual matching should be done this way,
and AOTS doesn't do it either. It was inherited from old HarfBuzz.
Remove it.
2012-05-13 15:17:51 +02:00
Behdad Esfahbod
28b9d502bb
Minor
2012-05-13 15:04:00 +02:00
Behdad Esfahbod
737dded2e0
Fix compiler warnings
2012-05-12 15:40:11 +02:00
Behdad Esfahbod
7f852b644b
Fix compiler warnings
2012-05-11 23:10:31 +02:00
Behdad Esfahbod
f7e8dcfd4f
[Indic] Unbreak Devanagari
...
And this, concludes the HarfBuzz Massala Hackfest.
I like to specially thank Jonathan Kew for doing all the decription and
letting me get commit points.
2012-05-11 22:01:33 +02:00
Behdad Esfahbod
6a091df9b4
[Indic] Disambiguate sub vs post vs above matras
...
Bengali is at *just* above 5% now.
2012-05-11 21:42:27 +02:00
Behdad Esfahbod
9d0d319a4a
[Indic] Position Bengali Reph before matras
2012-05-11 21:36:32 +02:00
Behdad Esfahbod
f893672511
[Indic] Start categorizing Reph per script
2012-05-11 21:10:03 +02:00
Behdad Esfahbod
a913b024d8
[Indic] Apply 'init' feature for Bengali
...
Error down from 20% to 7%.
2012-05-11 20:59:26 +02:00
Behdad Esfahbod
eed903b164
[Indic] Refactor for the arrival of 'init' feature
...
Yep, on Bengali now!
2012-05-11 20:50:53 +02:00
Behdad Esfahbod
18c06e189b
[Indic] Add Uniscribe bug feature for dotted circle
...
For dotted-circle independent clusters, Uniscribe does no Reph shaping
for the exact sequence Ra+Halant+25CC. Which also is the only possible
sequence with 25CC at the end.
2012-05-11 20:02:14 +02:00
Behdad Esfahbod
0831061efb
[Indic] Refactoring
2012-05-11 19:07:58 +02:00
Behdad Esfahbod
7ea58db311
Minor
2012-05-11 18:58:57 +02:00
Behdad Esfahbod
9c09928989
[Indic] Allow multiple Consonants in Vowel/NBSP syllables
...
Uniscribe allows multiple Halant+Consonant after a Vowel.
Tests:
* U+0905,U+094D,U+092B,U+094D,930,94d,930
2012-05-11 18:46:35 +02:00
Behdad Esfahbod
8c0aa486f3
[Indic] Allow two Nuktas per consonant
...
Uniscribe allows up to two nuktas per consonant and one per matra. It does so
independent of whether the consonant already has a nukta in it. Tests:
* U+0916,U+093C,U+0941
* U+0959,U+093C,U+0941
* U+0916,U+093C,U+093C,U+0941
* U+0959,U+093C,U+093C,U+0941
* U+0916,U+093C,U+093C,U+093C,U+0941
* U+0959,U+093C,U+093C,U+093C,U+0941
* 915,93c,93c,,94d,U+0916,U+093C,U+093C,U+093e,93c,93c
2012-05-11 18:13:42 +02:00
Behdad Esfahbod
3399a06e70
[Indic] Fix U+0952 and similar classification to match Uniscribe
...
See comments.
2012-05-11 17:54:26 +02:00
Behdad Esfahbod
11aa3ef18d
[Indic] Treat U+0951..U+0954 all similar to U+0952
2012-05-11 17:30:48 +02:00
Behdad Esfahbod
5f131d3226
[GSUB/GPOS/Indic] Apply GSUB/GPOS within syllables only
...
This does not apply to the context matchings.
This regresses tests right now. And we are not sure whether this is
the right thing to do for GPOS. But we'll figure out.
2012-05-11 17:29:40 +02:00
Behdad Esfahbod
8fd83aaf6e
[GSUB/GPOS] Fix wrong buffer access in backward skippy mask matching
2012-05-11 17:18:37 +02:00
Behdad Esfahbod
ff24d1081a
[Indic] Don't use syllable serial value 0
2012-05-11 17:07:08 +02:00
Behdad Esfahbod
892eb78782
[Indic] Implement Uniscribe Reph+Matra+Halant bug feature
2012-05-11 16:54:40 +02:00
Behdad Esfahbod
67ea29af49
[Indic] Add example of different Uniscribe behavior
2012-05-11 16:51:23 +02:00
Behdad Esfahbod
ebe29733d4
[Indic] Add runtime Uniscribe bug compatibility mode!
...
Enable by setting envvar:
HB_OT_INDIC_OPTIONS=uniscribe-bug-compatible
Plus, LeftMatra+Halant "feature".
2012-05-11 16:43:12 +02:00
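Reading a runtime switch such as `HB_OT_INDIC_OPTIONS=uniscribe-bug-compatible` from the environment could be sketched as below. The helper names are hypothetical, and matching the option by substring is an assumption made for illustration; the commit message only gives the variable name and value.

```cpp
#include <cstdlib>
#include <cstring>

// Returns true if the option string is set and mentions `name`.
// Substring matching is an assumption, chosen to keep the sketch short.
inline bool option_enabled(const char *options, const char *name)
{
  return options != nullptr && std::strstr(options, name) != nullptr;
}

// Checks the environment variable named in the commit message.
inline bool uniscribe_bug_compatible()
{
  return option_enabled(std::getenv("HB_OT_INDIC_OPTIONS"),
                        "uniscribe-bug-compatible");
}
```

Keeping the string check separate from the `getenv` call makes the logic testable without touching the process environment.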
Behdad Esfahbod
616e692e29
[Indic] Add #define UNISCRIBE_BUG_COMPATIBLE 1
2012-05-11 16:25:02 +02:00
Behdad Esfahbod
6782bdae3b
[Indic] Fix Left Matra + Halant reordering
...
As can be seen in: U+092B,U+093F,U+094D
2012-05-11 16:23:43 +02:00
Behdad Esfahbod
3c2ea9481b
Minor
2012-05-11 16:23:38 +02:00
Behdad Esfahbod
203d71069c
[GSUB/GPOS] Check all glyph masks when matching input
2012-05-11 16:01:44 +02:00
Behdad Esfahbod
668c6046c1
[Indic] Apply Reph mask to all POS_REPH glyphs
...
Needed for upcoming changes to GSUB/GPOS mask matching.
2012-05-11 15:34:13 +02:00
Behdad Esfahbod
4be46bade2
[Indic] Fix state machine to backtrack
2012-05-11 14:39:01 +02:00
Behdad Esfahbod
cee7187447
[Indic] Move syllable tracking from Indic to generic layer
...
This is to incorporate it into GSUB/GPOS processing.
2012-05-11 11:41:39 +02:00
Behdad Esfahbod
3bf27a9f0e
[Indic] Disable conjuncts when a ZWJ happens
...
Not that the code makes any difference since the presence of ZWJ itself
causes the ligature to fail to match anyway.
2012-05-11 11:17:23 +02:00
Behdad Esfahbod
c6d904d67d
[Indic] Fix bitops typo!
...
Another 1000 down!
2012-05-11 11:07:40 +02:00
Behdad Esfahbod
55fe2cf79b
Make APPLY debug output print current index and codepoint
...
Yay!
2012-05-11 03:56:33 +02:00
Behdad Esfahbod
7bd2b04fea
Minor
2012-05-11 03:40:58 +02:00
Behdad Esfahbod
cf26510dbb
Some more...
...
Done. I promise.
2012-05-11 03:35:08 +02:00
Behdad Esfahbod
9659523ca3
More beauty in debug output!
2012-05-11 03:33:36 +02:00
Behdad Esfahbod
cf26e88a5a
Finish off debug output beautification
2012-05-11 03:16:57 +02:00
Behdad Esfahbod
d7bba01a35
Only print class name in debug output if there's one available
2012-05-11 02:46:26 +02:00
Behdad Esfahbod
85f73fa8da
Only print out the class name in tracing, if one is available
...
Makes debug output much more pleasant.
2012-05-11 02:40:42 +02:00
Behdad Esfahbod
98619ce4fa
Minor
2012-05-11 02:34:06 +02:00
Behdad Esfahbod
acea183e98
Add return annotation for APPLY
2012-05-11 02:33:11 +02:00
Behdad Esfahbod
5ccfe8e215
/Minor/
2012-05-11 02:19:41 +02:00
Behdad Esfahbod
0ab8c86217
Annotate SANITIZE return values
...
More to come, for APPLY, CLOSURE, etc.
2012-05-11 02:11:52 +02:00
Behdad Esfahbod
829e814ff3
Minor
2012-05-11 00:52:16 +02:00
Behdad Esfahbod
6eec6f406d
Code reshuffling
2012-05-11 00:50:38 +02:00
Behdad Esfahbod
1e08830b4f
Beautify debug output
2012-05-11 00:43:57 +02:00
Behdad Esfahbod
6f45538017
More massaging trace messaging
2012-05-10 23:24:43 +02:00
Behdad Esfahbod
b5fa37cb69
Minor
2012-05-10 23:09:48 +02:00
Behdad Esfahbod
208109703c
Better trace message support infrastructure
...
We have varargs in the trace interface now. To be used soon...
2012-05-10 23:06:58 +02:00
Behdad Esfahbod
02b2922fbf
[Indic] Towards better Reph positioning
...
Fixed for Deva cases with two full-form consonants. Failures **way** down.
Not much left to go :-).
2012-05-10 21:44:50 +02:00
Behdad Esfahbod
74e54cf446
[Indic] Add Ra back for scripts without Reph
...
We now check that the 'rphp' table exists before forming Reph, so
we don't need to comment out Ra for those scripts.
2012-05-10 21:22:58 +02:00
Behdad Esfahbod
2b70df5cc0
[Indic] Add note re Uniscribe clusters
2012-05-10 18:38:22 +02:00
Behdad Esfahbod
21d2803133
[Indic] Do clustering like Uniscribe does
...
Hindi Wikipedia failures down to 6639 (0.938381%)!
2012-05-10 18:34:34 +02:00
Behdad Esfahbod
8df5636968
[Indic] Reorder Reph to before the Halant after Matras
...
Uniscribe doesn't do it, but we want to, as it gives the Reph the
opportunity to interact with the Matras. Test with mangal for example.
Sequence: <0930,094d,0915,094b,094d>
In test suite already.
2012-05-10 15:41:04 +02:00
Behdad Esfahbod
daf3234bdc
[Indic] Don't clear the mask for Reph
...
This was removing the mandatory global 1 bit in the mask and hence
disabling GPOS for Reph!
2012-05-10 15:28:27 +02:00
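The bug fixed in the commit above can be sketched with plain bit masks. This is an illustration only: the mask type, the position of the global bit (assumed here to be the lowest bit) and the Reph feature bit are assumptions, and the function names are hypothetical rather than taken from the source.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t mask_t;

#define GLOBAL_BIT ((mask_t)1u)      /* assumed: lowest bit marks global features */
#define REPH_BIT   ((mask_t)1u << 1) /* hypothetical per-feature bit */

/* Buggy variant: overwriting the mask drops the global bit. */
static mask_t set_feature_clobbering(mask_t mask, mask_t feature)
{
    (void)mask; /* the previous bits are discarded */
    return feature;
}

/* Fixed variant: OR the feature in, keeping whatever was already set. */
static mask_t set_feature_preserving(mask_t mask, mask_t feature)
{
    assert(feature != 0);
    return mask | feature;
}

/* A lookup applies to a glyph only if their masks share a bit. */
static int lookup_applies(mask_t glyph_mask, mask_t lookup_mask)
{
    return (glyph_mask & lookup_mask) != 0;
}
```

In this model a globally enabled lookup carries only `GLOBAL_BIT`, so a glyph whose mask was overwritten with `REPH_BIT` alone no longer matches it, which is the kind of failure the commit describes for GPOS on Reph.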
Behdad Esfahbod
7708ee23cb
[Indic] Improve Left Matra repositioning
...
Move its dependents too.
2012-05-10 14:48:25 +02:00
Behdad Esfahbod
dbb105883c
[Indic] Do Reph repositioning in final reordering like the spec says
...
This introduced a failure, which we tracked down to a test case like this:
U+092E,U+094B,U+094D,U+0930
The final character is a Ra that should be put in a syllable of its
own. And we do. But it will interact with the Halant before it. So
now we finally are convinced that we have to limit features to syllable
boundaries. That's coming after lunch!
2012-05-10 13:45:52 +02:00
Behdad Esfahbod
4705a70269
Minor
2012-05-10 13:09:08 +02:00
Behdad Esfahbod
4ac9e98d9d
[Indic] Reorder left matras to be closer to base
2012-05-10 12:53:53 +02:00
Behdad Esfahbod
1a1fa8c655
[Indic] Treat the standalone cluster case reusing the consonant logic
2012-05-10 12:21:30 +02:00
Behdad Esfahbod
190eb31a16
[Indic] Minor
2012-05-10 12:21:30 +02:00
Behdad Esfahbod
c5306b6861
[Indic] Handle Vowel syllables
...
Reusing the consonant logic!
2012-05-10 12:21:30 +02:00
Behdad Esfahbod
6d8e0cb74c
[Indic] Simplify Reph logic
2012-05-10 11:41:51 +02:00
Behdad Esfahbod
3d25079f8d
[Indic] Don't form Reph if Ra is the only consonant in the syllable
2012-05-10 11:37:42 +02:00
Behdad Esfahbod
b99d63ae11
[Indic] Increase max syllable length
...
20 was way too low, one could hit a syllable with 7ish consonants with it.
2012-05-10 11:32:52 +02:00
Behdad Esfahbod
a391ff50b9
[Indic] Adjust base after sorting
2012-05-10 11:31:20 +02:00
Behdad Esfahbod
d3637edb24
[Indic] Don't return for long syllables. Just don't sort them.
2012-05-10 10:51:38 +02:00
Behdad Esfahbod
dfa0cade7f
Fix Uniscribe clusters with multiple items
2012-05-09 19:10:07 +02:00
Behdad Esfahbod
86e5dd386a
[Indic] Don't give up syllable parsing upon junk
2012-05-09 18:57:37 +02:00
Behdad Esfahbod
ef24cc8c8e
[Indic] Towards multi-cluster syllables and final reordering
2012-05-09 18:10:20 +02:00
Behdad Esfahbod
a9844d41c6
Combine lig_id and lig_comp into one byte, to free up one for Indic
2012-05-09 17:53:13 +02:00
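Packing two small fields into one byte, as the commit above does for lig_id and lig_comp, can be sketched as follows. The 4-bit/4-bit split and the helper names are assumptions made for illustration; the commit message does not say how the bits are divided.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: high 4 bits hold lig_id, low 4 bits hold lig_comp. */
static uint8_t pack_lig(unsigned lig_id, unsigned lig_comp)
{
    assert(lig_id < 16 && lig_comp < 16); /* each field must fit in 4 bits */
    return (uint8_t)((lig_id << 4) | lig_comp);
}

/* Extract the ligature id from the high nibble. */
static unsigned unpack_lig_id(uint8_t packed)
{
    return packed >> 4;
}

/* Extract the component index from the low nibble. */
static unsigned unpack_lig_comp(uint8_t packed)
{
    return packed & 0x0Fu;
}
```

The trade-off of such a layout is that each field is limited to 16 distinct values in exchange for freeing a whole byte per glyph for other uses.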
Behdad Esfahbod
92332e5116
Minor
2012-05-09 17:40:00 +02:00
Behdad Esfahbod
dbccf87eef
[Indic] Make room for more reordering positions
2012-05-09 17:24:39 +02:00
Behdad Esfahbod
d4480ace7f
[Indic] Improve matra vs consonant ordering
...
Another 1.5% down.
2012-05-09 15:59:47 +02:00
Behdad Esfahbod
33c92e7695
[Indic] Categorize Anudatta
2012-05-09 15:41:51 +02:00
Behdad Esfahbod
19d984edaa
[Indic] Make sure Reph jumps over all matras to the right
...
Another 12 thousand failures gone! (78 to go)
2012-05-09 15:21:13 +02:00
Behdad Esfahbod
9034641333
[Indic] Keep Vedic signs at the right too
2012-05-09 15:04:58 +02:00
Behdad Esfahbod
d1deaa2f5b
Replace zerowidth invisible chars with a zero-advance space glyph
...
Like Uniscribe does.
2012-05-09 15:04:13 +02:00
Behdad Esfahbod
49e5da1591
[indic] Keep the syllable modifier marks to the right
...
Shaping failures on Hindi Wikipedia go down from 25% to 14%!
2012-05-09 13:23:27 +02:00
Behdad Esfahbod
5b12609093
Minor
2012-05-09 12:37:27 +02:00
Behdad Esfahbod
9ce939232b
Minor
2012-05-09 12:03:09 +02:00
Behdad Esfahbod
76b3409de6
[indic] Better Reph matching
2012-05-09 11:52:32 +02:00
Behdad Esfahbod
df6d45c693
Minor
2012-05-09 11:38:31 +02:00
Behdad Esfahbod
412b91889d
[indic] Apply Indic features in order
2012-05-09 11:07:18 +02:00
Behdad Esfahbod
1ac075b227
[indic] Apply rakaar forms
...
Fixes 10% of the failures against all of Hindi Wikipedia!
2012-05-09 11:06:47 +02:00
Behdad Esfahbod
1a2a4a0078
Fix warning and build issues
...
As reported by Jonathan Kew on the list.
2012-05-05 22:38:20 +02:00
Behdad Esfahbod
a5e39fed85
Minor
2012-04-25 00:14:46 -04:00
Behdad Esfahbod
1827dc208c
Add hb_ot_shape_glyphs_closure()
...
Experimental API for now.
2012-04-24 16:56:37 -04:00
Behdad Esfahbod
bb09f0ec10
Minor
2012-04-24 16:02:12 -04:00
Behdad Esfahbod
29a7e306e3
Minor
2012-04-24 16:01:30 -04:00
Behdad Esfahbod
6c6ccaf575
Add a few more set operations
...
TODO: Tests for hb_set_t.
2012-04-24 14:23:01 -04:00
Behdad Esfahbod
5caece67ab
Make closure() return void
2012-04-23 23:03:12 -04:00
Behdad Esfahbod
0b08adb353
Add hb_set_t
2012-04-23 22:44:59 -04:00
Behdad Esfahbod
5b93e8d94f
Update copyright headers
2012-04-23 22:26:27 -04:00
Behdad Esfahbod
6a9be5bd35
Rename hb_glyph_map_t to hb_set_t
2012-04-23 22:23:17 -04:00
Behdad Esfahbod
a4385f0b0a
Improve clustering
2012-04-23 22:20:14 -04:00
Behdad Esfahbod
8e3715f8a1
Minor
2012-04-23 22:18:54 -04:00
Behdad Esfahbod
d2984a241e
Add map->substitute_closure()
2012-04-23 17:21:14 -04:00
Behdad Esfahbod
31081f7390
Implement closure() for Context and ChainContext lookups
2012-04-23 16:54:58 -04:00
Behdad Esfahbod
c64ddab3c3
Flesh out closure() for GSUB
...
The GSUBGPOS part still missing.
2012-04-23 15:28:35 -04:00
Behdad Esfahbod
0da132bde4
Fix Coverage iters
2012-04-23 14:21:33 -04:00
Behdad Esfahbod
3e32cd9570
Minor
2012-04-23 13:22:50 -04:00
Behdad Esfahbod
650ac00da3
Minor refactoring
2012-04-23 13:17:09 -04:00
Behdad Esfahbod
f94b0aa646
Add "closure" operation stubs to GSUB
...
Filling in.
2012-04-23 13:04:38 -04:00
Behdad Esfahbod
7d50d50263
Add Coverage iterators
2012-04-23 13:04:05 -04:00
Behdad Esfahbod
3ed4634ec3
Add Indic inspection tool
2012-04-19 22:35:01 -04:00
Behdad Esfahbod
a06411ecf9
Minor matra renumbering
...
Should have no visible effect.
2012-04-19 22:28:25 -04:00
Behdad Esfahbod
36608941f3
Add GSUB "would_apply" API
...
To be used in the Indic shaper later. Unused for now.
2012-04-19 22:21:38 -04:00