Previously, we were NOT actually recomposing Hangul Jamo. We do now.
The two lines in:
test/shaping/texts/in-tree/shaper-default/script-hangul/misc/misc.txt
Now render the same with the UnDotum.ttf font. Previously the second
linle was rendering boxes.
We can also start applying OpenType Jamo features later. At this time,
I have no idea how the 'ljmo', 'vjmo', 'tjmo' features are supposed to
work. Maybe someone can explain them to me?
As requested by Jonathan Kew.
We need to devise a better mechanism to choose which scripts to
pass through the Indic shaper. Moreover, currently we are storing
data for some scripts in the Indic shaper that are not even going
through that shaper. Need to find a better way...
Makes number of failures against Uniscribe with hi_IN dictionary from
OO.o to go down from 6334 to 4290. Not bad for a one-line change!
Mozilla Bug 729626 - ASAN: heap-buffer-overflow HTML
So, apparently there's no atomic int 'get' method on Apple. You have to
add(0) to get. And that's not const-friendly. So switch inert-object
checking to a non-atomic get. This, however, is safe, and a negligible
performance boost too.
Patch and description from Jonathan Kew:
It turns out that some legacy Thai fonts provide OpenType substitution
features to implement mark positioning, but (incorrectly) put those
features/lookups under the 'latn' script tag instead of using 'thai' (or
possibly 'DFLT'). See
https://bugzilla.mozilla.org/show_bug.cgi?id=719366 for an example and
more detailed description.
Although this is really a font bug, I suggest that we could improve the
rendering of such fonts by looking for the 'latn' as a fallback if
neither the requested script nor "default" is found in
hb_ot_layout_table_choose_script. Suggested patch against harfbuzz
master is attached.
This does _not_ affect the other kind of legacy Thai font, where custom
code to support vendor-specific PUA codepoints would be needed. I'm not
keen to go down that path; IMO, such fonts should be ruthlessly stamped
out in favour of standards-based solutions. :)
JK
In hb_ot_tag_from_language(), if first component of an unknown
language is three letters long, use it directly as OpenType language
tag (after case conversion and padding).
Still not sure about:
1) Case. We pass lowercase for now. Would be nice if graphite was
uppercase 3letter like OpenType,
2) Padding. IMO, tag padding is always with spaces, but Martin was
talking about NUL bytes.
Can be -1 for NUL-terminated string. This is useful for passing parts
of a larger string to a function without having to copy or modify the
string first.
Affected functions:
hb_tag_t hb_tag_from_string()
hb_direction_from_string()
hb_language_from_string()
hb_script_from_string()
As reported by Khaled on the list:
"After the introduction of canonical reordering of combining marks
(commit 34c22f8), I'm no longer able to do mark/mark substitution or
positioning for mark sequences that involve shadda as a first mark (or
most interesting sequences at least).
"After some digging, it turned out that shadda have a ccc=33 while most
Arabic marks that combine with it have a lower ccc value, which results
in the shadda being reordered after the other mark which,
unsurprisingly, breaks my contextual substitution and mkmk anchors."
See:
http://unicode.org/faq/normalization.html#8http://unicode.org/faq/normalization.html#9
For two reasons:
1. User can always call hb_buffer_pre_allocate() themselves, and
2. Now we do a pre_alloc in add_utfX anyway, so the total number of
reallocs is limited to a small number (~3) anyway. This just makes the
API cleaner.
According to Peter Constable this is indeed what Uniscribe has been
doing for years.
Mozilla Bug 667166 - wrong shape of letter when it comes at the end of
word in the arabic version of Firefox 5.0