* hb-buffer.h: Mark hb_buffer_diff() for export
This will fix the tools builds on Visual Studio, as the symbol is used
by the tools.
* build: Adapt NMake Makefiles for GLib 2.53.4 or later
glib-mkenums was ported from a Perl script to a Python script, so we
need to update how we generate the enum sources for HarfBuzz-GObject in
the NMake builds. Let this be known in the build documentation for MSVC
builds.
One of the problems with the underlying cmd.exe that the NMake Makefiles
run on is that shebang lines are not recognized, so we need to test-run
the script with Python and check whether it succeeded by outputting a
source file that is larger than 0 in file size (since running the Perl
version of the script with Python will clearly fail and cause an empty
file to be created).
If it succeeds, we then run a small Python utility script that makes the
necessary string replacements, and we are done. If that fails, then we
run the glib-mkenums script with Perl, and do the replacements with the
Perl one-liners as we did before.
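
The try-then-fall-back logic described above could be sketched roughly as
follows. This is only an illustration in Python, not the NMake rules
themselves; the function name generate_with_fallback and the idea of
passing a list of candidate commands are assumptions made for the sketch.
As in the description, success is judged by whether the output file is
non-empty.

```python
import os
import subprocess
import sys


def generate_with_fallback(commands, output_path):
    """Run each candidate command in order, redirecting stdout to output_path.

    Returns the index of the first command that left a non-empty output
    file, or -1 if every candidate produced an empty file.
    (Hypothetical helper, for illustration only.)
    """
    for index, command in enumerate(commands):
        with open(output_path, "wb") as out:
            # A failed run (e.g. the wrong interpreter for the script)
            # leaves the file empty, which is the condition tested below.
            subprocess.call(command, stdout=out, stderr=subprocess.DEVNULL)
        if os.path.getsize(output_path) > 0:
            return index
    return -1
```

In the build, the first candidate would be the Python invocation of
glib-mkenums and the second the Perl one.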
We need to make replace.py use latin-1 encoding when using Python 3.x to
cope with the copyright sign that is in the generated enum sources.
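
A minimal sketch of why latin-1 helps here, assuming a helper named
replace_in_file (hypothetical, not the actual contents of replace.py):
latin-1 maps every byte value to a code point, so reading and writing
with it never raises a decode error and leaves bytes such as the
copyright sign untouched.

```python
import io


def replace_in_file(path, old, new, encoding="latin-1"):
    """Replace every occurrence of `old` with `new` in the file at `path`.

    latin-1 decodes any byte sequence, so non-ASCII bytes round-trip
    unchanged. newline="" disables newline translation on read and write.
    (Hypothetical helper, for illustration only.)
    """
    with io.open(path, "r", encoding=encoding, newline="") as f:
        text = f.read()
    with io.open(path, "w", encoding=encoding, newline="") as f:
        f.write(text.replace(old, new))
```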
We are going to implement the Unicode Arabic Mark Ordering Algorithm:
http://www.unicode.org/reports/tr53/tr53-1.pdf
which will reorder marks out of their sorted ccc order. Adjust the
normalizer to stop combining as soon as dangerous ordering is
detected.
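
To illustrate what "sorted ccc order" means, here is a small Python
sketch that checks whether the combining marks in a string appear in
non-decreasing canonical combining class order. It is not the HarfBuzz
normalizer; the function name marks_in_ccc_order is an assumption made
for the example.

```python
import unicodedata


def marks_in_ccc_order(text):
    """Return True if each run of marks is in non-decreasing ccc order.

    A character with ccc 0 (a base) starts a new run.
    (Hypothetical helper, for illustration only.)
    """
    previous = 0
    for ch in text:
        ccc = unicodedata.combining(ch)
        if ccc != 0 and previous > ccc:
            return False
        previous = ccc
    return True
```

For example, BEH + FATHA (ccc 30) + SHADDA (ccc 33) is in sorted order,
while BEH + SHADDA + FATHA is not.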
Apparently a base glyph can also become an attached component of a
ligature if the ligature-forming lookup used IgnoreBase. This was
being confused with a non-first component of a MultipleSubst and
hence not matched for mark-attachment. Tweak test to fix.
Fixes https://github.com/behdad/harfbuzz/issues/543
Followup to 8b2c94c43f
Allow matching sequences of marks attached to different ligatures,
as supposedly the base of the subsequent marks was already jumped
over.
New Indic numbers are:
BENGALI: 353725 out of 354188 tests passed. 463 failed (0.130722%)
DEVANAGARI: 707307 out of 707394 tests passed. 87 failed (0.0122987%)
GUJARATI: 366355 out of 366457 tests passed. 102 failed (0.0278341%)
GURMUKHI: 60729 out of 60747 tests passed. 18 failed (0.0296311%)
KANNADA: 951201 out of 951913 tests passed. 712 failed (0.0747968%)
KHMER: 299071 out of 299124 tests passed. 53 failed (0.0177184%)
MALAYALAM: 1048136 out of 1048334 tests passed. 198 failed (0.0188871%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271662 out of 271847 tests passed. 185 failed (0.068053%)
TAMIL: 1091754 out of 1091754 tests passed. 0 failed (0%)
TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)
Before 71c0a1429d GURMUKHI used to be at 15,
because Uniscribe seems to allow this character standalone, but that looks
wrong.
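
The percentages in the Indic numbers are simply failed / total * 100.
A quick way to recompute them, using a hypothetical helper name:

```python
def failure_percent(failed, total):
    """Return the share of failed tests as a percentage of all tests."""
    if total == 0:
        return 0.0
    return 100.0 * failed / total
```

For BENGALI, failure_percent(463, 354188) gives roughly 0.1307, in line
with the figure listed.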
If two marks want to ligate and they belong to different components of the
same ligature glyph, and said ligature glyph is to be ignored according to
mark-filtering rules, then allow.
Example Burmese sequence:
U+1004,U+103A,U+1039,U+101B,U+103D,U+102D
Test font provided by Norbert Lindenberg.
Fixes https://github.com/behdad/harfbuzz/issues/545
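
The rule above for two marks on different components of the same
ligature can be written as a small predicate. This is a sketch under
stated assumptions, not the actual matching code: the Mark record, its
lig_id and lig_comp fields as used here, and the function name are
hypothetical, and the sketch simply refuses marks attached to different
ligatures.

```python
from collections import namedtuple

# Hypothetical record: the ligature a mark is attached to, and the
# component of that ligature it belongs to.
Mark = namedtuple("Mark", ["lig_id", "lig_comp"])


def marks_may_ligate(first, second, ligature_is_ignored):
    """Decide whether two marks may ligate under the rule described above."""
    if first.lig_id != second.lig_id:
        # Assumption of this sketch: marks on different ligatures never match.
        return False
    if first.lig_comp == second.lig_comp:
        return True
    # Different components of the same ligature: allowed only when the
    # ligature glyph itself is ignored by the mark-filtering rules.
    return ligature_is_ignored
```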