UCDN was ~120kb of data. The new implementation is 69kb in default builds,
and 49kb if built with HB_OPTIMIZE_SIZE or __OPTIMIZE_SIZE__. The
latter is automatically enabled if built with -Os or -Oz.
There's room to shave off another 10kb or 20kb. That will follow later.
Fixes https://github.com/harfbuzz/harfbuzz/issues/1652
Fedora upgraded to ragel 7, which is buggy if char is signed.
Switching to -G2 output fails with sign-compare error:
../../src/hb-buffer-deserialize-json.hh:107:12: error: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘const char’ [-Werror=sign-compare]
if ( 9u <= ( (*( p))) && ( (*( p))) <= 13u ) {
~~~^~~~~~~~~~~~~
Switching to -T1 for now. It actually results in smaller code,
at the expense of some binary searching instead of flat tables.
In the not-too-distant future, we might actually generate two different
outputs and choose between them depending on size-optimize options.
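As a rough illustration of the trade-off mentioned above (not Ragel's actual generated code), the sketch below classifies a byte either through a flat 256-entry table or by binary search over sorted ranges. The character classes and function names are hypothetical.

```python
import bisect

# Hypothetical character classes as (first, last, class) ranges, sorted by
# first byte. These are illustrative, not taken from any HarfBuzz machine.
RANGES = [(9, 13, 'space'), (32, 32, 'space'), (48, 57, 'digit')]

# Flat table: one slot per possible byte value. Fast lookup, more data.
FLAT = [None] * 256
for lo, hi, cls in RANGES:
    for c in range(lo, hi + 1):
        FLAT[c] = cls

# Binary search: only the range starts are stored as keys. Less data,
# but each lookup costs a search.
STARTS = [lo for lo, _, _ in RANGES]


def classify_flat(byte):
    """Look the class up directly by index."""
    return FLAT[byte]


def classify_search(byte):
    """Find the last range starting at or before byte, then check its end."""
    i = bisect.bisect_right(STARTS, byte) - 1
    if i >= 0 and byte <= RANGES[i][1]:
        return RANGES[i][2]
    return None
```

Both functions return the same class for every byte value; only the size of the stored data and the cost per lookup differ.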
Fixes https://github.com/harfbuzz/harfbuzz/issues/1708
The new script, gen-tag-table.py, generates `ot_languages` automatically
from the [OpenType language system tag registry][ot] and the [IANA
Language Subtag Registry][bcp47] with some manual modifications. If an
OpenType tag maps to a BCP 47 macrolanguage, all the macrolanguage's
individual languages are mapped to the same OpenType tag, except for
individual languages with their own OpenType mappings. Deprecated
BCP 47 tags are canonicalized.
[ot]: https://docs.microsoft.com/en-us/typography/opentype/spec/languagetags
[bcp47]: https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry
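The macrolanguage rule can be sketched as follows. This is a minimal illustration, not the code in gen-tag-table.py: the function name, the data layout, and the small sample tables are assumptions, and the sample entries are not a copy of either registry.

```python
def expand_macrolanguages(ot_to_bcp47, macrolanguages):
    """Build a BCP 47 -> OpenType tag list map, expanding macrolanguages.

    ot_to_bcp47: OpenType tag -> list of language subtags it maps to.
    macrolanguages: macrolanguage subtag -> list of individual languages.
    """
    # Languages that already have an OpenType mapping of their own.
    explicit = {lang for langs in ot_to_bcp47.values() for lang in langs}
    result = {}
    for ot_tag, langs in ot_to_bcp47.items():
        for lang in langs:
            result.setdefault(lang, []).append(ot_tag)
            # Individual languages inherit the macrolanguage's tag unless
            # they have their own explicit mapping.
            for member in macrolanguages.get(lang, ()):
                if member not in explicit:
                    result.setdefault(member, []).append(ot_tag)
    return result


# Illustrative sample data (assumed for this sketch).
sample_ot = {'ARA ': ['ar'], 'MOR ': ['ary']}
sample_macro = {'ar': ['arz', 'ary']}
expanded = expand_macrolanguages(sample_ot, sample_macro)
```

With this sample data, arz inherits 'ARA ' from its macrolanguage ar, while ary keeps only its own 'MOR ' mapping.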
Some OpenType tags correspond to multiple ISO 639 codes. The mapping
from ISO 639 codes lists OpenType tags in priority order, such that more
specific or more likely tags appear first.
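One way a caller might use the priority order is to take the first candidate tag that a font actually supports. The sketch below assumes that usage; the function name and the tags 'AAA ' and 'BBB ' are hypothetical placeholders, not real registry entries.

```python
def choose_tag(candidates, font_tags):
    """Return the first candidate OpenType tag present in the font, if any.

    candidates: tags in priority order, most specific or likely first.
    font_tags: set of language system tags the font provides.
    """
    for tag in candidates:
        if tag in font_tags:
            return tag
    return None


# Hypothetical priority list for one ISO 639 code.
candidates = ['AAA ', 'BBB ']
```

For example, `choose_tag(candidates, {'BBB '})` falls through to the second, less specific tag, and a font with neither tag yields `None`.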
Some OpenType tags have no corresponding ISO 639 code in the registry so
their mappings use BCP 47 subtags besides the language. For example, any
BCP 47 tag with a fonipa variant subtag is mapped to 'IPPH', and 'IPPH'
is mapped back to und-fonipa.
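The fonipa example can be sketched in both directions. This is a simplified illustration with hypothetical function names; it treats any non-initial fonipa subtag as the variant and handles no other special cases.

```python
def ot_tag_from_bcp47(tag):
    """Map a BCP 47 tag to 'IPPH' if it carries the fonipa variant."""
    subtags = tag.lower().split('-')
    # The first subtag is the language; look for fonipa among the rest.
    if 'fonipa' in subtags[1:]:
        return 'IPPH'
    return None


def bcp47_from_ot_tag(ot_tag):
    """Map 'IPPH' back to a BCP 47 tag with an undetermined language."""
    if ot_tag == 'IPPH':
        return 'und-fonipa'
    return None
```

The round trip is lossy by design: en-fonipa maps to 'IPPH', but 'IPPH' maps back to und-fonipa because the OpenType tag carries no language.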
Other OpenType tags have no corresponding ISO 639 code because it is not
clear what they are for. HarfBuzz just ignores these tags.
One such ignored tag is 'ZHP ' (Chinese Phonetic). It probably means
zh-Latn. However, it is used in Microsoft JhengHei and Microsoft YaHei
with the script tag 'hani', implying that it is not a romanization
scheme after all. It would be simple enough to add this mapping to
gen-tag-table.py once a definitive mapping is determined.
The manual modifications are mainly either obvious mappings that the
OpenType registry omits or mappings for compatibility with previous
versions of HarfBuzz. Some of the old mappings were discarded, though,
where they existed only because of homophonous language names. For
example, OpenType maps 'KUI ' to kxu; previous versions of HarfBuzz also
mapped it to kvd, because kvd and kxu both happen to be called "Kui".
gen-tag-table.py also generates a function to convert multi-subtag tags
like el-polyton and zh-HK to OpenType tags, replacing `ot_languages_zh`
and the hard-coded list of special cases in `hb_ot_tags_from_language`.
It also generates a function to convert OpenType tags to BCP 47,
replacing the hard-coded list of special cases in
`hb_ot_tag_to_language`.
Automake has this stupid behavior where if your Makefile.am has a
syntax error, it can get into a state where make succeeds but just
ignores the broken Makefile.am. Ouch.
Before 1.7.5, we were setting -fno-exceptions etc on CXXFLAGS. In 1.7.6
we set them in CPPFLAGS. Try fixing. Also, I'm fairly sure it's safe to
set these unconditionally.
Fixes https://github.com/harfbuzz/harfbuzz/issues/880 (or so I hope)