* hb_ot_layout_feature_get_characters
* hb_ot_layout_feature_get_name_ids
However HarfBuzz currently doesn't expose an API for retrieving the actual
information associated with NameId from the `name` table and that should be
done separately.
Was givin a dozen of:
../../src/hb-machinery.hh: In member function ‘bool AAT::ankr::sanitize(hb_sanitize_context_t*) const’:
../../src/hb-machinery.hh:307:23: warning: missed loop optimization, the loop counter may overflow [-Wunsafe-loop-optimizations]
bool ok = --this->max_ops > 0 &&
~~~~~~~~~~~~~~~~~~~~~~
this->start <= p &&
~~~~~~~~~~~~~~~~~~~
p <= this->end &&
~~~~~~~~~~~~~~~^~
(unsigned int) (this->end - p) >= len;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I believe those are bogus, but this silences them and does not introduce
logic issues I believe.
We haven't been keeping this updated. So, while we don't expose the
symbols in the headers if HB_DISABLE_DEPRECATED is defined, we still
always build them.
Thanks to David Corbett who revamped our script and language processing
and implemented full BCP 47 support.
https://github.com/harfbuzz/harfbuzz/pull/730
New API:
+hb_ot_layout_table_select_script()
+hb_ot_layout_script_select_language()
+HB_OT_MAX_TAGS_PER_SCRIPT
+HB_OT_MAX_TAGS_PER_LANGUAGE
+hb_ot_tags_from_script_and_language()
+hb_ot_tags_to_script_and_language()
Deprecated API:
-hb_ot_layout_table_choose_script()
-hb_ot_layout_script_find_language()
-hb_ot_tags_from_script()
-hb_ot_tag_from_language()
OpenType only officially maps four ISO 639 codes to Quechua languages,
but prior versions of HarfBuzz also mapped qu to 'QUZ '. Because qu is a
macrolanguage, the mapping now applies to all individual Quechua
languages. OpenType calls 'QUZ ' "Quechua", but it really corresponds to
Cusco Quechua, so the individual Quechua languages should not all
necessarily be mapped to it.
If the second subtag of a BCP 47 tag is three letters long, it denotes
an extended language. The tag converter ignores the language subtag and
uses the extended language instead.
There are some grandfathered exceptions, which are handled earlier.
The new script, gen-tag-table.py, generates `ot_languages` automatically
from the [OpenType language system tag registry][ot] and the [IANA
Language Subtag Registry][bcp47] with some manual modifications. If an
OpenType tag maps to a BCP 47 macrolanguage, all the macrolanguage's
individual languages are mapped to the same OpenType tag, except for
individual languages with their own OpenType mappings. Deprecated
BCP 47 tags are canonicalized.
[ot]: https://docs.microsoft.com/en-us/typography/opentype/spec/languagetags
[bcp47]: https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry
Some OpenType tags correspond to multiple ISO 639 codes. The mapping
from ISO 639 codes lists OpenType tags in priority order, such that more
specific or more likely tags appear first.
Some OpenType tags have no corresponding ISO 639 code in the registry so
their mappings use BCP 47 subtags besides the language. For example, any
BCP 47 tag with a fonipa variant subtag is mapped to 'IPPH', and 'IPPH'
is mapped back to und-fonipa.
Other OpenType tags have no corresponding ISO 639 code because it is not
clear what they are for. HarfBuzz just ignores these tags.
One such ignored tag is 'ZHP ' (Chinese Phonetic). It probably means
zh-Latn. However, it is used in Microsoft JhengHei and Microsoft YaHei
with the script tag 'hani', implying that it is not a romanization
scheme after all. It would be simple enough to add this mapping to
gen-tag-table.py once a definitive mapping is determined.
The manual modifications are mainly either obvious mappings that the
OpenType registry omits or mappings for compatibility with previous
versions of HarfBuzz. Some of the old mappings were discarded, though,
for homophonous language names. For example, OpenType maps 'KUI ' to
kxu; previous versions of HarfBuzz also mapped it to kvd, because kvd
and kxu both happen to be called "Kui".
gen-tag-table.py also generates a function to convert multi-subtag tags
like el-polyton and zh-HK to OpenType tags, replacing `ot_languages_zh`
and the hard-coded list of special cases in `hb_ot_tags_from_language`.
It also generates a function to convert OpenType tags to BCP 47,
replacing the hard-coded list of special cases in
`hb_ot_tag_to_language`.
The old hb-ot-tag.cc functions, `hb_ot_tags_from_script` and
`hb_ot_tag_from_language`, are now wrappers around a new function:
`hb_ot_tags`. It converts a script and a language to arrays of script
tags and language tags. This will make it easier to add new script tags
to scripts, like 'dev3'. It also allows for language fallback chains;
nothing produces more than one language yet though.
Where the old functions return the default tags 'DFLT' and 'dflt',
`hb_ot_tags` returns an empty array. The caller is responsible for
using the default tag in that case.
The new function also adds a new private use subtag syntax for script
overrides: "x-hbscabcd" requests a script tag of 'abcd'.
The old hb-ot-layout.cc functions,`hb_ot_layout_table_choose_script` and
`hb_ot_layout_script_find_language` are now wrappers around the new
functions `hb_ot_layout_table_select_script` and
`hb_ot_layout_script_select_language`. They are essentially the same as
the old ones plus a tag count parameter.
Closes#495.
`hb_language_from_string` accepts not only ISO 639 but also BCP 47. Not
all ISO 639 codes are valid BCP 47 tags but the function does not accept
overlong language subtags anyway.
Tested using Kannada MN:
$ HB_OPTIONS=aat ./hb-shape Kannada\ MN.ttc -u 0C95,0CCd,C95,CCD
[kn_ka.virama=0+1299|kn_ka.vattu=0+115|_blank=0@-115,0+385]
$ HB_OPTIONS=aat ./hb-shape Kannada\ MN.ttc -u 0C95,0CCd,C95,CCD --features=-kern
[kn_ka.virama=0+1799|kn_ka.vattu=0+230|_blank=0+0]
I don't see the GPOS table in the font do the same. ¯\_(ツ)_/¯
Tested that Format0 works with Kannada MN font:
$ make -j5 lib -s && HB_OPTIONS=aat ./hb-shape Kannada\ MN.ttc -u 0C95,0CC2
[kn_ka=0+1000|kn_matra_uu=0@-30,0+1345]
$ make -j5 lib -s && HB_OPTIONS=aat ./hb-shape Kannada\ MN.ttc -u 0C95,0CC2 --features=-kern
[kn_ka=0+1030|kn_matra_uu=0+1375]
Note that GPOS does the same with 'dist' feature, and applies the whole difference to the
same glyph:
$ make -j5 lib -s && ./hb-shape Kannada\ MN.ttc -u 0C95,0CC2
[kn_ka=0+970|kn_matra_uu=0+1375]
$ make -j5 lib -s && ./hb-shape Kannada\ MN.ttc -u 0C95,0CC2 --features=-dist
[kn_ka=0+1030|kn_matra_uu=0+1375]
Makes our FT-backed hb_font_t safe to use from multiple threads. Still,
the underlying FT_Face should NOT be used from other threads by client
or other libraries.
Maybe I add a lock()/unlock() public API ala PangoFT2 and cairo-ft.
Maybe not.
Now that we have get_h_advances() and get_nominal_glyphs() implemented, the
overhead of doing a proper atomic load would be once per run, NOT once per
glyph. So, no need to pre-load the tables to avoid that overhead.
As such, hb_ot_font_set_funcs() has become really cheap. Can *finally* make
it be default font functions on all newly created fonts!
Some more measurable speedup. The recent commits' speedups are as follows:
Testing with Roboto, ****when disabling kern and liga****:
Before:
FT --features=-kern,-liga
user↦ 0m0.521s
OT --features=-liga,-kern
user↦ 0m0.568s
After:
FT --features=-liga,-kern
user↦ 0m0.428s
OT --features=-liga,-kern
user↦ 0m0.470s
So, 17% speedup.
Note that FT callbacks are faster than OT these days since we added an advance
cache to FT. I don't think the difference is enough to justify adding a cache
to OT.
When not disabling kern, the thing is three times slower, so the speedups
are three times less impressive... Still, 5% not bad for a codebase that I
otherwise thought is optimized out.
Note that, because of this and other optimiztions in our main shaper,
disabling kern and liga, the OT shaper is now *faster* than the fallback
shaper. So, that's my recommendation to clients that need the absolute
fastest...
Unused as of now. To be wired up to normalizer, which would remove
overhead and allow hb-ot-font initialization to become a no-op, so
we can enable it by default.
Instead of passing a CFLAG/CXXFLAG to define HB_EXTERN, define it
directly in src/hb.hh as __declspec(dllexport) extern when we are
building HarfBuzz as DLLs on Visual Studio. Define HB_INTERNAL
as nothing without defining HB_NO_VISIBILITY when building HarfBuzz as
DLLs to avoid linker errors on Visual Studio builds.
Also "install" harfbuzz-subset.dll into $(PREFIX)\bin as the
hb-subset utility will depend on that DLL at runtime, when HarfBuzz is
built as DLLs. Since it consists of private APIs that are subject to
change, we do not install its headers nor .lib file.