If compiler doesn't inline StructAtOffset, this was an error since we
only disable cast-align at call-site. So, move the cast out.
../src/hb-machinery.hh: In instantiation of 'const Type& StructAtOffset(const void*, unsigned int) [with Type = unsigned int]':
../src/hb-font.cc:146:85: required from here
../src/hb-machinery.hh:63:12: error: cast from 'const char*' to 'const unsigned int*' increases required alignment of target type [-Werror=cast-align]
{ return * reinterpret_cast<const Type*> ((const char *) P + offset); }
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../src/hb-machinery.hh: In instantiation of 'Type& StructAtOffset(void*, unsigned int) [with Type = unsigned int]':
../src/hb-font.cc:147:79: required from here
../src/hb-machinery.hh:66:12: error: cast from 'char*' to 'unsigned int*' increases required alignment of target type [-Werror=cast-align]
{ return * reinterpret_cast<Type*> ((char *) P + offset); }
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Now that we have get_h_advances() and get_nominal_glyphs() implemented, the
overhead of doing a proper atomic load would be once per run, NOT once per
glyph. So, no need to pre-load the tables to avoid that overhead.
As such, hb_ot_font_set_funcs() has become really cheap. Can *finally* make
it be default font functions on all newly created fonts!
Some more measurable speedup. The recent commits' speedups are as follows:
Testing with Roboto, ****when disabling kern and liga****:
Before:
FT --features=-kern,-liga
user↦ 0m0.521s
OT --features=-liga,-kern
user↦ 0m0.568s
After:
FT --features=-liga,-kern
user↦ 0m0.428s
OT --features=-liga,-kern
user↦ 0m0.470s
So, 17% speedup.
Note that FT callbacks are faster than OT these days since we added an advance
cache to FT. I don't think the difference is enough to justify adding a cache
to OT.
When not disabling kern, the thing is three times slower, so the speedups
are three times less impressive... Still, 5% not bad for a codebase that I
otherwise thought is optimized out.
Note that, because of this and other optimiztions in our main shaper,
disabling kern and liga, the OT shaper is now *faster* than the fallback
shaper. So, that's my recommendation to clients that need the absolute
fastest...