Optimize harfbuzz big integer conversions

Profiling showed that type conversions accounted for a considerable share of the cycles spent in text shaping. The idea is to optimize them using native processor instructions to help Blink layout performance. Further investigation revealed that compilers may not emit the proper instruction (i.e. REV16) on 32-bit ARM builds. One way to ensure that the generated assembly is ideal for both gcc and clang is to use __builtin_bswap16. An added bonus is that we no longer need to test for CPU architecture.
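
For illustration only, here is a minimal standalone sketch of the same idea outside of HarfBuzz. The helper name read_be16 is hypothetical and does not exist in the library. Two things differ from the patch and are assumptions of this sketch: it loads the two bytes with memcpy instead of casting through a packed struct (compilers typically fold this into a single unaligned load), and it guards the builtin with a little-endian check, since a byte swap only yields the big-endian value on a little-endian host.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of HarfBuzz): read a big-endian 16-bit
// value from a byte pointer that may be unaligned.
static inline uint16_t read_be16(const uint8_t *p)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
    // Assumption of this sketch: memcpy replaces the packed-struct cast
    // used in the patch; it is a well-defined unaligned load.
    uint16_t raw;
    std::memcpy(&raw, p, sizeof raw);
    // On a little-endian host the loaded value is byte-reversed, so one
    // swap recovers the big-endian value.
    return __builtin_bswap16(raw);
#else
    // Portable fallback: assemble the value byte by byte.
    return static_cast<uint16_t>((p[0] << 8) | p[1]);
#endif
}
```

Whether the compiler actually emits REV16 for this depends on the target and optimization level, so the generated assembly should be inspected rather than assumed.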
2018-11-20 14:41:19 -08:00 · 2018-11-20 14:41:19 -08:00 · 4a719a7f4c
parent 064f703c7a
commit 4a719a7f4c
1 changed files with 4 additions and 0 deletions
--- a/src/hb-machinery.hh
+++ b/src/hb-machinery.hh
@@ -691,6 +691,10 @@ struct BEInt<Type, 2>
  }
  inline operator Type (void) const
  {
+#if defined(__GNUC__) || defined(__clang__)
+    struct __attribute__((packed)) packed_uint16_t { uint16_t v; };
+    return __builtin_bswap16(((packed_uint16_t *) this)->v);
+#endif
    return (v[0] <<  8)
         + (v[1]      );
  }