13948 Commits

Author SHA1 Message Date
Behdad Esfahbod f5d619be79 [ot-tags] Further gate the slow complex case, and add more tests
Part of https://github.com/harfbuzz/harfbuzz/issues/3591

Still 'zh-trad' is the slowest case.

--------------------------------------------------------------------------------------------------
Benchmark                                                        Time             CPU   Iterations
--------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON zh_trad          136 ns          136 ns      5107838
BM_hb_ot_tags_from_script_and_language/COMMON ab_abcd          115 ns          115 ns      6103104
BM_hb_ot_tags_from_script_and_language/COMMON ab_abc          25.4 ns         25.3 ns     27674482
BM_hb_ot_tags_from_script_and_language/COMMON abcdef_XY       20.2 ns         20.1 ns     34795719
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY         19.4 ns         19.3 ns     36390401
BM_hb_ot_tags_from_script_and_language/COMMON cxy_CN          33.5 ns         33.4 ns     20998939
BM_hb_ot_tags_from_script_and_language/COMMON exy_CN          25.1 ns         25.0 ns     27705832
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN           34.2 ns         34.1 ns     20564356
BM_hb_ot_tags_from_script_and_language/COMMON en_US           15.5 ns         15.5 ns     45032204
BM_hb_ot_tags_from_script_and_language/LATIN en_US            15.9 ns         15.8 ns     44412379
BM_hb_ot_tags_from_script_and_language/COMMON none            4.72 ns         4.71 ns    149101665
BM_hb_ot_tags_from_script_and_language/LATIN none             4.72 ns         4.70 ns    149254498
2022-05-18 11:04:52 -06:00
Behdad Esfahbod 9c64bda21d [ot-tag] Whitespace 2022-05-17 17:31:18 -06:00
Behdad Esfahbod 3df8017e9b [ot-tag] Optimize subtag_matches() more 2022-05-17 17:29:39 -06:00
Behdad Esfahbod b231fc2dbc [perf/benchmark-ot] Add a couple more test cases 2022-05-17 17:03:37 -06:00
Behdad Esfahbod 3524b14fa0 [perf/benchmark-ot] Add a couple more test cases 2022-05-17 17:02:48 -06:00
Behdad Esfahbod 7f6e8c5536 [ot-tags] Optimize subtag_matches() further
Part of https://github.com/harfbuzz/harfbuzz/issues/3591

Comparing before to after
Benchmark                                                               Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY                -0.3371         -0.3371            71            47            71            47
2022-05-17 16:58:35 -06:00
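The first two columns of the comparison table in 7f6e8c5536 are relative changes. A minimal sketch of that arithmetic, assuming the comparison tool reports (new - old) / |old|; the function name `relative_change` is made up for illustration:

```cpp
#include <cassert>
#include <cmath>

// Relative change between an old and a new timing.
// A negative result means the new run is faster.
static double relative_change (double old_value, double new_value)
{
  assert (old_value != 0.0);
  return (new_value - old_value) / std::fabs (old_value);
}
```

With the rounded timings shown (71 and 47) this gives roughly -0.338. The table's -0.3371 was presumably computed from the unrounded measurements.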
Behdad Esfahbod 27c11405a2 [ot-tag] Optimize subtag_matches
Part of https://github.com/harfbuzz/harfbuzz/issues/3591
2022-05-17 16:51:51 -06:00
Behdad Esfahbod a07d818597 [ot-tag] Add a likely() to the cache hit case 2022-05-17 16:46:10 -06:00
Behdad Esfahbod 0ff5d36cd4 [perf/benchmark-ot] Fix benchmark
Part of https://github.com/harfbuzz/harfbuzz/issues/3591

Ouch!

These are the current numbers:

------------------------------------------------------------------------------------------------
Benchmark                                                      Time             CPU   Iterations
------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY       78.0 ns         77.7 ns      8917912
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN         44.9 ns         44.8 ns     15475318
BM_hb_ot_tags_from_script_and_language/COMMON en_US         17.6 ns         17.5 ns     39812340
BM_hb_ot_tags_from_script_and_language/LATIN en_US          18.2 ns         18.1 ns     38356204
BM_hb_ot_tags_from_script_and_language/COMMON none          4.76 ns         4.74 ns    148746131
BM_hb_ot_tags_from_script_and_language/LATIN none           4.73 ns         4.71 ns    148421349
2022-05-17 16:38:19 -06:00
Behdad Esfahbod dfca47f419 [ot-tag] Cache last bsearch result
Part of https://github.com/harfbuzz/harfbuzz/issues/3591

Humm. Looks like not all of the fat is bsearch overhead now. I cached
the last bsearch result, but most of the time is still there. I'm
baffled.

Before:
------------------------------------------------------------------------------------------------
Benchmark                                                      Time             CPU   Iterations
------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY       8.08 ns         8.05 ns     84500482
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN         42.2 ns         42.1 ns     16722006
BM_hb_ot_tags_from_script_and_language/COMMON en_US         16.1 ns         16.0 ns     43461527
BM_hb_ot_tags_from_script_and_language/LATIN en_US          16.5 ns         16.5 ns     42448505
BM_hb_ot_tags_from_script_and_language/COMMON none          4.34 ns         4.33 ns    161290530
BM_hb_ot_tags_from_script_and_language/LATIN none           4.34 ns         4.33 ns    162339799

After:
------------------------------------------------------------------------------------------------
Benchmark                                                      Time             CPU   Iterations
------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY       8.13 ns         8.11 ns     80438134
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN         40.0 ns         39.9 ns     17487939
BM_hb_ot_tags_from_script_and_language/COMMON en_US         12.7 ns         12.7 ns     55124394
BM_hb_ot_tags_from_script_and_language/LATIN en_US          13.1 ns         13.0 ns     53660125
BM_hb_ot_tags_from_script_and_language/COMMON none          4.61 ns         4.60 ns    151394104
BM_hb_ot_tags_from_script_and_language/LATIN none           4.70 ns         4.68 ns    150402847
2022-05-17 16:21:02 -06:00
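The caching idea in dfca47f419 can be sketched roughly as below: remember the index of the last successful lookup and test it before running the binary search. This is only an illustration under assumptions. The table, `find_cached`, and `SKETCH_LIKELY` are hypothetical names, the static cache is not thread-safe as written, and the real change may look quite different. The branch hint stands in for the likely() mentioned in a07d818597.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical branch hint; falls back to a no-op on other compilers.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_LIKELY(x) __builtin_expect (!!(x), 1)
#else
#define SKETCH_LIKELY(x) (x)
#endif

// Hypothetical sorted table of integer keys.
static const uint32_t sketch_keys[] = {10, 20, 30, 40, 50};
static const unsigned sketch_keys_len = sizeof (sketch_keys) / sizeof (sketch_keys[0]);

// Returns the index of `key` in sketch_keys, or -1 if it is absent.
static int find_cached (uint32_t key)
{
  static unsigned last_index = 0; // index of the previous hit (not thread-safe)
  assert (last_index < sketch_keys_len);

  // Fast path: same key as last time, no search needed.
  if (SKETCH_LIKELY (sketch_keys[last_index] == key))
    return (int) last_index;

  // Slow path: ordinary binary search, then remember the hit.
  const uint32_t *end = sketch_keys + sketch_keys_len;
  const uint32_t *it = std::lower_bound (sketch_keys, end, key);
  if (it == end || *it != key)
    return -1;
  last_index = (unsigned) (it - sketch_keys);
  return (int) last_index;
}
```

A cache like this only pays off when consecutive calls repeat the same key, which fits the en_US rows improving in the "After" table while zh_CN barely moves.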
Behdad Esfahbod 909f00ac6e [ot-tags] Further speed up language bsearch()
Using an integer tag to bsearch, instead of string.

Part of: https://github.com/harfbuzz/harfbuzz/issues/3591

Before:
------------------------------------------------------------------------------------------------
Benchmark                                                      Time             CPU   Iterations
------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY       8.11 ns         8.08 ns     87067795
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN         53.6 ns         53.5 ns     13042418
BM_hb_ot_tags_from_script_and_language/COMMON en_US         24.2 ns         24.1 ns     29052731
BM_hb_ot_tags_from_script_and_language/LATIN en_US          24.4 ns         24.3 ns     28736769
BM_hb_ot_tags_from_script_and_language/COMMON none          4.43 ns         4.41 ns    160370413
BM_hb_ot_tags_from_script_and_language/LATIN none           4.35 ns         4.34 ns    160578191

After:
------------------------------------------------------------------------------------------------
Benchmark                                                      Time             CPU   Iterations
------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY       7.97 ns         7.95 ns     85208363
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN         41.7 ns         41.6 ns     16945817
BM_hb_ot_tags_from_script_and_language/COMMON en_US         16.1 ns         16.0 ns     43613523
BM_hb_ot_tags_from_script_and_language/LATIN en_US          16.5 ns         16.4 ns     42568107
BM_hb_ot_tags_from_script_and_language/COMMON none          4.30 ns         4.29 ns    164055469
BM_hb_ot_tags_from_script_and_language/LATIN none           4.29 ns         4.27 ns    163793591
2022-05-17 15:51:41 -06:00
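One way to read "using an integer tag to bsearch, instead of string" in 909f00ac6e is sketched below: pack the 2- or 3-letter code into one integer whose numeric order matches byte-wise string order, then binary-search on integers. `pack_lang`, `LangEntry`, `sketch_table`, and `find_ot_tag` are hypothetical names and the three table rows are illustrative only; this is not the library's actual table layout.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>

// Pack a 2- or 3-letter code into one integer. A missing third letter
// packs as 0, so integer order equals byte-wise string order.
static uint32_t pack_lang (const char *s)
{
  std::size_t len = std::strlen (s);
  assert (len == 2 || len == 3);
  uint32_t v = 0;
  for (std::size_t i = 0; i < 3; i++)
    v = (v << 8) | (i < len ? (uint32_t) (unsigned char) s[i] : 0u);
  return v;
}

struct LangEntry { uint32_t code; const char *ot_tag; };

// Illustrative excerpt, sorted by packed code.
static const LangEntry sketch_table[] = {
  {0x656E00u /* "en" */, "ENG "},
  {0x667200u /* "fr" */, "FRA "},
  {0x7A6800u /* "zh" */, "ZHS "},
};

// Returns the tag string for `lang`, or nullptr if it is not in the table.
static const char *find_ot_tag (const char *lang)
{
  uint32_t key = pack_lang (lang);
  const LangEntry *begin = sketch_table;
  const LangEntry *end = begin + sizeof (sketch_table) / sizeof (sketch_table[0]);
  const LangEntry *it = std::lower_bound (begin, end, key,
    [] (const LangEntry &e, uint32_t k) { return e.code < k; });
  return (it != end && it->code == key) ? it->ot_tag : nullptr;
}
```

Each probe of the search is then a single integer compare rather than a string compare.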
Behdad Esfahbod c460cf74ce [ot-tags] Cosmetic 2022-05-17 15:30:11 -06:00
Behdad Esfahbod 1c8226ed14 Fix compiler warning
On Mac compiler:

FAILED: src/libharfbuzz.0.dylib.p/hb-ot-tag.cc.o
c++ -Isrc/libharfbuzz.0.dylib.p -Isrc -I../src -I. -I.. -I/usr/local/opt/freetype/include/freetype2 -I/usr/local/Cellar/graphite2/1.3.14/include -I/usr/local/Cellar/glib/2.72.1/include/glib-2.0 -I/usr/local/Cellar/glib/2.72.1/lib/glib-2.0/include -I/usr/local/opt/gettext/include -I/usr/local/Cellar/pcre/8.45/include -Xclang -fcolor-diagnostics --coverage -pipe -Wall -Winvalid-pch -Wnon-virtual-dtor -std=c++11 -fno-rtti -O2 -g -fno-exceptions -fno-rtti -fno-threadsafe-statics -fvisibility-inlines-hidden -DHAVE_CONFIG_H -Wno-non-virtual-dtor -MD -MQ src/libharfbuzz.0.dylib.p/hb-ot-tag.cc.o -MF src/libharfbuzz.0.dylib.p/hb-ot-tag.cc.o.d -o src/libharfbuzz.0.dylib.p/hb-ot-tag.cc.o -c ../src/hb-ot-tag.cc
In file included from ../src/hb-ot-tag.cc:29:
In file included from ../src/hb.hh:481:
../src/hb-array.hh:359:14: error: missing default argument on parameter 'ds'
              Ts... ds) const
                    ^
../src/hb-ot-tag.cc:292:58: note: in instantiation of function template specialization 'hb_sorted_array_t<const LangTag>::bfind<const char *, unsigned int>' requested here
    if (hb_sorted_array (ot_languages, ot_languages_len).bfind (lang_str, &tag_idx,
                                                         ^
1 error generated.
2022-05-17 15:28:50 -06:00
Behdad Esfahbod c1f4b57c06 [ot-tags] Optimize language comparison
Now that we know both strings are of equal len of 2 or 3, optimize.

Part of https://github.com/harfbuzz/harfbuzz/issues/3591

Before:
------------------------------------------------------------------------------------------------
Benchmark                                                      Time             CPU   Iterations
------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY       8.50 ns         8.47 ns     81221549
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN         79.6 ns         79.3 ns      8785804
BM_hb_ot_tags_from_script_and_language/COMMON en_US         40.0 ns         39.9 ns     17462768
BM_hb_ot_tags_from_script_and_language/LATIN en_US          39.2 ns         39.1 ns     17886793
BM_hb_ot_tags_from_script_and_language/COMMON none          4.31 ns         4.30 ns    162805417
BM_hb_ot_tags_from_script_and_language/LATIN none           4.32 ns         4.31 ns    162656688

After:
------------------------------------------------------------------------------------------------
Benchmark                                                      Time             CPU   Iterations
------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY       8.27 ns         8.24 ns     81868701
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN         56.1 ns         56.0 ns     12353284
BM_hb_ot_tags_from_script_and_language/COMMON en_US         24.3 ns         24.2 ns     28955030
BM_hb_ot_tags_from_script_and_language/LATIN en_US          24.5 ns         24.4 ns     28664868
BM_hb_ot_tags_from_script_and_language/COMMON none          4.35 ns         4.34 ns    161190014
BM_hb_ot_tags_from_script_and_language/LATIN none           4.36 ns         4.34 ns    161319000
2022-05-17 15:19:40 -06:00
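The observation in c1f4b57c06 is that once both codes are known to have the same length of 2 or 3, the comparison no longer needs to scan for a terminator. A minimal sketch of such a comparison, assuming the caller has already checked the lengths; `cmp_equal_len` is a hypothetical name and the actual commit may do this differently:

```cpp
#include <cassert>

// Three-way compare of two codes already known to share a length of 2 or 3.
// The loop bound is fixed, so no terminator scan is needed.
static int cmp_equal_len (const char *a, const char *b, unsigned len)
{
  assert (len == 2 || len == 3);
  for (unsigned i = 0; i < len; i++)
    if (a[i] != b[i])
      return (unsigned char) a[i] < (unsigned char) b[i] ? -1 : 1;
  return 0;
}
```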
Behdad Esfahbod dde48d78c1 Fix compiler warning 2022-05-17 15:07:49 -06:00
Behdad Esfahbod 15be0deda0 [ot-tags] Optimize lang_matches()
Part of https://github.com/harfbuzz/harfbuzz/issues/3591

Before:
------------------------------------------------------------------------------------------------
Benchmark                                                      Time             CPU   Iterations
------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY       8.67 ns         8.64 ns     80324382
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN         91.2 ns         90.9 ns      7674131
BM_hb_ot_tags_from_script_and_language/COMMON en_US         41.1 ns         41.0 ns     17174093
BM_hb_ot_tags_from_script_and_language/LATIN en_US          41.3 ns         41.2 ns     17000876
BM_hb_ot_tags_from_script_and_language/COMMON none          4.56 ns         4.55 ns    153914130
BM_hb_ot_tags_from_script_and_language/LATIN none           4.53 ns         4.52 ns    153830303

After:
------------------------------------------------------------------------------------------------
Benchmark                                                      Time             CPU   Iterations
------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY       8.24 ns         8.21 ns     84078465
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN         77.5 ns         77.2 ns      9059230
BM_hb_ot_tags_from_script_and_language/COMMON en_US         38.8 ns         38.7 ns     17790692
BM_hb_ot_tags_from_script_and_language/LATIN en_US          37.6 ns         37.5 ns     18648293
BM_hb_ot_tags_from_script_and_language/COMMON none          4.50 ns         4.49 ns    155573267
BM_hb_ot_tags_from_script_and_language/LATIN none           4.49 ns         4.47 ns    156456653
2022-05-17 14:57:08 -06:00
Behdad Esfahbod 407a135baf [perf/benchmark-ot] Add one more test 2022-05-17 14:45:45 -06:00
Behdad Esfahbod dd3c858f84 [ot-tags] Speed up hb_ot_tags_from_language()
Part of https://github.com/harfbuzz/harfbuzz/issues/3591

"After that, bulk of the time I suppose is spent in binary-searching the
language table. I suggest we split the language table in 2-letter and
3-letter tags, to speed-up the vast majority of cases that are
2-letter."

benchmark-ot, before:

----------------------------------------------------------------------------------------------
Benchmark                                                    Time             CPU   Iterations
----------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN        112 ns          111 ns      6286271
BM_hb_ot_tags_from_script_and_language/COMMON en_US       60.6 ns         60.4 ns     11671176
BM_hb_ot_tags_from_script_and_language/LATIN en_US        61.3 ns         61.1 ns     11442645
BM_hb_ot_tags_from_script_and_language/COMMON none        4.75 ns         4.74 ns    146997235
BM_hb_ot_tags_from_script_and_language/LATIN none         4.65 ns         4.64 ns    150938747

After:

----------------------------------------------------------------------------------------------
Benchmark                                                    Time             CPU   Iterations
----------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN       89.5 ns         89.2 ns      7747649
BM_hb_ot_tags_from_script_and_language/COMMON en_US       38.5 ns         38.4 ns     18199432
BM_hb_ot_tags_from_script_and_language/LATIN en_US        39.0 ns         38.9 ns     18049238
BM_hb_ot_tags_from_script_and_language/COMMON none        4.53 ns         4.52 ns    154895110
BM_hb_ot_tags_from_script_and_language/LATIN none         4.54 ns         4.52 ns    154762105
2022-05-17 14:28:28 -06:00
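The suggestion quoted in dd3c858f84, splitting the language table by tag length, might look like the sketch below. Whether the commit implements it this way is an assumption; the two small tables and `is_known_language` are hypothetical and exist only to show the dispatch on length.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>

// Hypothetical tables, each sorted with strcmp order.
static const char *const sketch_two_letter[] = {"en", "fr", "zh"};
static const char *const sketch_three_letter[] = {"abq", "haw", "yue"};

static bool str_less (const char *a, const char *b)
{
  return std::strcmp (a, b) < 0;
}

// Pick the table by length first, so the common 2-letter case
// searches a much smaller array.
static bool is_known_language (const char *lang)
{
  std::size_t len = std::strlen (lang);
  if (len == 2)
    return std::binary_search (std::begin (sketch_two_letter),
                               std::end (sketch_two_letter), lang, str_less);
  if (len == 3)
    return std::binary_search (std::begin (sketch_three_letter),
                               std::end (sketch_three_letter), lang, str_less);
  return false;
}
```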
Behdad Esfahbod 9baccb9860 [ot-tags] Speed up hb_ot_tags_from_complex_language()
Part of https://github.com/harfbuzz/harfbuzz/issues/3591

2. All the subtag_matches outside the switch match long strings (>= 6 or so).
   As such, check the tag for such length before going into any of them.

benchmark-ot, before:

----------------------------------------------------------------------------------------------
Benchmark                                                    Time             CPU   Iterations
----------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN        172 ns          171 ns      4083155
BM_hb_ot_tags_from_script_and_language/COMMON en_US        120 ns          119 ns      5849947
BM_hb_ot_tags_from_script_and_language/LATIN en_US         113 ns          112 ns      5840326
BM_hb_ot_tags_from_script_and_language/COMMON none        4.66 ns         4.64 ns    151396224
BM_hb_ot_tags_from_script_and_language/LATIN none         4.66 ns         4.64 ns    149019593

After:

----------------------------------------------------------------------------------------------
Benchmark                                                    Time             CPU   Iterations
----------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN        112 ns          112 ns      6357763
BM_hb_ot_tags_from_script_and_language/COMMON en_US       60.5 ns         60.3 ns     11475091
BM_hb_ot_tags_from_script_and_language/LATIN en_US        54.9 ns         54.8 ns     12575690
BM_hb_ot_tags_from_script_and_language/COMMON none        4.61 ns         4.59 ns    152388450
BM_hb_ot_tags_from_script_and_language/LATIN none         4.66 ns         4.64 ns    151497600
2022-05-17 13:34:34 -06:00
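The length gate described in 9baccb9860 can be illustrated as follows: if every pattern tested is at least some minimum length, one length check up front skips all of the substring searches for short tags such as en-US. The two patterns and the name `has_long_subtag` are used for illustration only and are not claimed to be the library's actual list.

```cpp
#include <cassert>
#include <cstring>

// Returns true if `lang` contains one of the (illustrative) long subtags.
static bool has_long_subtag (const char *lang)
{
  // Both patterns below are at least 7 characters long, so a shorter
  // tag cannot contain either of them.
  if (std::strlen (lang) < 7)
    return false;
  return std::strstr (lang, "-fonipa") != nullptr
      || std::strstr (lang, "-fonnapa") != nullptr;
}
```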
Behdad Esfahbod 26d906b88b [perf] Add benchmark-ot 2022-05-17 13:12:17 -06:00
Behdad Esfahbod 629fa8ee87 [perf/benchmark-font] Test Roboto as variable even though it's not 2022-05-16 17:49:36 -06:00
Behdad Esfahbod 71a0cda869 [perf/benchmark-font] Only certain fonts are variable
Don't test every font as variable.
2022-05-16 17:49:36 -06:00
Behdad Esfahbod fb413f5202 [subset/cff] Don't use bitfields for hot bools
The struct has room because of alignment, and these bools are hot.
2022-05-16 17:38:18 -06:00
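The trade-off behind fb413f5202 can be sketched with two hypothetical structs. Writing a one-bit bitfield generally requires the compiler to read the containing byte, mask it, and write it back, whereas a plain bool is a simple byte store. When alignment padding already leaves spare bytes, the plain layout often costs no extra space. The struct and member names are invented for this example.

```cpp
#include <cassert>

// Flags packed into bitfields: compact, but each write is a read-modify-write.
struct PackedFlags
{
  void *data;
  unsigned count;
  bool seen : 1;
  bool dirty : 1;
};

// Flags as plain bools: each write is an independent byte store.
struct PlainFlags
{
  void *data;
  unsigned count;
  bool seen;
  bool dirty;
};

// Works on either layout; only the generated store differs.
template <typename Flags>
static void mark_dirty (Flags &f)
{
  f.dirty = true;
}
```

On common 64-bit ABIs both structs are expected to come out at the same size because of trailing padding, though that is ABI-dependent.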
Behdad Esfahbod a4d98b63ea [subset/cff1] Collect glyph-to-sid map to avoid an O(n^2) algorithm
Saves 13% for the largest benchmark:

BM_subset/subset_glyphs/SourceHanSans-Regular_subset.otf/10000                    -0.1313         -0.1308            75            65            75            65

BM_subset/subset_codepoints/SourceHanSans-Regular_subset.otf/4096                 -0.1009         -0.1004            54            48            54            48
BM_subset/subset_codepoints/SourceHanSans-Regular_subset.otf/10000                -0.1067         -0.1066            70            62            69            62
2022-05-16 17:38:18 -06:00
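The O(n^2) behaviour mentioned in a4d98b63ea typically arises when each glyph's SID is found by walking the charset ranges from the start. A sketch of the alternative is below: walk the ranges once and fill a glyph-to-SID array, so later lookups are O(1). It assumes a range-based charset where each range stores a first SID and a count of additional glyphs, and where glyph 0 is implicit. `Range` and `build_glyph_to_sid` are hypothetical names, not the library's types.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One charset range: covers n_left + 1 consecutive glyphs.
struct Range { uint32_t first_sid; uint32_t n_left; };

// Build the whole glyph -> SID map in one pass over the ranges.
static std::vector<uint32_t>
build_glyph_to_sid (const std::vector<Range> &ranges, unsigned num_glyphs)
{
  std::vector<uint32_t> map (num_glyphs, 0); // glyph 0 keeps SID 0
  unsigned gid = 1;                          // ranges start at glyph 1
  for (const Range &r : ranges)
    for (uint32_t i = 0; i <= r.n_left && gid < num_glyphs; i++)
      map[gid++] = r.first_sid + i;
  return map;
}
```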
Behdad Esfahbod b87f48e948 [cff1] get_sid() move bounds check into each implementation 2022-05-16 17:38:18 -06:00
Behdad Esfahbod e1e359b4da [cff1] Tighten up range_list_t a bit 2022-05-16 16:36:28 -06:00
Behdad Esfahbod 3fbac0942d [cff1] Lazy-load & sort glyph names
Improves subset benchmarks by up to 70% for small CFF1 subset of
non-CID fonts!

BM_subset/subset_glyphs/SourceSansPro-Regular.otf/10                              -0.7067         -0.7071             1             0             1             0
BM_subset/subset_glyphs/SourceSansPro-Regular.otf/64                              -0.4817         -0.4824             1             0             1             0
BM_subset/subset_glyphs/SourceSansPro-Regular.otf/512                             -0.1948         -0.1956             2             2             2             2
BM_subset/subset_glyphs/SourceSansPro-Regular.otf/2000                            -0.0767         -0.0761             6             6             6             6
2022-05-16 16:36:28 -06:00
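The lazy-load-and-sort pattern named in 3fbac0942d can be sketched as below: the sorted index is built on the first lookup rather than at load time, so callers that never look up a name pay nothing. `GlyphNameIndex` is a hypothetical class written for this example, and it is not thread-safe as written.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class GlyphNameIndex
{
 public:
  explicit GlyphNameIndex (std::vector<std::string> names)
    : names_ (std::move (names)) {}

  // Returns the glyph id for `name`, or -1 if it is not present.
  int find (const std::string &name)
  {
    if (!ready_) build (); // sort only when first needed
    auto it = std::lower_bound (sorted_.begin (), sorted_.end (), name,
      [] (const std::pair<std::string, int> &e, const std::string &n)
      { return e.first < n; });
    return (it != sorted_.end () && it->first == name) ? it->second : -1;
  }

  bool built () const { return ready_; }

 private:
  // Pair each name with its glyph id, then sort by name.
  void build ()
  {
    for (std::size_t i = 0; i < names_.size (); i++)
      sorted_.emplace_back (names_[i], (int) i);
    std::sort (sorted_.begin (), sorted_.end ());
    ready_ = true;
  }

  std::vector<std::string> names_;
  std::vector<std::pair<std::string, int>> sorted_;
  bool ready_ = false;
};
```

This is consistent with the benchmark rows above, where the gain is largest for the smallest subsets: the less work a run does overall, the more a fixed up-front sort dominates.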
Behdad Esfahbod b58bfd9818 [font] Minor move of code to silence gcc-12 warning
See mailing list discussion.
2022-05-16 11:21:45 -06:00
Behdad Esfahbod 602e0ca79d [cff] Minor restructure of struct
Surprisingly this shows tiny benchmark improvement consistently.
2022-05-16 10:14:34 -06:00
Behdad Esfahbod acdab17ed3 [cff] Cosmetic in parsed_values_t 2022-05-13 14:14:36 -06:00
Behdad Esfahbod b46c7faa9c [cff] Check buf_len, not buf
Ouch!
2022-05-13 14:02:54 -06:00
Garret Rieger 19a8db8545 [subset] fix potential integer overflow in gname_t::cmp. 2022-05-13 13:55:39 -06:00
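A common source of integer overflow in comparison functions is returning `a - b`, which is undefined behaviour for signed operands that are far apart. Whether that was the exact issue in gname_t::cmp is an assumption; the sketch below only shows an overflow-free three-way compare, with a hypothetical name.

```cpp
#include <cassert>
#include <climits>

// Three-way compare without subtraction: each comparison yields 0 or 1,
// so the result is always -1, 0, or 1 and cannot overflow.
static int cmp_int (int a, int b)
{
  return (a > b) - (a < b);
}
```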
Behdad Esfahbod 2d2f66e1a3 [cff-common] In INDEX, return empty bytes if length is zero
Before it was possible to return non-null arrayZ.
2022-05-13 13:53:17 -06:00
Behdad Esfahbod a2f132f1fc [cff] Check glyph-name's length, not arrayZ
As the latter can be non-null while still zero-length.
2022-05-13 13:49:39 -06:00
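The point of a2f132f1fc and 2d2f66e1a3 is that a pointer-plus-length view can be empty while its pointer is non-null, so emptiness has to be tested on the length. A minimal sketch with a hypothetical `ByteView` type:

```cpp
#include <cassert>
#include <cstddef>

// A non-owning view: `data` may be non-null even when `length` is zero.
struct ByteView
{
  const char *data;
  std::size_t length;
};

// Correct emptiness test: look at the length, never at the pointer.
static bool is_empty (const ByteView &v)
{
  return v.length == 0;
}
```

A zero-length view into the middle of a buffer has a valid, non-null `data`, and a check such as `if (v.data)` would wrongly treat it as non-empty.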
jeremiazhao dc09053f19 fix build requirements for fedora/centos in building document 2022-05-13 13:10:11 -06:00
Thomas Devoogdt c657c4e1f8 [meta] fix type traits on gcc 4.9 #3526
Signed-off-by: Thomas Devoogdt <thomas.devoogdt@barco.com>
2022-05-13 11:26:12 -06:00
Garret Rieger e4e053c8b3 [perf] fix typo in perf Makefile. 2022-05-13 11:25:09 -06:00
Behdad Esfahbod e61234c5f7 [vector] Add tests for move constructor/assignment 2022-05-12 13:20:10 -06:00
Behdad Esfahbod 7fa580bc4f [map] Fix map copy/move constructors to actually work
Ouch!
2022-05-12 13:05:32 -06:00
Behdad Esfahbod a09dd87ca3 [set] Fix set copy/move constructors to actually work
Ouch!
2022-05-12 12:58:07 -06:00
Behdad Esfahbod 76fc27713f [vector] Remove explicit std::move
Was confusing compilers. Let them figure it out themselves.

Makes the NotoNastaliqUrdu subsetting/1000 benchmark more than twice as fast:

Benchmark                                                                       Time             CPU      Time Old      Time New       CPU Old       CPU New
------------------------------------------------------------------------------------------------------------------------------------------------------------
BM_subset/subset_glyphs/NotoNastaliqUrdu-Regular.ttf/1000                    -0.5064         -0.5065           111            55           110            55
BM_subset/subset_codepoints/NotoNastaliqUrdu-Regular.ttf/1000                -0.5494         -0.5493           132            59           131            59
2022-05-12 12:14:07 -06:00
Behdad Esfahbod c81198b5bc [set] Tweak move operators a bit
Should be equivalent.
2022-05-12 12:14:02 -06:00
Behdad Esfahbod 8dc072d20d Merge pull request #3579 from harfbuzz/subset-retain-buffer
Subset retain buffer
2022-05-11 16:45:40 -06:00
Behdad Esfahbod 175319cd89 [gsubgpos] Clean up OT::ClassDefFormat2::intersected_class_glyphs 0 case 2022-05-11 13:47:17 -06:00
Behdad Esfahbod 137af3612b [gsubgpos] Simplify OT::ClassDefFormat2::intersected_class_glyphs() 2022-05-11 13:39:30 -06:00
Behdad Esfahbod 3261e05bdb [subset] Optimize ClassDef1::intersected_class_glyphs() for class0 2022-05-11 13:16:31 -06:00
Behdad Esfahbod c78d8ba60b [subset] Allocate same size as source table for GSUB/GPOS/name 2022-05-11 13:05:41 -06:00
Behdad Esfahbod 2e7f1ae48f [subset] Use vector.allocated size instead of tracking buf_size 2022-05-11 12:52:27 -06:00
Behdad Esfahbod f08537963b [cff-subset] Pre-alloc vector for operator decoding 2022-05-11 12:14:49 -06:00
Behdad Esfahbod 7edd54f3dd [perf/benchmark-subset] Minor cleanup 2022-05-11 12:14:49 -06:00