Behdad Esfahbod
e24797aeac
[ot-tags] Follow-up to previous commit
...
Part of https://github.com/harfbuzz/harfbuzz/issues/3591
2022-05-18 11:10:10 -06:00
Behdad Esfahbod
f5d619be79
[ot-tags] Further gate the slow complex case, and add more tests
...
Part of https://github.com/harfbuzz/harfbuzz/issues/3591
Still 'zh-trad' is the slowest case.
--------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON zh_trad 136 ns 136 ns 5107838
BM_hb_ot_tags_from_script_and_language/COMMON ab_abcd 115 ns 115 ns 6103104
BM_hb_ot_tags_from_script_and_language/COMMON ab_abc 25.4 ns 25.3 ns 27674482
BM_hb_ot_tags_from_script_and_language/COMMON abcdef_XY 20.2 ns 20.1 ns 34795719
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY 19.4 ns 19.3 ns 36390401
BM_hb_ot_tags_from_script_and_language/COMMON cxy_CN 33.5 ns 33.4 ns 20998939
BM_hb_ot_tags_from_script_and_language/COMMON exy_CN 25.1 ns 25.0 ns 27705832
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN 34.2 ns 34.1 ns 20564356
BM_hb_ot_tags_from_script_and_language/COMMON en_US 15.5 ns 15.5 ns 45032204
BM_hb_ot_tags_from_script_and_language/LATIN en_US 15.9 ns 15.8 ns 44412379
BM_hb_ot_tags_from_script_and_language/COMMON none 4.72 ns 4.71 ns 149101665
BM_hb_ot_tags_from_script_and_language/LATIN none 4.72 ns 4.70 ns 149254498
2022-05-18 11:04:52 -06:00
Behdad Esfahbod
9c64bda21d
[ot-tag] Whitespace
2022-05-17 17:31:18 -06:00
Behdad Esfahbod
3df8017e9b
[ot-tag] Optimize subtag_matches() more
2022-05-17 17:29:39 -06:00
Behdad Esfahbod
b231fc2dbc
[perf/benchmark-ot] Add a couple more test cases
2022-05-17 17:03:37 -06:00
Behdad Esfahbod
3524b14fa0
[perf/benchmark-ot] Add a couple more test cases
2022-05-17 17:02:48 -06:00
Behdad Esfahbod
7f6e8c5536
[ot-tags] Optimize subtag_matches() further
...
Part of https://github.com/harfbuzz/harfbuzz/issues/3591
Comparing before to after
Benchmark Time CPU Time Old Time New CPU Old CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY -0.3371 -0.3371 71 47 71 47
2022-05-17 16:58:35 -06:00
Behdad Esfahbod
27c11405a2
[ot-tag] Optimize subtag_matches
...
Part of https://github.com/harfbuzz/harfbuzz/issues/3591
2022-05-17 16:51:51 -06:00
Behdad Esfahbod
a07d818597
[ot-tag] Add a likely() to the cache hit case
2022-05-17 16:46:10 -06:00
Behdad Esfahbod
0ff5d36cd4
[perf/benchmark-ot] Fix benchmark
...
Part of https://github.com/harfbuzz/harfbuzz/issues/3591
Ouch!
These are the current numbers:
------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY 78.0 ns 77.7 ns 8917912
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN 44.9 ns 44.8 ns 15475318
BM_hb_ot_tags_from_script_and_language/COMMON en_US 17.6 ns 17.5 ns 39812340
BM_hb_ot_tags_from_script_and_language/LATIN en_US 18.2 ns 18.1 ns 38356204
BM_hb_ot_tags_from_script_and_language/COMMON none 4.76 ns 4.74 ns 148746131
BM_hb_ot_tags_from_script_and_language/LATIN none 4.73 ns 4.71 ns 148421349
2022-05-17 16:38:19 -06:00
Behdad Esfahbod
dfca47f419
[ot-tag] Cache last bsearch result
...
Part of https://github.com/harfbuzz/harfbuzz/issues/3591
Humm. Looks like not all of the fat is bsearch overhead now. I cached
the last bsearch result, but most of the time is still there. I'm
baffled.
Before:
------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY 8.08 ns 8.05 ns 84500482
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN 42.2 ns 42.1 ns 16722006
BM_hb_ot_tags_from_script_and_language/COMMON en_US 16.1 ns 16.0 ns 43461527
BM_hb_ot_tags_from_script_and_language/LATIN en_US 16.5 ns 16.5 ns 42448505
BM_hb_ot_tags_from_script_and_language/COMMON none 4.34 ns 4.33 ns 161290530
BM_hb_ot_tags_from_script_and_language/LATIN none 4.34 ns 4.33 ns 162339799
After:
------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY 8.13 ns 8.11 ns 80438134
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN 40.0 ns 39.9 ns 17487939
BM_hb_ot_tags_from_script_and_language/COMMON en_US 12.7 ns 12.7 ns 55124394
BM_hb_ot_tags_from_script_and_language/LATIN en_US 13.1 ns 13.0 ns 53660125
BM_hb_ot_tags_from_script_and_language/COMMON none 4.61 ns 4.60 ns 151394104
BM_hb_ot_tags_from_script_and_language/LATIN none 4.70 ns 4.68 ns 150402847
2022-05-17 16:21:02 -06:00
Behdad Esfahbod
909f00ac6e
[ot-tags] Further speed up language bsearch()
...
Using an integer tag to bsearch, instead of string.
Part of: https://github.com/harfbuzz/harfbuzz/issues/3591
Before:
------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY 8.11 ns 8.08 ns 87067795
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN 53.6 ns 53.5 ns 13042418
BM_hb_ot_tags_from_script_and_language/COMMON en_US 24.2 ns 24.1 ns 29052731
BM_hb_ot_tags_from_script_and_language/LATIN en_US 24.4 ns 24.3 ns 28736769
BM_hb_ot_tags_from_script_and_language/COMMON none 4.43 ns 4.41 ns 160370413
BM_hb_ot_tags_from_script_and_language/LATIN none 4.35 ns 4.34 ns 160578191
After:
------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY 7.97 ns 7.95 ns 85208363
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN 41.7 ns 41.6 ns 16945817
BM_hb_ot_tags_from_script_and_language/COMMON en_US 16.1 ns 16.0 ns 43613523
BM_hb_ot_tags_from_script_and_language/LATIN en_US 16.5 ns 16.4 ns 42568107
BM_hb_ot_tags_from_script_and_language/COMMON none 4.30 ns 4.29 ns 164055469
BM_hb_ot_tags_from_script_and_language/LATIN none 4.29 ns 4.27 ns 163793591
2022-05-17 15:51:41 -06:00
Behdad Esfahbod
c460cf74ce
[ot-tags] Cosmetic
2022-05-17 15:30:11 -06:00
Behdad Esfahbod
1c8226ed14
Fix compiler warning
...
On Mac compiler:
FAILED: src/libharfbuzz.0.dylib.p/hb-ot-tag.cc.o
c++ -Isrc/libharfbuzz.0.dylib.p -Isrc -I../src -I. -I.. -I/usr/local/opt/freetype/include/freetype2 -I/usr/local/Cellar/graphite2/1.3.14/include -I/usr/local/Cellar/glib/2.72.1/include/glib-2.0 -I/usr/local/Cellar/glib/2.72.1/lib/glib-2.0/include -I/usr/local/opt/gettext/include -I/usr/local/Cellar/pcre/8.45/include -Xclang -fcolor-diagnostics --coverage -pipe -Wall -Winvalid-pch -Wnon-virtual-dtor -std=c++11 -fno-rtti -O2 -g -fno-exceptions -fno-rtti -fno-threadsafe-statics -fvisibility-inlines-hidden -DHAVE_CONFIG_H -Wno-non-virtual-dtor -MD -MQ src/libharfbuzz.0.dylib.p/hb-ot-tag.cc.o -MF src/libharfbuzz.0.dylib.p/hb-ot-tag.cc.o.d -o src/libharfbuzz.0.dylib.p/hb-ot-tag.cc.o -c ../src/hb-ot-tag.cc
In file included from ../src/hb-ot-tag.cc:29:
In file included from ../src/hb.hh:481:
../src/hb-array.hh:359:14: error: missing default argument on parameter 'ds'
Ts... ds) const
^
../src/hb-ot-tag.cc:292:58: note: in instantiation of function template specialization 'hb_sorted_array_t<const LangTag>::bfind<const char *, unsigned int>' requested here
if (hb_sorted_array (ot_languages, ot_languages_len).bfind (lang_str, &tag_idx,
^
1 error generated.
2022-05-17 15:28:50 -06:00
Behdad Esfahbod
c1f4b57c06
[ot-tags] Optimize language comparison
...
Now that we know both strings are of equal len of 2 or 3, optimize.
Part of https://github.com/harfbuzz/harfbuzz/issues/3591
Before:
------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY 8.50 ns 8.47 ns 81221549
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN 79.6 ns 79.3 ns 8785804
BM_hb_ot_tags_from_script_and_language/COMMON en_US 40.0 ns 39.9 ns 17462768
BM_hb_ot_tags_from_script_and_language/LATIN en_US 39.2 ns 39.1 ns 17886793
BM_hb_ot_tags_from_script_and_language/COMMON none 4.31 ns 4.30 ns 162805417
BM_hb_ot_tags_from_script_and_language/LATIN none 4.32 ns 4.31 ns 162656688
After:
------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY 8.27 ns 8.24 ns 81868701
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN 56.1 ns 56.0 ns 12353284
BM_hb_ot_tags_from_script_and_language/COMMON en_US 24.3 ns 24.2 ns 28955030
BM_hb_ot_tags_from_script_and_language/LATIN en_US 24.5 ns 24.4 ns 28664868
BM_hb_ot_tags_from_script_and_language/COMMON none 4.35 ns 4.34 ns 161190014
BM_hb_ot_tags_from_script_and_language/LATIN none 4.36 ns 4.34 ns 161319000
2022-05-17 15:19:40 -06:00
Behdad Esfahbod
dde48d78c1
Fix compiler warning
2022-05-17 15:07:49 -06:00
Behdad Esfahbod
15be0deda0
[ot-tags] Optimize lang_matches()
...
Part of https://github.com/harfbuzz/harfbuzz/issues/3591
Before:
------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY 8.67 ns 8.64 ns 80324382
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN 91.2 ns 90.9 ns 7674131
BM_hb_ot_tags_from_script_and_language/COMMON en_US 41.1 ns 41.0 ns 17174093
BM_hb_ot_tags_from_script_and_language/LATIN en_US 41.3 ns 41.2 ns 17000876
BM_hb_ot_tags_from_script_and_language/COMMON none 4.56 ns 4.55 ns 153914130
BM_hb_ot_tags_from_script_and_language/LATIN none 4.53 ns 4.52 ns 153830303
After:
------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
------------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON abcd_XY 8.24 ns 8.21 ns 84078465
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN 77.5 ns 77.2 ns 9059230
BM_hb_ot_tags_from_script_and_language/COMMON en_US 38.8 ns 38.7 ns 17790692
BM_hb_ot_tags_from_script_and_language/LATIN en_US 37.6 ns 37.5 ns 18648293
BM_hb_ot_tags_from_script_and_language/COMMON none 4.50 ns 4.49 ns 155573267
BM_hb_ot_tags_from_script_and_language/LATIN none 4.49 ns 4.47 ns 156456653
2022-05-17 14:57:08 -06:00
Behdad Esfahbod
407a135baf
[perf/benchmark-ot] Add one more test
2022-05-17 14:45:45 -06:00
Behdad Esfahbod
dd3c858f84
[ot-tags] Speed up hb_ot_tags_from_language()
...
Part of https://github.com/harfbuzz/harfbuzz/issues/3591
"After that, bulk of the time I suppose is spent in binary-searching the
language table. I suggest we split the language table in 2-letter and
3-letter tags, to speed-up the vast majority of cases that are
2-letter."
benchmark-ot, before:
----------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
----------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN 112 ns 111 ns 6286271
BM_hb_ot_tags_from_script_and_language/COMMON en_US 60.6 ns 60.4 ns 11671176
BM_hb_ot_tags_from_script_and_language/LATIN en_US 61.3 ns 61.1 ns 11442645
BM_hb_ot_tags_from_script_and_language/COMMON none 4.75 ns 4.74 ns 146997235
BM_hb_ot_tags_from_script_and_language/LATIN none 4.65 ns 4.64 ns 150938747
After:
----------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
----------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN 89.5 ns 89.2 ns 7747649
BM_hb_ot_tags_from_script_and_language/COMMON en_US 38.5 ns 38.4 ns 18199432
BM_hb_ot_tags_from_script_and_language/LATIN en_US 39.0 ns 38.9 ns 18049238
BM_hb_ot_tags_from_script_and_language/COMMON none 4.53 ns 4.52 ns 154895110
BM_hb_ot_tags_from_script_and_language/LATIN none 4.54 ns 4.52 ns 154762105
2022-05-17 14:28:28 -06:00
Behdad Esfahbod
9baccb9860
[ot-tags] Speed up hb_ot_tags_from_complex_language()
...
Part of https://github.com/harfbuzz/harfbuzz/issues/3591
2. All the subtag_matches outside the switch match long strings (>= 6 or so).
As such, check the tag for such length before going into any of them.
benchmark-ot, before:
----------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
----------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN 172 ns 171 ns 4083155
BM_hb_ot_tags_from_script_and_language/COMMON en_US 120 ns 119 ns 5849947
BM_hb_ot_tags_from_script_and_language/LATIN en_US 113 ns 112 ns 5840326
BM_hb_ot_tags_from_script_and_language/COMMON none 4.66 ns 4.64 ns 151396224
BM_hb_ot_tags_from_script_and_language/LATIN none 4.66 ns 4.64 ns 149019593
After:
----------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
----------------------------------------------------------------------------------------------
BM_hb_ot_tags_from_script_and_language/COMMON zh_CN 112 ns 112 ns 6357763
BM_hb_ot_tags_from_script_and_language/COMMON en_US 60.5 ns 60.3 ns 11475091
BM_hb_ot_tags_from_script_and_language/LATIN en_US 54.9 ns 54.8 ns 12575690
BM_hb_ot_tags_from_script_and_language/COMMON none 4.61 ns 4.59 ns 152388450
BM_hb_ot_tags_from_script_and_language/LATIN none 4.66 ns 4.64 ns 151497600
2022-05-17 13:34:34 -06:00
Behdad Esfahbod
26d906b88b
[perf] Add benchmark-ot
2022-05-17 13:12:17 -06:00
Behdad Esfahbod
629fa8ee87
[perf/benchmark-font] Test Roboto as variable even though it's not
2022-05-16 17:49:36 -06:00
Behdad Esfahbod
71a0cda869
[perf/benchmark-font] Only certain fonts are variable
...
Don't test every font as variable.
2022-05-16 17:49:36 -06:00
Behdad Esfahbod
fb413f5202
[subset/cff] Don't use bitfields for hot bools
...
The struct has room because of alignment, and these bools are hot.
2022-05-16 17:38:18 -06:00
Behdad Esfahbod
a4d98b63ea
[subset/cff1] Collect glyph-to-sid map to avoid an O(n^2) algorithm
...
Saves 13 for largest benchmark:
BM_subset/subset_glyphs/SourceHanSans-Regular_subset.otf/10000 -0.1313 -0.1308 75 65 75 65
BM_subset/subset_codepoints/SourceHanSans-Regular_subset.otf/4096 -0.1009 -0.1004 54 48 54 48
BM_subset/subset_codepoints/SourceHanSans-Regular_subset.otf/10000 -0.1067 -0.1066 70 62 69 62
2022-05-16 17:38:18 -06:00
Behdad Esfahbod
b87f48e948
[cff1] get_sid() move bounds check into each implementation
2022-05-16 17:38:18 -06:00
Behdad Esfahbod
e1e359b4da
[cff1] Tighten up range_list_t a bit
2022-05-16 16:36:28 -06:00
Behdad Esfahbod
3fbac0942d
[cff1] Lazy-load & sort glyph names
...
Improves subset benchmarks by up to 70% for small CFF1 subset of
non-CID fonts!
BM_subset/subset_glyphs/SourceSansPro-Regular.otf/10 -0.7067 -0.7071 1 0 1 0
BM_subset/subset_glyphs/SourceSansPro-Regular.otf/64 -0.4817 -0.4824 1 0 1 0
BM_subset/subset_glyphs/SourceSansPro-Regular.otf/512 -0.1948 -0.1956 2 2 2 2
BM_subset/subset_glyphs/SourceSansPro-Regular.otf/2000 -0.0767 -0.0761 6 6 6 6
2022-05-16 16:36:28 -06:00
Behdad Esfahbod
b58bfd9818
[font] Minor move of code to silence gcc-12 warning
...
See mailing list discussion.
2022-05-16 11:21:45 -06:00
Behdad Esfahbod
602e0ca79d
[cff] Minor restructure of struct
...
Surprisingly this shows tiny benchmark improvement consistently.
2022-05-16 10:14:34 -06:00
Behdad Esfahbod
acdab17ed3
[cff] Cosmetic in parsed_values_t
2022-05-13 14:14:36 -06:00
Behdad Esfahbod
b46c7faa9c
[cff] Check buf_len, not buf
...
Ouch!
2022-05-13 14:02:54 -06:00
Garret Rieger
19a8db8545
[subset] fix potential integer overflow in gname_t::cmp.
2022-05-13 13:55:39 -06:00
Behdad Esfahbod
2d2f66e1a3
[cff-common] In INDEX, return empty bytes if length is zero
...
Before it was possible to return non-null arrayZ.
2022-05-13 13:53:17 -06:00
Behdad Esfahbod
a2f132f1fc
[cff] Check glyph-name's length, not arrayZ
...
As the latter can be non-null while still zero-length.
2022-05-13 13:49:39 -06:00
jeremiazhao
dc09053f19
fix build requirements for fedora/centos in buiding document
2022-05-13 13:10:11 -06:00
Thomas Devoogdt
c657c4e1f8
[meta] fix type traits on gcc 4.9 #3526
...
Signed-off-by: Thomas Devoogdt <thomas.devoogdt@barco.com>
2022-05-13 11:26:12 -06:00
Garret Rieger
e4e053c8b3
[perf] fix typo in perf Makefile.
2022-05-13 11:25:09 -06:00
Behdad Esfahbod
e61234c5f7
[vector] Add tests for move constructor/assignment
2022-05-12 13:20:10 -06:00
Behdad Esfahbod
7fa580bc4f
[map] Fix map copy/move constructors to actually work
...
Ouch!
2022-05-12 13:05:32 -06:00
Behdad Esfahbod
a09dd87ca3
[set] Fix set copy/move constructors to actually work
...
Ouch!
2022-05-12 12:58:07 -06:00
Behdad Esfahbod
76fc27713f
[vector] Remove explicit std::move
...
Was confusing compilers. Let them figure it out themselves.
Makes NotoNastaliqu subsetting/1000 benchmark more than twice faster:
Benchmark Time CPU Time Old Time New CPU Old CPU New
------------------------------------------------------------------------------------------------------------------------------------------------------------
BM_subset/subset_glyphs/NotoNastaliqUrdu-Regular.ttf/1000 -0.5064 -0.5065 111 55 110 55
BM_subset/subset_codepoints/NotoNastaliqUrdu-Regular.ttf/1000 -0.5494 -0.5493 132 59 131 59
2022-05-12 12:14:07 -06:00
Behdad Esfahbod
c81198b5bc
[set] Tweak move operators a bit
...
Should be equivalent.
2022-05-12 12:14:02 -06:00
Behdad Esfahbod
8dc072d20d
Merge pull request #3579 from harfbuzz/subset-retain-buffer
...
Subset retain buffer
2022-05-11 16:45:40 -06:00
Behdad Esfahbod
175319cd89
[gsubgpos] Clean up OT::ClassDefFormat2::intersected_class_glyphs 0 case
2022-05-11 13:47:17 -06:00
Behdad Esfahbod
137af3612b
[gsubgpos] Simplify OT::ClassDefFormat2::intersected_class_glyphs()
2022-05-11 13:39:30 -06:00
Behdad Esfahbod
3261e05bdb
[subset] Optimize ClassDef1::intersected_class_glyphs() for class0
2022-05-11 13:16:31 -06:00
Behdad Esfahbod
c78d8ba60b
[subset] Allocate same size as source table for GSUB/GPOS/name
2022-05-11 13:05:41 -06:00
Behdad Esfahbod
2e7f1ae48f
[subset] Use vector.allocated size instead of tracking buf_size
2022-05-11 12:52:27 -06:00
Behdad Esfahbod
f08537963b
[cff-subset] Pre-alloc vector for operator decoding
2022-05-11 12:14:49 -06:00