Garret Rieger
8f9f0c494b
[subset] Enforce cmap12 group ordering constraints in collect_mapping.
...
Fixes fuzzer issue: https://oss-fuzz.com/testcase-detail/6365271012540416
2022-05-10 12:15:09 -06:00
Behdad Esfahbod
f10ddb8dd8
[cmap] Use -1 as Unicode sentinel, not U+FFFF in Format12 serialize
2022-05-05 11:21:24 -06:00
Behdad Esfahbod
8a19968c8b
[cmap] Use iterator bool operator
2022-05-05 11:17:23 -06:00
Behdad Esfahbod
052812b6ba
Merge pull request #3561 from googlefonts/cmap_opt
...
[subset] Further cmap subsetting speed optimizations
2022-05-04 15:38:30 -06:00
Garret Rieger
f0c04114bc
[subset] Embed unicode to gid list vector in subset plan.
2022-05-03 22:02:59 +00:00
Behdad Esfahbod
3fff2e9182
[perf/benchmark-font] Cosmetic
2022-05-02 16:42:10 -06:00
Behdad Esfahbod
307d2d8bb6
[cmap] Sprinkle some 'unlikely's
2022-05-02 16:30:22 -06:00
Garret Rieger
088133d939
[subset] cache cp to new gid list in subset plan.
...
This avoids having to recompute the ordered list multiple times during cmap generation.
2022-05-02 21:29:16 +00:00
Garret Rieger
6922a2561f
[subset] Change serialize_rangeoffset_glyid back to using iterator.
2022-04-29 23:30:32 +00:00
Garret Rieger
c66fd50c26
[subset] in cmap4 serialization save cp to gid iter to memory.
...
Iterator accesses are slow and it's iterated multiple times.
2022-04-29 23:18:53 +00:00
Garret Rieger
17b98563dc
[subset] In cmap4 serialization reduce unnecessary calls into the iterator.
...
Gives ~20% speedup for large subsets.
2022-04-29 22:49:02 +00:00
Garret Rieger
5e241094bf
[subset] In unicodes cache cleanup if set insert fails.
2022-04-29 22:45:16 +00:00
Garret Rieger
a424a92ce5
[subset] s/void */intptr_t.
2022-04-29 22:14:03 +00:00
Garret Rieger
aad67f5629
[subset] cache results of collect_unicodes.
2022-04-29 22:05:34 +00:00
Garret Rieger
b4236b7de6
[subset] Optimize Cmap4 collect_unicodes.
...
Use set add_range() instead of individual add() calls.
2022-04-29 19:22:00 +00:00
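The add_range() idea in the commit above can be sketched as follows. This is only an illustration under stated assumptions: `RangeSet` is a hypothetical stand-in backed by std::map, not HarfBuzz's hb_set_t, and every name in it is invented for the example. It shows why adding a whole cmap segment as one range can be cheaper than one add() per code point: the work is proportional to the number of stored ranges touched, not to the number of code points covered.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical code point set (NOT hb_set_t): disjoint inclusive ranges,
// keyed by range start, with the range end as the mapped value.
struct RangeSet {
  std::map<uint32_t, uint32_t> ranges;

  // Adds [first, last] in one step, merging overlapping or adjacent ranges.
  void add_range(uint32_t first, uint32_t last) {
    if (first > last) return;
    uint64_t lo = first, hi = last;  // 64-bit so that hi + 1 cannot wrap
    auto it = ranges.lower_bound(first);
    if (it != ranges.begin()) {
      auto prev = std::prev(it);
      // Start merging from the previous range if it overlaps or touches ours.
      if (static_cast<uint64_t>(prev->second) + 1 >= lo) it = prev;
    }
    while (it != ranges.end() && it->first <= hi + 1) {
      lo = std::min<uint64_t>(lo, it->first);
      hi = std::max<uint64_t>(hi, it->second);
      it = ranges.erase(it);
    }
    ranges[static_cast<uint32_t>(lo)] = static_cast<uint32_t>(hi);
  }

  bool has(uint32_t cp) const {
    auto it = ranges.upper_bound(cp);
    if (it == ranges.begin()) return false;
    return std::prev(it)->second >= cp;
  }
};
```

With this sketch, a segment covering thousands of code points costs one map update rather than thousands of individual insertions.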
Behdad Esfahbod
f41945e313
[cmap] In collect_unicodes() of format 12/13, limit to max Unicode
...
Fixes fuzzer timeout:
https://oss-fuzz.com/testcase-detail/5062368881672192
2022-03-21 18:24:30 -06:00
Behdad Esfahbod
ac1bb3e39e
[machinery] Move accelerators to constructor/destructor
2022-01-20 12:10:05 -07:00
Behdad Esfahbod
e062376ef1
[machinery] Make accelerator lazy-loader call Xinit/Xfini
...
Instead of init/fini. To isolate those functions. To be turned into
constructor/destructors, ideally one per commit (after some SFINAE
foo.)
2022-01-19 17:09:34 -07:00
Behdad Esfahbod
8a69e00639
[meta] Use std::addressof() instead of hb_addressof()
2022-01-13 16:17:34 -07:00
Garret Rieger
1d9ef3a75a
[subset] Actually fix end_cp uninitialized warning.
2021-12-01 11:53:10 -07:00
Garret Rieger
d8635dfe5a
[subset] Fix warning about uninitialized use of end_cp.
2021-12-01 11:17:46 -07:00
Garret Rieger
95329081c2
[subset] further optimize cmap4 packing.
2021-11-28 07:47:49 -07:00
Garret Rieger
d9660fd58a
[subset] Make cmap4 packing more optimal.
...
The current CMAP4 implementation uses whatever the current codepoint ranges are and then encodes them as individual glyph ids or as a delta if possible. However, it's often possible to save bytes by splitting up existing ranges and encoding parts of them using deltas where the cost of splitting the range is less than that of encoding each glyph individually.
2021-11-26 13:21:50 -07:00
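The trade-off described in the commit above can be sketched as a byte-cost comparison. This is a rough illustration, not the heuristic the commit actually implements: `split_saves_bytes` and the constants are hypothetical names. The byte sizes are taken from the cmap format 4 layout, where each segment stores endCode, startCode, idDelta and idRangeOffset (2 bytes each) and each glyphIdArray entry takes 2 bytes.

```cpp
#include <cstddef>

// Per-segment cost in cmap format 4: endCode + startCode + idDelta +
// idRangeOffset, 2 bytes each.
constexpr std::size_t kSegmentBytes = 8;
// Cost of one glyphIdArray entry.
constexpr std::size_t kGlyphIdBytes = 2;

// Hypothetical helper: returns true when moving a run of `run_length` code
// points (whose glyph ids are consecutive, so a single idDelta covers them)
// out of a glyphIdArray-encoded range saves bytes. `extra_segments` is the
// number of additional segments the split creates: 1 if the run sits at an
// edge of the range, 2 if it sits in the middle.
constexpr bool split_saves_bytes(std::size_t run_length,
                                 std::size_t extra_segments) {
  return run_length * kGlyphIdBytes > extra_segments * kSegmentBytes;
}
```

Under these assumptions a run in the middle of a range has to be at least 9 code points long before splitting pays off, while a run at the edge needs 5.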
Behdad Esfahbod
c852b86841
Rename HBGlyphID to HBGlyphID16
2021-09-19 16:30:12 -04:00
Garret Rieger
2bd911b8b4
[subset] handle cmap4 overflows.
...
If a cmap4 subtable overflows during serialization drop it and the corresponding EncodingRecord. Don't drop the corresponding cmap12 table if it would have otherwise been removed.
2021-09-02 14:43:17 -06:00
Garret Rieger
b9a176e268
[subset] speedup cmap4 subsetting for large codepoint counts. ( #3178 )
...
glyphIdArray generation implementation was O(n^2). Refactored to use a hashmap to reduce complexity. After the change subset time for a 22k codepoint subset went from 7s to 0.7s.
2021-08-29 10:33:12 -06:00
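The complexity change described in the commit above can be illustrated with a generic sketch. It is not the HarfBuzz code: the function names, the `CpGid` alias and the use of std::unordered_map are assumptions made for the example. The first function rescans the whole mapping for every code point (quadratic), the second builds a hash map once and then does constant-time lookups on average.

```cpp
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

using CpGid = std::pair<uint32_t, uint32_t>;  // (code point, glyph id)

// O(n * m): one linear scan of `mapping` per requested code point.
std::vector<uint32_t> glyph_ids_linear(const std::vector<CpGid>& mapping,
                                       const std::vector<uint32_t>& cps) {
  std::vector<uint32_t> out;
  out.reserve(cps.size());
  for (uint32_t cp : cps) {
    uint32_t gid = 0;  // 0 (.notdef) when the code point is unmapped
    for (const CpGid& entry : mapping)
      if (entry.first == cp) { gid = entry.second; break; }
    out.push_back(gid);
  }
  return out;
}

// O(n + m) on average: build the hash map once, then look up each code point.
std::vector<uint32_t> glyph_ids_hashed(const std::vector<CpGid>& mapping,
                                       const std::vector<uint32_t>& cps) {
  std::unordered_map<uint32_t, uint32_t> by_cp(mapping.begin(), mapping.end());
  std::vector<uint32_t> out;
  out.reserve(cps.size());
  for (uint32_t cp : cps) {
    auto it = by_cp.find(cp);
    out.push_back(it != by_cp.end() ? it->second : 0);
  }
  return out;
}
```

Both functions return the same result; only the lookup cost differs, which is what matters at subset sizes like the 22k code points mentioned in the commit.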
Garret Rieger
2c024dc3cb
[subset] prune redundant cmap12 subtables.
...
If the post subset cmap12 table is equivalent to another cmap subtable don't include the 12 table in the final subset. Matches change https://github.com/fonttools/fonttools/pull/2146 from fontTools.
2021-08-04 17:36:24 -06:00
Behdad Esfahbod
f0a1892ff9
[serialize] Remove unnecessary pointer indirection
2021-07-28 17:36:22 -06:00
Garret Rieger
9aa0ecef3f
[subset] de-duplicate the logic that finds unicodes corresponding to requested glyphs.
...
Move the logic into subset planning and then re-use the results in cmap and OS2 subsetting. Removes dependency on cmap from os2.
2021-07-14 17:31:47 -07:00
Behdad Esfahbod
092094f705
Use as_array() and range loops in a few places
2021-04-01 16:02:54 -06:00
Behdad Esfahbod
4dba749d83
Add SortedArray{16,32}Of<>
2021-03-31 16:09:39 -06:00
Behdad Esfahbod
ad28f973f3
Rename offset types to be explicit about their size
...
Add Offset16To<>, Offset24To<>, and Offset32To<> for most use-cases.
2021-03-31 13:00:07 -06:00
Garret Rieger
b14475d2ae
[subset] further changes to serializer error handling.
...
- Rename enum type and enum members.
- in_errors() now returns true for any error having been set. hb-subset now looks for offset overflow only errors to divert to repacker.
- Added INT_OVERFLOW and ARRAY_OVERFLOW enum values.
2021-03-18 10:51:26 -07:00
Garret Rieger
73ed59f7a6
[subset] store errors in the serializer as a flag set.
...
Make check_assign/check_equal specify the type of error to set.
2021-03-17 15:58:34 -07:00
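The two serializer error-handling commits above describe storing errors as a set of flags, so a caller can tell "only an offset overflow happened" apart from "something else also failed". A minimal sketch of that idea follows. The enum, its members, and the `Serializer` struct are hypothetical names chosen for the example and are not claimed to match the HarfBuzz serializer.

```cpp
#include <cstdint>

// Hypothetical error flags; each error kind gets its own bit.
enum SerializeError : uint32_t {
  ERROR_NONE            = 0,
  ERROR_OTHER           = 1u << 0,
  ERROR_OFFSET_OVERFLOW = 1u << 1,
  ERROR_ARRAY_OVERFLOW  = 1u << 2,
  ERROR_INT_OVERFLOW    = 1u << 3,
};

struct Serializer {
  uint32_t errors = ERROR_NONE;

  void err(SerializeError e) { errors |= e; }

  // True if any error at all has been recorded.
  bool in_error() const { return errors != ERROR_NONE; }

  // True only if offset overflow is the sole recorded error, the case a
  // caller could hand off to a repacking step.
  bool only_offset_overflow() const { return errors == ERROR_OFFSET_OVERFLOW; }

  // Assigns `value` to `dst` and records `on_overflow` if it does not
  // round-trip through the destination type.
  template <typename T, typename V>
  bool check_assign(T& dst, V value, SerializeError on_overflow) {
    dst = static_cast<T>(value);
    if (static_cast<V>(dst) != value) { err(on_overflow); return false; }
    return true;
  }
};
```

The caller names the error kind at each check site, which is what lets later code decide whether a failure is recoverable.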
Behdad Esfahbod
6d94194497
Use auto in range-for-loop more
2021-02-19 17:10:06 -07:00
Garret Rieger
18ab8029d5
[ENOMEM] check vector status in cmap subsetting.
2020-08-02 00:30:17 +04:30
Ebrahim Byagowi
5a7cc7fd8b
minor spacing tweak
2020-07-29 08:33:38 +04:30
Ebrahim Byagowi
d0e2addd43
minor
2020-07-18 22:16:02 +04:30
Qunxin Liu
8e5bc535d1
[subset] call collect_mapping only when --gids option is used.
...
collect_mapping is time consuming as it iterates all codepoints in all
cmap subtables, so only trigger it when necessary.
2020-07-16 11:25:53 -07:00
Qunxin Liu
10d6605bbe
[subset] don't use << operator in collect_mapping
2020-05-15 11:04:59 -07:00
Qunxin Liu
b2a965df5e
[subset] Add support for "--gids" option
...
cmap subsetting now retains entries associated with any glyph ids explicitly requested
2020-05-11 15:28:58 -07:00
Qunxin Liu
e53c44e326
[subset] temporarily revert previous cmap commit
...
Required in https://github.com/harfbuzz/harfbuzz/issues/2356
2020-04-25 12:21:22 +04:30
Ebrahim Byagowi
08428a15c3
minor, spacing
2020-04-24 23:45:17 +04:30
Ebrahim Byagowi
2dda6dd744
minor, tweak spacing
...
turn 8 spaces to tab, add space before Null/Crap
2020-04-20 16:18:29 +04:30
Ebrahim Byagowi
a224f4179f
Turn more of simple dagger chains to foreach
...
Less noise, as was agreed before and also applied in 385741d.
2020-03-13 08:33:34 +03:30
Ebrahim Byagowi
07acd1a042
[subset] Rename src_base args to base to match sanitize methods
...
So it will become easier to follow that serialize methods signatures should
match with their sanitize methods counterparts.
2020-03-08 23:39:26 +03:30
ariza
188a0a47c2
removed default base; replaced w/ bias if required
2020-03-08 22:59:43 +03:30
Michiharu Ariza
5ab50eebd7
collect_unicodes() with clamp, calling add_range()
...
Use add_range() instead of an inner loop, and clamp its input by the
number of glyphs the face has.
Although cmap12 and cmap13 use 32-bit hb_codepoint_t values, which is what
causes the timeout here, the face's maxp limits gids to 16 bits, at least for now.
Clamping to that limit both fixes the timeout and means we don't need to change
much here in order to support 32-bit gids someday.
Fixes #2204
2020-02-29 13:02:29 +03:30
Ebrahim Byagowi
e90213868b
Revert "collect_unicodes() to check gid < num_glyphs with cmap 12"
...
Didn't actually fix the case, and made the bots fail.
This reverts commit 15b43a4104.
2020-02-28 21:24:51 +03:30
Michiharu Ariza
15b43a4104
collect_unicodes() to check gid < num_glyphs with cmap 12
...
fixes #2204
2020-02-28 20:15:39 +03:30