Garret Rieger
d9660fd58a
[subset] Make cmap4 packing more optimal.
...
The current CMAP4 implementation uses whatever the current codepoint ranges are and then encodes them as individual glyph ids or as a delta if possible. However, it's often possible to save bytes by splitting up existing ranges and encoding parts of them using deltas where the cost of splitting the range is less than encoding each glyph individually.
2021-11-26 13:21:50 -07:00
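To illustrate the byte trade-off behind the cmap4 packing change above (d9660fd58a), here is a minimal sketch, not the HarfBuzz implementation. It assumes the standard format 4 layout, where each segment costs 8 bytes (endCode, startCode, idDelta, idRangeOffset) and each glyphIdArray entry costs 2 bytes. The function names are hypothetical, and the sketch only compares the two extremes for one contiguous codepoint range; a real packer would also consider mixing the two encodings.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Per-segment overhead in a format 4 subtable: endCode, startCode, idDelta
// and idRangeOffset are 2 bytes each.
constexpr std::size_t kSegmentBytes = 8;
// Each glyphIdArray entry is 2 bytes.
constexpr std::size_t kGlyphIdBytes = 2;

// Cost of one segment that lists every glyph id explicitly.
// `gids` holds the glyph ids of a contiguous run of codepoints.
std::size_t cost_as_glyph_id_array(const std::vector<uint16_t>& gids) {
  return kSegmentBytes + kGlyphIdBytes * gids.size();
}

// Cost of splitting the same range into maximal runs of consecutive glyph
// ids, each encoded as a delta segment with no glyphIdArray entries.
std::size_t cost_as_delta_runs(const std::vector<uint16_t>& gids) {
  if (gids.empty()) return 0;
  std::size_t runs = 1;
  for (std::size_t i = 1; i < gids.size(); ++i) {
    if (gids[i] != gids[i - 1] + 1) ++runs;  // a gap in gids starts a new run
  }
  return runs * kSegmentBytes;
}

// True when splitting into delta segments is strictly cheaper.
bool should_split(const std::vector<uint16_t>& gids) {
  return cost_as_delta_runs(gids) < cost_as_glyph_id_array(gids);
}
```

A range made of a few long consecutive runs favors splitting, while a range whose glyph ids are scattered is cheaper as a single glyphIdArray segment.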
Behdad Esfahbod
c852b86841
Rename HBGlyphID to HBGlyphID16
2021-09-19 16:30:12 -04:00
Garret Rieger
2bd911b8b4
[subset] handle cmap4 overflows.
...
If a cmap4 subtable overflows during serialization, drop it and the corresponding EncodingRecord. Don't drop the corresponding cmap12 table if it would have otherwise been removed.
2021-09-02 14:43:17 -06:00
Garret Rieger
b9a176e268
[subset] speedup cmap4 subsetting for large codepoint counts. ( #3178 )
...
glyphIdArray generation implementation was O(n^2). Refactored to use a hashmap to reduce complexity. After the change subset time for a 22k codepoint subset went from 7s to 0.7s.
2021-08-29 10:33:12 -06:00
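The kind of refactor described in the speedup commit above (b9a176e268) can be sketched as follows. This is an assumption-laden illustration rather than the actual HarfBuzz code: the `Mapping` type and `glyph_ids_for_range` are hypothetical names, and `std::unordered_map` stands in for whatever hashmap the library uses. The point is that indexing the mapping once replaces a linear search per codepoint.

```cpp
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical input: unsorted (codepoint, glyph id) pairs.
using Mapping = std::vector<std::pair<uint32_t, uint16_t>>;

// Builds the glyph id list for codepoints in [start, end].
// Searching `cp_to_gid` linearly for every codepoint would be O(n^2) overall;
// indexing it once in a hashmap makes each lookup O(1) on average.
std::vector<uint16_t> glyph_ids_for_range(const Mapping& cp_to_gid,
                                          uint32_t start, uint32_t end) {
  std::unordered_map<uint32_t, uint16_t> index(cp_to_gid.begin(),
                                               cp_to_gid.end());
  std::vector<uint16_t> out;
  // A 64-bit counter avoids wrap-around when end is the maximum 32-bit value.
  for (uint64_t cp = start; cp <= end; ++cp) {
    auto it = index.find(static_cast<uint32_t>(cp));
    // Unmapped codepoints get glyph 0 (.notdef) in this sketch.
    out.push_back(it == index.end() ? 0 : it->second);
  }
  return out;
}
```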
Garret Rieger
2c024dc3cb
[subset] prune redundant cmap12 subtables.
...
If the post-subset cmap12 table is equivalent to another cmap subtable, don't include the cmap12 table in the final subset. Matches change https://github.com/fonttools/fonttools/pull/2146 from fontTools.
2021-08-04 17:36:24 -06:00
Behdad Esfahbod
f0a1892ff9
[serialize] Remove unnecessary pointer indirection
2021-07-28 17:36:22 -06:00
Garret Rieger
9aa0ecef3f
[subset] de-duplicate the logic that finds unicodes corresponding to requested glyphs.
...
Move the logic into subset planning and then re-use the results in cmap and OS2 subsetting. Removes dependency on cmap from os2.
2021-07-14 17:31:47 -07:00
Behdad Esfahbod
092094f705
Use as_array() and range loops in a few places
2021-04-01 16:02:54 -06:00
Behdad Esfahbod
4dba749d83
Add SortedArray{16,32}Of<>
2021-03-31 16:09:39 -06:00
Behdad Esfahbod
ad28f973f3
Rename offset types to be explicit about their size
...
Add Offset16To<>, Offset24To<>, and Offset32To<> for most use-cases.
2021-03-31 13:00:07 -06:00
Garret Rieger
b14475d2ae
[subset] further changes to serializer error handling.
...
- Rename enum type and enum members.
- in_errors() now returns true for any error having been set. hb-subset now looks for offset overflow only errors to divert to repacker.
- Added INT_OVERFLOW and ARRAY_OVERFLOW enum values.
2021-03-18 10:51:26 -07:00
Garret Rieger
73ed59f7a6
[subset] store errors in the serializer as a flag set.
...
Make check_assign/check_equal specify the type of error to set.
2021-03-17 15:58:34 -07:00
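The flag-set approach from the commit above (73ed59f7a6) and its follow-up (b14475d2ae) might look roughly like this. It is a sketch under stated assumptions: the enum, its member names, and the `Serializer` struct are illustrative and are not taken from the HarfBuzz sources. It shows errors accumulating as bit flags, a check that reports true for any error, and a narrower check that lets a caller single out offset overflows.

```cpp
#include <cstdint>

// Hypothetical error flags; names are illustrative, not HarfBuzz's enum.
enum SerializeError : uint32_t {
  kNone           = 0,
  kOther          = 1u << 0,
  kOffsetOverflow = 1u << 1,
  kIntOverflow    = 1u << 2,
  kArrayOverflow  = 1u << 3,
};

struct Serializer {
  uint32_t errors = kNone;

  void set_error(SerializeError e) { errors |= e; }

  // True if any error flag has been set.
  bool in_error() const { return errors != kNone; }

  // True only when offset overflow is the sole recorded error, which is the
  // case a caller could hand off to a repacking step.
  bool only_offset_overflow() const { return errors == kOffsetOverflow; }

  // Assigns `value` to `out`; if the value does not survive the conversion,
  // records the caller-specified error type and returns false.
  template <typename T, typename V>
  bool check_assign(T& out, V value, SerializeError on_fail) {
    out = static_cast<T>(value);
    if (static_cast<V>(out) != value) {
      set_error(on_fail);
      return false;
    }
    return true;
  }
};
```

Letting the call site choose the error type is what allows the subsetter to distinguish a recoverable offset overflow from other failures.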
Behdad Esfahbod
6d94194497
Use auto in range-for-loop more
2021-02-19 17:10:06 -07:00
Garret Rieger
18ab8029d5
[ENOMEM] check vector status in cmap subsetting.
2020-08-02 00:30:17 +04:30
Ebrahim Byagowi
5a7cc7fd8b
minor spacing tweak
2020-07-29 08:33:38 +04:30
Ebrahim Byagowi
d0e2addd43
minor
2020-07-18 22:16:02 +04:30
Qunxin Liu
8e5bc535d1
[subset] call collect_mapping only when --gids option is used.
...
collect_mapping is time consuming as it iterates all codepoints in all
cmap subtables, only trigger it when necessary
2020-07-16 11:25:53 -07:00
Qunxin Liu
10d6605bbe
[subset] don't use << operator in collect_mapping
2020-05-15 11:04:59 -07:00
Qunxin Liu
b2a965df5e
[subset] Add support for "--gids" option
...
cmap subsetting now retains entries associated with any glyph ids explicitly requested
2020-05-11 15:28:58 -07:00
Qunxin Liu
e53c44e326
[subset] temporarily revert previous cmap commit
...
Required in https://github.com/harfbuzz/harfbuzz/issues/2356
2020-04-25 12:21:22 +04:30
Ebrahim Byagowi
08428a15c3
minor, spacing
2020-04-24 23:45:17 +04:30
Ebrahim Byagowi
2dda6dd744
minor, tweak spacing
...
turn 8 spaces to tab, add space before Null/Crap
2020-04-20 16:18:29 +04:30
Ebrahim Byagowi
a224f4179f
Turn more of simple dagger chains to foreach
...
Less noise, as was agreed before and also applied in 385741d
2020-03-13 08:33:34 +03:30
Ebrahim Byagowi
07acd1a042
[subset] Rename src_base args to base to match sanitize methods
...
This makes it easier to see that serialize method signatures should
match those of their sanitize counterparts.
2020-03-08 23:39:26 +03:30
ariza
188a0a47c2
removed default base; replaced w/ bias if required
2020-03-08 22:59:43 +03:30
Michiharu Ariza
5ab50eebd7
collect_unicodes() with clamp, calling add_range()
...
Use add_range instead of an inner loop, and clamp its input by the
number of glyphs the face has.
Although cmap12 and cmap13 use 32-bit hb_codepoint_t values, which is what
caused the timeout here, the face's maxp limits glyph ids to 16 bits, at
least for now. Clamping by that count fixes the timeout without requiring
many changes here, and leaves room to support 32-bit gids someday.
Fixes #2204
2020-02-29 13:02:29 +03:30
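The clamping idea in the commit above (5ab50eebd7) can be sketched as below. This is not the HarfBuzz code: `Group`, `Range`, and this free-standing `collect_unicodes` are hypothetical stand-ins, the sketch covers only format 12 style groups (where glyph ids increase along the group), and the output is a list of ranges playing the role of `add_range` calls on a set.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical format 12 style group: codepoints [start_char, end_char] map
// to consecutive glyph ids beginning at start_gid.
struct Group {
  uint32_t start_char;
  uint32_t end_char;
  uint32_t start_gid;
};

using Range = std::pair<uint32_t, uint32_t>;  // inclusive [first, last]

// Emits one range per group instead of looping over every codepoint, and
// clamps each range so no codepoint maps to a glyph id >= num_glyphs.
std::vector<Range> collect_unicodes(const std::vector<Group>& groups,
                                    uint32_t num_glyphs) {
  std::vector<Range> out;
  for (const Group& g : groups) {
    if (g.end_char < g.start_char || g.start_gid >= num_glyphs) continue;
    // 64-bit arithmetic so a group spanning the full 32-bit range is safe.
    uint64_t count = uint64_t(g.end_char) - g.start_char + 1;
    uint64_t max_count = uint64_t(num_glyphs) - g.start_gid;
    count = std::min(count, max_count);
    out.emplace_back(g.start_char, uint32_t(g.start_char + count - 1));
  }
  return out;
}
```

With the clamp, a malformed group that claims billions of codepoints costs one range insertion bounded by the glyph count rather than a loop over the whole claimed span.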
Ebrahim Byagowi
e90213868b
Revert "collect_unicodes() to check gid < num_glyphs with cmap 12"
...
Didn't actually fix the case, and made the bots fail.
This reverts commit 15b43a4104.
2020-02-28 21:24:51 +03:30
Michiharu Ariza
15b43a4104
collect_unicodes() to check gid < num_glyphs with cmap 12
...
fixes #2204
2020-02-28 20:15:39 +03:30
Garret Rieger
50129b03a1
Add a reverse () call to hb_array_t.
2020-02-26 11:09:54 -08:00
Garret Rieger
38c6598c1c
Switch to C style comments.
2020-02-26 11:09:54 -08:00
Garret Rieger
52b6e0baa0
When serializing cmap14 order the offsets from smallest to largest.
...
Current versions of OTS fail fonts with a cmap14 whose last offset does not point to a block at the end of the table.
2020-02-26 11:09:54 -08:00
ckitagawa
03f778cf3c
[cmap] remove dead code
2020-02-05 18:00:39 +03:30
Ebrahim Byagowi
a7f694d4b0
Merge branch 'subset_cblc' into master
2020-02-05 16:31:21 +03:30
ckitagawa
e128f80278
parent 777ba47b50
...
author ckitagawa <ckitagawa@chromium.org> 1579631743 -0500
committer ckitagawa <ckitagawa@chromium.org> 1580506176 -0500
[subset] Add CBLC support
2020-01-31 16:37:30 -05:00
Qunxin Liu
b6a8f5e63c
[subset] CMAP table subsetting fix
...
Not all codepoints smaller than 0xFFFF go to cmap4 table.
Only subset codepoints existing in each table.
This will also make harfbuzz consistent with fontTools' behavior
2020-01-31 10:49:44 -08:00
Qunxin Liu
c370da45ff
[subset] Cmap table: remove encodingRecord entry for empty cmap4 subtable
2020-01-23 17:23:55 -08:00
Qunxin Liu
1db2c1d0da
fix for cmap4 and OS_2 subsetting: maximum character code allowed is 0xFFFF
2020-01-09 10:00:32 -08:00
Behdad Esfahbod
6a60ca117c
[algs] Fold last other bsearch() in
...
Now truly have only one bsearch implementation.
2019-12-10 12:32:59 -06:00
Ebrahim Byagowi
486754a888
[serialize] Extract iterable copy, copy_all
2019-10-31 13:31:11 -07:00
Khaled Hosny
dd288840d6
[cmap] Check GID before adding ranges in format 4 & 12
...
Fixes https://github.com/harfbuzz/harfbuzz/issues/2031
2019-10-29 02:09:13 +02:00
Behdad Esfahbod
03028a5fe5
Revert "Don't include codepoint 0 in the results of collect_unicodes."
...
This reverts commit 14ad96ffbf.
This was wrong. My bad!
https://github.com/harfbuzz/harfbuzz/issues/2031
2019-10-28 13:46:56 -07:00
Garret Rieger
14ad96ffbf
Don't include codepoint 0 in the results of collect_unicodes.
...
It is always assumed to be the notdef glyph.
2019-10-28 12:56:04 -07:00
Ebrahim Byagowi
0558413f27
Minor, tweak spaces
2019-10-01 13:50:11 +03:30
Ebrahim Byagowi
035ec3d1b4
[cmap] remove has_format14, minor format
...
fixes #1986
2019-09-23 20:51:43 +03:30
Ebrahim Byagowi
385741d565
[cmap] Turn hb_apply into foreach where possible
2019-09-21 15:33:02 +04:30
Ebrahim Byagowi
1023c2cc6d
[cmap] minor
2019-09-21 15:33:02 +04:30
Ebrahim Byagowi
ead46eefe3
minor, use internal API instead public hb_set_has
2019-09-21 15:33:02 +04:30
Ebrahim Byagowi
d8af4e7701
[cmap] minor, turn 8 spaces to tab
2019-09-21 15:33:02 +04:30
Qunxin Liu
4315666283
[subset] updates according to review comments
2019-09-20 07:55:11 +09:00
Qunxin Liu
2583afa0eb
[subset] subsetting cmap14
2019-09-20 07:55:11 +09:00