11803 Commits

Author SHA1 Message Date
Behdad Esfahbod 77e704d1db [buffer] Add assert_unicode()/assert_glyphs() and use internally 2020-10-15 02:02:04 -06:00
Behdad Esfahbod 5ef0613909 [buffer] Add ensure_glyphs()/ensure_unicode()
Use in deserialize. To be used more.
2020-10-15 01:54:28 -06:00
Khaled Hosny 84dd65a874 [test] Remove timeout from test runners
See https://github.com/harfbuzz/harfbuzz/issues/2707#issuecomment-707744079

This was inconsistent as well: HB_TEST_SUBSET_FUZZER_TIMEOUT defaulted
to 12 in the test runner, but it was overridden to 50 in meson.build,
and meson has its own test timeout on top of that.
2020-10-15 00:49:02 -07:00
Behdad Esfahbod 3232e6f2a9 [buffer] Add hb_buffer_has_positions()
Fixes https://github.com/harfbuzz/harfbuzz/issues/2716
2020-10-15 00:20:17 -06:00
Khaled Hosny 97a093c52f [hb-subset] Improve error handling a bit
* Check that output-file option is actually set before trying to open
  it.
* Print file name and errno when opening the output file fails.
* Be more resilient when writing output file and use ferror() to check
  for errors.

Fixes https://github.com/harfbuzz/harfbuzz/issues/2711
2020-10-13 11:18:59 -07:00
Khaled Hosny fa771a7f85 [tests] Fix memory leak in test
To make valgrind bot happy.
2020-10-11 13:15:39 -07:00
David Corbett dec52006d9 Map BCP 47 tags to all macrolanguages
The general rule is that if a BCP 47 macrolanguage maps to an OpenType
language system tag, all its individual languages map to it too.
Previously, a tag like "prs" (Dari) would not map to the language system
tag ('FAR ') of its macrolanguage ("fa") because "prs" already has its
own language system tag ('DRI '). That exception has been removed: now
"prs" maps to 'DRI ' and falls back to 'FAR '.
2020-10-11 11:38:40 -07:00
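
The fallback rule described in dec52006d9 can be sketched as follows. This is a minimal illustration, not the generated tag table itself: the two dictionaries and the function name `ot_tags_for` are hypothetical, and only the "prs"/"fa" entries named in the commit message are included.

```python
# Hypothetical miniature tables. Only the entries mentioned in the
# commit message are listed; the real table is generated from registries.
DIRECT_TAGS = {
    "fa": ["FAR "],   # macrolanguage with its own language system tag
    "prs": ["DRI "],  # individual language with its own tag
}
MACROLANGUAGE_OF = {
    "prs": "fa",
}


def ot_tags_for(bcp47_tag):
    """Return candidate OpenType tags: the language's own tag first,
    then the tag(s) of its macrolanguage as a fallback."""
    tags = list(DIRECT_TAGS.get(bcp47_tag, []))
    macro = MACROLANGUAGE_OF.get(bcp47_tag)
    if macro is not None:
        for tag in DIRECT_TAGS.get(macro, []):
            if tag not in tags:
                tags.append(tag)
    return tags


print(ot_tags_for("prs"))
```

Before this change, the equivalent of the macrolanguage step would have been skipped for "prs" because it already had a direct tag.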
David Corbett 1d53268dfe Fix two-way mapping of "man" and 'MNK ' 2020-10-11 11:38:40 -07:00
David Corbett ab38cf6746 Map hy-arevmda to 'HYE ' instead of HYE0 2020-10-11 11:38:40 -07:00
David Corbett 916c5a9007 Consistently emit BCP 47 subtag scope suffixes 2020-10-11 11:38:40 -07:00
Behdad Esfahbod 1c05f6789b [buffer] Increase work limits
Our previous limit of 64 per input character was already hit
by David Corbett's under-development Duployan font.

Increase work limits by a factor of 16, and number of glyphs by a factor of 2.

Fixes https://github.com/harfbuzz/harfbuzz/issues/2707
2020-10-11 12:28:25 -06:00
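
The arithmetic behind the new work limit in 1c05f6789b, as a rough sketch. The constant and function names here are hypothetical, and the model assumes the budget scales linearly with input length; the actual implementation may apply floors or other adjustments that are not modelled.

```python
OLD_OPS_PER_CHAR = 64   # previous per-character limit, from the commit message
OPS_INCREASE = 16       # factor stated in the commit message
NEW_OPS_PER_CHAR = OLD_OPS_PER_CHAR * OPS_INCREASE


def work_budget(input_length, ops_per_char=NEW_OPS_PER_CHAR):
    """Hypothetical budget: a fixed number of operations per input character."""
    return input_length * ops_per_char


print(NEW_OPS_PER_CHAR, work_budget(10))
```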
Behdad Esfahbod b37edebfcb [buffer/deserialize] Do not clear() buffer upon content type mismatch
We return false. I don't see a reason to clear the buffer.
2020-10-09 22:27:56 -06:00
Behdad Esfahbod c396e1600f [buffer/deserialize] Accept arbitrary glyph names
Accepts escapes. Added TODO items for matching escaping in serialize().
2020-10-09 22:27:56 -06:00
Behdad Esfahbod 4a4eebcf86 [buffer/serialize] Minor renames in Ragel machines
As per my previous review on:
https://github.com/harfbuzz/harfbuzz/pull/2687
2020-10-09 22:27:56 -06:00
Behdad Esfahbod 540d2cdddb [tests/buffer] Revert unintended whitespace changes
From 9e5538d6a3

Tried squashing into it, but there were too many merge conflicts.
2020-10-09 22:27:55 -06:00
Behdad Esfahbod 78fb6a11af Whitespace 2020-10-09 22:27:55 -06:00
Behdad Esfahbod 140552cec9 [buffer/serialize] Only serialize empty buffers of CONTENT_TYPE_INVALID 2020-10-09 22:27:55 -06:00
Behdad Esfahbod 04658ec48f [tests/buffer] Update tests for previous commit 2020-10-09 22:27:55 -06:00
Behdad Esfahbod 8f5d8b155c [buffer] Buffer start <= end <= len requirement in (de-)serialize 2020-10-09 22:27:55 -06:00
Behdad Esfahbod 3b64122a7f [buffer] Fix immutable case with end_ptr==nullptr 2020-10-09 22:27:55 -06:00
Simon Cozens 7c0bc0bb92 Serialize invalid buffer to !! (text) or [] (json)
There is no generic deserialize - you have to choose glyphs or unicode - so there is no way to deserialize this buffer.
2020-10-09 22:27:55 -06:00
Simon Cozens 5bb88c4f45 Oops debug print 2020-10-09 22:27:55 -06:00
Simon Cozens f56eb402f0 Immutable buffer fix 2020-10-09 22:27:55 -06:00
Simon Cozens 150f391438 Prohibit mixed glyphs/unicode buffers in deserialization 2020-10-09 22:27:55 -06:00
Simon Cozens 6b1726b6ef Typos 2020-10-09 22:27:55 -06:00
Simon Cozens 3d3c87e7e7 Put the flags back in and serialize clusters.
Note that now JSON glyph buffers and Unicode buffers look very similar, except for the g/u property difference.
2020-10-09 22:27:55 -06:00
Simon Cozens 432a05b2af (Simple) tests for Unicode serialization/deserialization 2020-10-09 22:27:55 -06:00
Simon Cozens c03a2001b2 Deserialization routines for Unicode buffers 2020-10-09 22:27:55 -06:00
Simon Cozens c0716bb5dc Move delimiter addition into hb-buffer-serialize 2020-10-09 22:27:55 -06:00
Simon Cozens 36ede56962 Fix docs
Note the delimiters stuff isn’t true yet, will be working on that
2020-10-09 22:27:55 -06:00
Simon Cozens bb7b634cd0 Simplify JSON unicode serialization
It’s just an array of codepoints; no need to turn them into objects
2020-10-09 22:27:55 -06:00
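
A sketch of the idea in bb7b634cd0: a Unicode buffer serialized to JSON as a bare array of codepoints. The function name is hypothetical and the exact whitespace of the real serializer is not reproduced. A later commit in this log (3d3c87e7e7) puts flags and clusters back, so this reflects the format at this commit only.

```python
import json


def serialize_unicode_json(text):
    """Hypothetical helper: emit the codepoints of `text` as a JSON array."""
    return json.dumps([ord(ch) for ch in text])


# Round-trip check: the array decodes back to the same string.
encoded = serialize_unicode_json("ab")
decoded = "".join(chr(cp) for cp in json.loads(encoded))
print(encoded, decoded)
```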
Simon Cozens 57a528ab2c Convert tabs to spaces 2020-10-09 22:27:55 -06:00
Simon Cozens aff6a36266 Use auxbuffer for serialize_unicode_text 2020-10-09 22:27:55 -06:00
Simon Cozens a0203a28bb Use hb_buffer_serialize to trace in utils 2020-10-09 22:27:55 -06:00
Simon Cozens 58bcc1cedd Serialize Unicode buffers 2020-10-09 22:27:55 -06:00
Garret Rieger be33704c00 Add gpos 5 tests to meson build file. 2020-10-09 16:46:46 -07:00
David Corbett c39ab82c90 Fix usage text of gen-use-table.py 2020-10-06 16:51:40 -04:00
Garret Rieger aace09a3ad [subset] Use glyphset gsub for layout variation indices collection. 2020-10-06 10:26:17 -07:00
Garret Rieger 1d9801e012 [subset] In AnchorMatrix::subset eliminate the use of dynamically allocated vector. 2020-10-05 14:43:29 -07:00
Garret Rieger 093909b2ff [subset] Fix wrong offset base for subsetting LigatureArray.
Offsets from LigatureArray must be relative to the beginning of the LigatureArray table. For the serialization mechanism to use the correct beginning point the LigatureArray must be created using the push()/pop() mechanism. So convert LigatureArray subsetting to use serialize_subset() instead of a manually called serialize and subset.
2020-10-05 13:14:53 -07:00
Garret Rieger 147e93b910 [subset] Fixes to get GPOS 5 subsetting code compiling. 2020-10-01 16:45:57 -07:00
Qunxin Liu 3a0b05faf1 [subset] GPOS 5 MarkToLigature subsetting support 2020-10-01 15:59:16 -07:00
Garret Rieger 718bf5aab3 [subset] only keep features reachable from script in the final subset.
Matches fontTools behaviour.
2020-09-29 13:16:01 -07:00
Garret Rieger e583505334 [subset] Use plan->glyphset_gsub instead of plan->glyphset for GSUB/GPOS
This matches fontTools behaviour. glyphset_gsub does not contain gids added from closing over composite glyphs in glyf, since these cannot participate in GSUB/GPOS processing.
2020-09-29 11:16:15 -07:00
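
The distinction drawn in e583505334 can be illustrated with a small sketch. The names mirror the commit message, but the data and the helper `close_over_composites` are hypothetical: `components` maps a composite glyph id to the glyph ids it references.

```python
def close_over_composites(glyphs, components):
    """Return `glyphs` plus every glyph reachable through composite references."""
    closed = set(glyphs)
    stack = list(glyphs)
    while stack:
        gid = stack.pop()
        for child in components.get(gid, ()):
            if child not in closed:
                closed.add(child)
                stack.append(child)
    return closed


# Hypothetical font: glyph 5 is a composite built from glyphs 20 and 21.
components = {5: [20, 21]}

glyphset_gsub = {1, 5}  # what layout (GSUB/GPOS) subsetting should see
glyphset = close_over_composites(glyphset_gsub, components)  # what glyf needs

print(sorted(glyphset_gsub), sorted(glyphset))
```

Glyphs 20 and 21 are only pulled in as outline components, so layout tables subset against `glyphset_gsub` never need to mention them.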
David Corbett a99e8721bf [use] Fix tests with MSVC 2020-09-29 09:54:33 -04:00
Garret Rieger 010accb3d5 [subset] Add additional test cases for the Amiri tests. 2020-09-28 17:39:09 -07:00
Garret Rieger 940e1c6f98 [subset] ChainContextFormat3 - don't subset glyph sequences.
The backtrack, input, and lookahead sequence must be matched in their entirety so these sequences should not be subset. If any of the coverage tables in a sequence subsets to empty then the whole subtable should be dropped since it's not possible for this lookup to be activated.
2020-09-28 17:22:01 -07:00
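
A minimal sketch of the rule in 940e1c6f98, assuming each coverage table is modelled as a set of glyph ids. The function name and data layout are hypothetical. Each coverage table is filtered to the retained glyphs, but the sequences keep their length, and the subtable is dropped as soon as any position becomes empty.

```python
def subset_chain_context_format3(backtrack, input_seq, lookahead, retained):
    """Each sequence is a list of coverage sets. Returns the filtered
    (backtrack, input, lookahead) tuple, or None to drop the subtable."""
    result = []
    for sequence in (backtrack, input_seq, lookahead):
        filtered = [coverage & retained for coverage in sequence]
        if any(not coverage for coverage in filtered):
            # One position can no longer match, so the lookup can never fire.
            return None
        result.append(filtered)
    return tuple(result)


retained = {1, 2, 3}
kept = subset_chain_context_format3([{1, 9}], [{2}, {3, 4}], [], retained)
dropped = subset_chain_context_format3([{9}], [{2}, {3, 4}], [], retained)
print(kept, dropped)
```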
Garret Rieger e31c2690f8 [subset] remove unnecessary returns. 2020-09-28 16:51:25 -07:00
Garret Rieger 3271a7cdaa [subset] Remove redundant langsys from Amiri test font.
FontTools removes these when subsetting but harfbuzz does not yet support redundant langsys removal. So this gets the Amiri tests passing for now.
2020-09-28 16:46:15 -07:00
Garret Rieger ad241f9917 [subset] check that sub rules in ChainContextFormat 1 and 2 intersect the glyphs set before recursing during closure lookups. 2020-09-28 15:26:13 -07:00
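
The check described in ad241f9917 can be sketched like this. It is modelled loosely on format 1, where a rule lists glyph ids directly; format 2 works with classes instead and is not modelled. The dictionary layout and the `visit_lookup` callback are hypothetical.

```python
def closure_lookups(rules, glyphs, visit_lookup):
    """Recurse into a rule's lookups only if the rule can match within `glyphs`."""
    for rule in rules:
        if not all(gid in glyphs for gid in rule["sequence"]):
            continue  # rule cannot fire on the retained glyphs, skip it
        for lookup_index in rule["lookups"]:
            visit_lookup(lookup_index)


rules = [
    {"sequence": [1, 2], "lookups": [10]},
    {"sequence": [1, 99], "lookups": [11]},  # glyph 99 is not retained
]
visited = []
closure_lookups(rules, {1, 2, 3}, visited.append)
print(visited)
```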