Commit Graph

5459 Commits

Author SHA1 Message Date
Behdad Esfahbod bb35725cd7 [kerx/morx] More end-of-text protection 2018-10-15 11:05:10 -07:00
Ebrahim Byagowi 8f3048a1f8 [dump-emoji] minor 2018-10-15 12:16:47 +03:30
Ebrahim Byagowi 27e095a613 [dump-emoji] better explanation of the usage 2018-10-15 01:41:49 -07:00
Behdad Esfahbod 8dc6296818 [ot-font] Implement TrueType v_origin
Fixes https://github.com/harfbuzz/harfbuzz/issues/537
2018-10-15 01:09:05 -07:00
Behdad Esfahbod 6e07076fd0 [blob] Fix UBSan error 2018-10-14 22:22:45 -07:00
Behdad Esfahbod fc812faaa9 [CBDT] Fix more offsetting issues
Fixes https://github.com/harfbuzz/harfbuzz/issues/960

dump-emoji still segfaults.  Needs debugging.
2018-10-14 21:32:25 -07:00
Behdad Esfahbod 6aee3bb87c [CBDT] Fix offset handling
Fixes https://github.com/harfbuzz/harfbuzz/issues/960
2018-10-14 21:08:42 -07:00
Behdad Esfahbod da744c6b3e [CBDT] More UnsizedArrayOf cleanup 2018-10-14 20:51:45 -07:00
Behdad Esfahbod 2995b4465b [CBDT] Simplify sanitize 2018-10-14 20:37:57 -07:00
Behdad Esfahbod 1c76c8f6ff [morx] Handle end-of-text conditions in Insertion
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=10955
2018-10-14 19:39:31 -07:00
Behdad Esfahbod 60c1397673 [buffer] Fix output_glyph at end of buffer
Part of https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=10955
2018-10-14 19:38:14 -07:00
Behdad Esfahbod 7efa38257b [aat] More protection against buffer fail 2018-10-14 19:30:44 -07:00
Behdad Esfahbod e1add2a275 [hmtx] Whitespace 2018-10-14 16:26:03 -07:00
Behdad Esfahbod 62376a7d98 Ignore signed-integer-overflow while kerning
Fixes https://github.com/harfbuzz/harfbuzz/issues/1247
2018-10-14 15:20:50 -07:00
Behdad Esfahbod 40f2b9355c [kerx] Fix Format1 sanitize
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=10948
2018-10-14 14:56:32 -07:00
Behdad Esfahbod 44af1f93ee [aat] Whitespace 2018-10-14 14:52:17 -07:00
Behdad Esfahbod 56b8dd17f6 [aat] Finish off massaging table 2018-10-13 19:03:33 -04:00
Behdad Esfahbod e0c5e0d91b [aat] WIP remove feature mapping here from hb-coretext
Need to map enum values to numerics since we don't have CoreText headers.
2018-10-13 18:46:52 -04:00
Behdad Esfahbod cb05774913 [coretext] Prepare AAT feature mapping to be moved 2018-10-13 17:03:32 -04:00
Behdad Esfahbod de6e414c56 [kerx] Sanitize more 2018-10-13 13:48:22 -04:00
Behdad Esfahbod 71f76f2f39 [kerx] Fix-up previous commit
A "&" was missing.  Go back to using pointers that are less error-prone.
2018-10-13 13:36:27 -04:00
Behdad Esfahbod 6d4b054234 [kerx] Use sanitizer instead of handcoded runtime sanitization 2018-10-13 12:20:33 -04:00
Behdad Esfahbod 5733113662 [kerx] Wire up context down to get_kerning 2018-10-13 12:16:12 -04:00
Behdad Esfahbod c4502833b7 [kerx] Use sanitizer.get_num_glyphs() instead of face->get_num_glyphs() 2018-10-13 12:09:59 -04:00
Behdad Esfahbod fc45e698f2 [kerx] Protect against overflows 2018-10-13 12:09:59 -04:00
Behdad Esfahbod ed2ee78136 [hangul] Fix use-after-free issue
out_info might have moved since we copied its position into the local
info var.

Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=894937
2018-10-13 12:09:59 -04:00
Ebrahim Byagowi 63109432cf Cosmetic and minor changes 2018-10-13 07:23:33 -04:00
Behdad Esfahbod c0a6814b49 Touch up new API
New API:
+hb_ot_layout_feature_get_name_ids()
+hb_ot_layout_feature_get_characters()
2018-10-12 16:06:39 -04:00
Behdad Esfahbod 477bc9aafe Add hb-ot-name.h
Actual name-fetching API to come later.

New API:
hb_name_id_t
HB_NAME_ID_INVALID
2018-10-12 16:06:39 -04:00
Ebrahim Byagowi dc49bd8d81 Add two APIs for getting stylistic set labels
* hb_ot_layout_feature_get_characters
* hb_ot_layout_feature_get_name_ids

However HarfBuzz currently doesn't expose an API for retrieving the actual
information associated with NameId from the `name` table and that should be
done separately.
2018-10-12 16:06:39 -04:00
Behdad Esfahbod e9f9c0d81c [sanitize] Reorder condition to silence bogus gcc warning
Was giving a dozen of:

../../src/hb-machinery.hh: In member function ‘bool AAT::ankr::sanitize(hb_sanitize_context_t*) const’:
../../src/hb-machinery.hh:307:23: warning: missed loop optimization, the loop counter may overflow [-Wunsafe-loop-optimizations]
     bool ok = --this->max_ops > 0 &&
               ~~~~~~~~~~~~~~~~~~~~~~
        this->start <= p &&
        ~~~~~~~~~~~~~~~~~~~
        p <= this->end &&
        ~~~~~~~~~~~~~~~^~
        (unsigned int) (this->end - p) >= len;
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I believe those are bogus, but this silences them and does not introduce
logic issues I believe.
2018-10-12 16:06:39 -04:00
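
The condition quoted in the warning above is a bounds check with an operation budget. A minimal sketch of such a check follows, assuming a hypothetical `RangeChecker` struct whose fields only mirror the names visible in the warning (`start`, `end`, `max_ops`). Placing the budget test last is one possible reordering and is an assumption here; the commit message does not show the final order.

```cpp
#include <cassert>

// Hypothetical stand-in for a sanitizer context. Only the field names are
// taken from the warning text; everything else is illustrative.
struct RangeChecker {
  const char *start;  // first byte of the trusted buffer
  const char *end;    // one past the last byte
  int max_ops;        // remaining budget of successful range checks

  // True if [p, p + len) lies inside [start, end) and budget remains.
  // The budget is tested last, so a failed range test does not consume it.
  bool check_range(const char *p, unsigned int len) {
    return start <= p &&
           p <= end &&
           (unsigned int) (end - p) >= len &&
           max_ops-- > 0;
  }
};
```

With a budget of 2, two in-range checks succeed and a third fails even though its range is valid.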
Behdad Esfahbod 1a6b5ac6c3 Add HB_DEPRECATED_FOR and mark relevant symbols 2018-10-12 16:06:39 -04:00
Behdad Esfahbod c9413d7bb5 [graphite] Add HB_DEPRECATED annotation 2018-10-12 16:06:39 -04:00
Behdad Esfahbod 68c86af187 Always compile deprecated symbols
We haven't been keeping this updated.  So, while we don't expose the
symbols in the headers if HB_DISABLE_DEPRECATED is defined, we still
always build them.
2018-10-12 16:06:39 -04:00
David Corbett c55100000b Add missing colons to GObject annotations 2018-10-11 22:47:35 -04:00
David Corbett 1e816d62ef Fix Indic script tags in Graphite 2018-10-11 20:51:08 -04:00
Behdad Esfahbod bf8469be9a Attach CursivePositioning backwards, not forward
This is how Uniscribe does it.  So, adjust.  This is only relevant
to fonts that apply cursive positioning from a contextual lookup.

Fixes https://github.com/harfbuzz/harfbuzz/issues/1181
2018-10-11 20:45:40 -04:00
Behdad Esfahbod bdb53ca24a [myanmar] Implement Zawgyi shaper
Enabled if script tag 'Qaag' is passed to HarfBuzz.  Disables mark
advance-zeroing and fallback mark-positioning.

Fixes https://github.com/harfbuzz/harfbuzz/issues/1162
2018-10-11 20:20:29 -04:00
Behdad Esfahbod 00c5c4a79d [myanmar] Shuffle 2018-10-11 20:15:31 -04:00
Behdad Esfahbod ec8f493bf9 [graphite] Remove assert 2018-10-11 20:15:00 -04:00
Behdad Esfahbod 5646dcbd11 Minor 2018-10-11 19:39:07 -04:00
Behdad Esfahbod 654365dc89 Pass indic3 tags to USE shaper
Fixes https://github.com/harfbuzz/harfbuzz/issues/539
2018-10-11 17:51:21 -04:00
David Corbett 28d091d045 Parse Indic3 tags 2018-10-11 17:44:13 -04:00
Behdad Esfahbod 2c824d3644 [aat] Fix two wrongs that made a right before!
Unfortunately our static asserts (DEFINE_SIZE_STATIC) don't actually
fail when used in templates, thanks to SFINAE.  Le sighs.

Probably fixes https://oss-fuzz.com/v2/testcase-detail/5740171484463104
2018-10-11 16:43:05 -04:00
Behdad Esfahbod e940530c97 [aat] Fix mul overflow
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=10897
2018-10-11 15:56:17 -04:00
Behdad Esfahbod 0744a02cb1 [arabic] Update to latest UTR#53
From Lorna Evans: "That was a new character added to Unicode 11.0"
2018-10-11 15:14:18 -04:00
Behdad Esfahbod 4f9e36e8cf [graphite] Remove deprecated symbol use 2018-10-11 14:32:59 -04:00
Behdad Esfahbod da591f2a9d Whitespace 2018-10-11 14:30:15 -04:00
Behdad Esfahbod 4d205f0462 [graphite] Fix deva/dev2 resolution
See https://github.com/harfbuzz/harfbuzz/pull/730#issuecomment-428277800
2018-10-11 14:25:48 -04:00
Behdad Esfahbod 8061664ad1 Add doc stubs for recently added API
Thanks to David Corbett who revamped our script and language processing
and implemented full BCP 47 support.

https://github.com/harfbuzz/harfbuzz/pull/730

New API:
+hb_ot_layout_table_select_script()
+hb_ot_layout_script_select_language()
+HB_OT_MAX_TAGS_PER_SCRIPT
+HB_OT_MAX_TAGS_PER_LANGUAGE
+hb_ot_tags_from_script_and_language()
+hb_ot_tags_to_script_and_language()

Deprecated API:
-hb_ot_layout_table_choose_script()
-hb_ot_layout_script_find_language()
-hb_ot_tags_from_script()
-hb_ot_tag_from_language()
2018-10-11 14:17:17 -04:00
Behdad Esfahbod cf975ac653 Remove use of deprecated function 2018-10-11 14:07:44 -04:00
David Corbett 66790d64c7 Increase HB_OT_MAX_TAGS_PER_SCRIPT to 3
No script has 3 tags yet, but the plan is for the Indic scripts to each
get a third tag someday.
2018-10-11 13:54:28 -04:00
David Corbett bca7a16938 Update language system tag registry to OT 1.8.3 2018-10-11 13:54:28 -04:00
David Corbett 7f1fbfe2e3 Add hb_ot_tags_to_script_and_language 2018-10-11 13:54:28 -04:00
David Corbett 3f8877473f Switch on the first char of a complex language tag
This results in a tenfold speed-up for the common case of tags that are
not complex, in the sense of `hb_ot_tags_from_complex_language`.
2018-10-11 13:54:28 -04:00
David Corbett a754d44195 Map Quechua languages to closest ones with tags
OpenType only officially maps four ISO 639 codes to Quechua languages,
but prior versions of HarfBuzz also mapped qu to 'QUZ '. Because qu is a
macrolanguage, the mapping now applies to all individual Quechua
languages. OpenType calls 'QUZ ' "Quechua", but it really corresponds to
Cusco Quechua, so the individual Quechua languages should not all
necessarily be mapped to it.
2018-10-11 13:54:28 -04:00
David Corbett 7c7cb2a989 Match extlang subtags
If the second subtag of a BCP 47 tag is three letters long, it denotes
an extended language. The tag converter ignores the language subtag and
uses the extended language instead.

There are some grandfathered exceptions, which are handled earlier.
2018-10-11 13:54:28 -04:00
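
The extlang rule described above can be sketched as follows. `effective_language` is a hypothetical helper, not a HarfBuzz function, and it deliberately skips the grandfathered exceptions the commit mentions. It also requires the second subtag to be alphabetic, since three-digit region codes such as 419 are not extended languages.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the subtag a converter following the rule
// above would treat as the language. Grandfathered tags are NOT handled.
std::string effective_language(const std::string &tag) {
  std::size_t first_end = tag.find('-');
  std::string primary = tag.substr(0, first_end);
  if (first_end == std::string::npos) return primary;

  // Isolate the second subtag.
  std::size_t second_start = first_end + 1;
  std::size_t second_end = tag.find('-', second_start);
  std::string second = tag.substr(
      second_start,
      second_end == std::string::npos ? std::string::npos
                                      : second_end - second_start);

  // An extended language subtag is exactly three letters.
  if (second.size() != 3) return primary;
  for (unsigned char c : second)
    if (!std::isalpha(c)) return primary;
  return second;
}
```

For example, `effective_language("zh-yue")` yields `yue`, while `zh-HK` and `es-419` keep their primary language subtags.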
David Corbett 2f1f961cc0 Autogenerate the BCP 47 to OpenType mappings
The new script, gen-tag-table.py, generates `ot_languages` automatically
from the [OpenType language system tag registry][ot] and the [IANA
Language Subtag Registry][bcp47] with some manual modifications. If an
OpenType tag maps to a BCP 47 macrolanguage, all the macrolanguage's
individual languages are mapped to the same OpenType tag, except for
individual languages with their own OpenType mappings. Deprecated
BCP 47 tags are canonicalized.

[ot]: https://docs.microsoft.com/en-us/typography/opentype/spec/languagetags
[bcp47]: https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry

Some OpenType tags correspond to multiple ISO 639 codes. The mapping
from ISO 639 codes lists OpenType tags in priority order, such that more
specific or more likely tags appear first.

Some OpenType tags have no corresponding ISO 639 code in the registry so
their mappings use BCP 47 subtags besides the language. For example, any
BCP 47 tag with a fonipa variant subtag is mapped to 'IPPH', and 'IPPH'
is mapped back to und-fonipa.

Other OpenType tags have no corresponding ISO 639 code because it is not
clear what they are for. HarfBuzz just ignores these tags.

One such ignored tag is 'ZHP ' (Chinese Phonetic). It probably means
zh-Latn. However, it is used in Microsoft JhengHei and Microsoft YaHei
with the script tag 'hani', implying that it is not a romanization
scheme after all. It would be simple enough to add this mapping to
gen-tag-table.py once a definitive mapping is determined.

The manual modifications are mainly either obvious mappings that the
OpenType registry omits or mappings for compatibility with previous
versions of HarfBuzz. Some of the old mappings were discarded, though,
for homophonous language names. For example, OpenType maps 'KUI ' to
kxu; previous versions of HarfBuzz also mapped it to kvd, because kvd
and kxu both happen to be called "Kui".

gen-tag-table.py also generates a function to convert multi-subtag tags
like el-polyton and zh-HK to OpenType tags, replacing `ot_languages_zh`
and the hard-coded list of special cases in `hb_ot_tags_from_language`.
It also generates a function to convert OpenType tags to BCP 47,
replacing the hard-coded list of special cases in
`hb_ot_tag_to_language`.
2018-10-11 13:54:28 -04:00
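
As an illustration of the fonipa case described above, the sketch below tests whether a BCP 47 tag carries a given variant subtag. `has_variant` is a hypothetical helper and not the generated code. It compares whole subtags case-insensitively, skips the leading language subtag, and, as a simplification, does not distinguish variants from extension or private-use subtags.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: true if any subtag after the first equals `variant`,
// compared case-insensitively and only on whole subtags.
bool has_variant(const std::string &tag, const std::string &variant) {
  std::istringstream in(tag);
  std::string subtag;
  bool first = true;
  while (std::getline(in, subtag, '-')) {
    if (first) { first = false; continue; }  // the first subtag is the language
    if (subtag.size() != variant.size()) continue;
    bool same = true;
    for (std::size_t i = 0; i < subtag.size(); ++i) {
      if (std::tolower((unsigned char) subtag[i]) !=
          std::tolower((unsigned char) variant[i])) {
        same = false;
        break;
      }
    }
    if (same) return true;
  }
  return false;
}
```

A caller following the commit message would map any tag for which `has_variant(tag, "fonipa")` holds to 'IPPH'.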
David Corbett 2c7d4db7af Deprecate obsolete functions
`hb_ot_tags` replaces `hb_ot_tags_from_script` and
`hb_ot_tag_from_language`.

`hb_ot_layout_table_select_script` replaces
`hb_ot_layout_table_choose_script`.

`hb_ot_layout_script_select_language` replaces
`hb_ot_layout_script_find_language`.
2018-10-11 13:54:28 -04:00
David Corbett 91067716f5 Refactor the selection of script and language tags
The old hb-ot-tag.cc functions, `hb_ot_tags_from_script` and
`hb_ot_tag_from_language`, are now wrappers around a new function:
`hb_ot_tags`. It converts a script and a language to arrays of script
tags and language tags. This will make it easier to add new script tags
to scripts, like 'dev3'. It also allows for language fallback chains;
nothing produces more than one language yet though.

Where the old functions return the default tags 'DFLT' and 'dflt',
`hb_ot_tags` returns an empty array. The caller is responsible for
using the default tag in that case.

The new function also adds a new private use subtag syntax for script
overrides: "x-hbscabcd" requests a script tag of 'abcd'.

The old hb-ot-layout.cc functions, `hb_ot_layout_table_choose_script` and
`hb_ot_layout_script_find_language`, are now wrappers around the new
functions `hb_ot_layout_table_select_script` and
`hb_ot_layout_script_select_language`. They are essentially the same as
the old ones plus a tag count parameter.

Closes #495.
2018-10-11 13:54:28 -04:00
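
The "x-hbscabcd" override syntax mentioned above can be sketched like this. `script_override` is a hypothetical function, not the HarfBuzz implementation. Padding tags shorter than four characters with spaces is an assumption based on OpenType tags being four bytes long; the commit message does not say how short tags are handled.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the 4-character script tag requested through
// an "x-hbsc" private-use subtag, or an empty string if there is none.
std::string script_override(const std::string &lang) {
  const std::string marker = "x-hbsc";
  std::size_t pos = lang.find(marker);
  // The marker must begin a subtag: at the start or right after a hyphen.
  while (pos != std::string::npos && pos != 0 && lang[pos - 1] != '-')
    pos = lang.find(marker, pos + 1);
  if (pos == std::string::npos) return "";

  std::string tag;
  for (std::size_t i = pos + marker.size();
       i < lang.size() && tag.size() < 4 &&
       std::isalnum((unsigned char) lang[i]);
       ++i)
    tag += lang[i];

  if (tag.empty()) return "";
  tag.resize(4, ' ');  // assumption: pad short tags with spaces
  return tag;
}
```

For example, `script_override("x-hbscabcd")` yields `abcd`, and a tag without the marker yields an empty string, in which case the caller would use its normal script selection.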
David Corbett a03f5f4dfb Replace "ISO 639" with "BCP 47"
`hb_language_from_string` accepts not only ISO 639 but also BCP 47. Not
all ISO 639 codes are valid BCP 47 tags but the function does not accept
overlong language subtags anyway.
2018-10-11 13:54:28 -04:00
Behdad Esfahbod 0b9d60e1a1 [aat] Apply kerx if GPOS kern was not applied
Ned tells me this is what Apple does.
2018-10-11 13:26:58 -04:00
Behdad Esfahbod b59a428af0 Minor 2018-10-11 13:24:17 -04:00
Behdad Esfahbod 04f72e8990 [trak] Implement extrapolation
This concludes trak, as well as AAT shaping support!
2018-10-11 11:25:07 -04:00
Behdad Esfahbod d6a12dba6d [trak] Fix, and hook up
Works beautifully!  Test coming.
2018-10-11 11:10:06 -04:00
Behdad Esfahbod 3d7dea6dfd [trak] Handle nSizes=0 and 1 2018-10-11 10:32:08 -04:00
Behdad Esfahbod 451f3de521 [trak] Fix counting 2018-10-11 10:30:32 -04:00
Behdad Esfahbod a5be380cae [trak] More 2018-10-11 10:29:02 -04:00
Behdad Esfahbod d06c4a867f [trak] Only adjust around first glyph
Assumes graphemes only have one base glyph.
2018-10-11 10:22:01 -04:00
Behdad Esfahbod 071a2cbcdd [trak] Clean up 2018-10-11 10:18:46 -04:00
Behdad Esfahbod fbbd926dba [kerx] Implement Format4 action_type=1 contour-point-based attachment
Untested.

This concludes kerx table support!
2018-10-11 01:22:29 -04:00
Behdad Esfahbod b6bc0d4ff6 [kerx] Implement Format4 action_type=2 coordinate-based attachment
Untested.
2018-10-11 01:17:57 -04:00
Behdad Esfahbod 1622ba5943 [kerx] Implement Format4 'ankr'-based mark attachment
Tested with Kannada MN:

$ HB_OPTIONS=aat ./hb-shape Kannada\ MN.ttc -u 0CCD,0C95,0CD6
[kn_ka.vattu=0+230|kn_ai_length_mark=1@326,0+607]
2018-10-11 01:17:33 -04:00
Behdad Esfahbod 7bb4da7d95 [aat] Wire up 'ankr' table to apply context 2018-10-11 00:52:07 -04:00
Behdad Esfahbod 28f0367aab [kerx] Flesh out Format4
Doesn't apply actions yet.
2018-10-11 00:46:12 -04:00
Behdad Esfahbod 947962a287 [ankr] Implement table access 2018-10-10 23:07:03 -04:00
Behdad Esfahbod 7281cb3eeb [ankr] Start fixing 2018-10-10 22:56:52 -04:00
Behdad Esfahbod 34caadc5c7 Ugh. Re-enable accidentally disabled GPOS 2018-10-10 22:17:07 -04:00
Behdad Esfahbod f7c45bc33e [kerx] Allow granularly disabling kerning 2018-10-10 22:15:13 -04:00
Behdad Esfahbod 2b72c4b63d [kerx] Comment 2018-10-10 21:53:14 -04:00
Behdad Esfahbod 9f450f07b0 [kerx] Make Format1 work
Tested using Kannada MN:

$ HB_OPTIONS=aat ./hb-shape Kannada\ MN.ttc -u 0C95,0CCd,C95,CCD
[kn_ka.virama=0+1299|kn_ka.vattu=0+115|_blank=0@-115,0+385]

$ HB_OPTIONS=aat ./hb-shape Kannada\ MN.ttc -u 0C95,0CCd,C95,CCD --features=-kern
[kn_ka.virama=0+1799|kn_ka.vattu=0+230|_blank=0+0]

I don't see the GPOS table in the font do the same.  ¯\_(ツ)_/¯
2018-10-10 21:46:58 -04:00
Behdad Esfahbod 504cb68fc9 Disable mark advance zeroing as well as mark fallback positioning if doing kerx 2018-10-10 21:29:46 -04:00
Behdad Esfahbod 8496753796 [kerx] Implement Format1
Untested.
2018-10-10 21:18:37 -04:00
Behdad Esfahbod c9165f5450 [kerx] More UnsizedArrayOf<> 2018-10-10 20:43:21 -04:00
Behdad Esfahbod ca54eba484 [kerx] Fix bound-checking error introduced a couple of commits ago 2018-10-10 20:41:16 -04:00
Behdad Esfahbod 339036dd97 [kerx] Start fleshing out Format1 2018-10-10 20:37:22 -04:00
Behdad Esfahbod ab1f30bd05 [kerx] Implement Format6
Untested.  The only Apple font shipping with this format is San Francisco fonts
that use this for their kerx variation tables, which we don't support.
2018-10-10 20:10:20 -04:00
Behdad Esfahbod c9a2ce9e05 [kerx] Move bounds-checking to subtable length itself 2018-10-10 20:00:44 -04:00
Behdad Esfahbod 22955b23cd [kerx] Start fleshing out Format6 2018-10-10 19:58:20 -04:00
Behdad Esfahbod f6aaad9b4f [kerx] When rejecting variable kerning, also check for tupleCount 2018-10-10 19:20:06 -04:00
Behdad Esfahbod 7ed5366d3c [kerx] No-op
Tested that Format0 works with Kannada MN font:

$ make -j5 lib -s && HB_OPTIONS=aat ./hb-shape Kannada\ MN.ttc -u 0C95,0CC2
[kn_ka=0+1000|kn_matra_uu=0@-30,0+1345]

$ make -j5 lib -s && HB_OPTIONS=aat ./hb-shape Kannada\ MN.ttc -u 0C95,0CC2 --features=-kern
[kn_ka=0+1030|kn_matra_uu=0+1375]

Note that GPOS does the same with 'dist' feature, and applies the whole difference to the
same glyph:

$ make -j5 lib -s && ./hb-shape Kannada\ MN.ttc -u 0C95,0CC2
[kn_ka=0+970|kn_matra_uu=0+1375]

$ make -j5 lib -s && ./hb-shape Kannada\ MN.ttc -u 0C95,0CC2 --features=-dist
[kn_ka=0+1030|kn_matra_uu=0+1375]
2018-10-10 19:12:27 -04:00
Behdad Esfahbod 7fa69e92ca Comment 2018-10-10 19:02:32 -04:00
Behdad Esfahbod 7e6e5bf614 Fix option string matching 2018-10-10 18:59:07 -04:00
Behdad Esfahbod 5d34164d98 [kern/kerx] Fix offset base
Disable kern Format2.

Fix kerx Format2.  Manually tested this with Tamil MN font and it works:

$ HB_OPTIONS=aat ./hb-shape Tamil\ MN.ttc -u 0B94,0B95
[tgv_au=0+3435|tgc_ka=1@-75,0+1517]

$ HB_OPTIONS=aat ./hb-shape Tamil\ MN.ttc -u 0B94,0B95 --features=-kern
[tgv_au=0+3510|tgc_ka=1+1592]
2018-10-10 18:23:09 -04:00
Behdad Esfahbod 60f86d32d7 [kerx] Don't loop over kerning subtables if kerning disabled 2018-10-10 18:10:05 -04:00
Behdad Esfahbod 38a7a8a89e Allow HB_OPTIONS=aat to prefer AAT tables over OT
Fixes https://github.com/harfbuzz/harfbuzz/issues/322
2018-10-10 17:44:46 -04:00
Behdad Esfahbod 44f09afd5b [kerx] Skip variation subtables 2018-10-10 17:32:32 -04:00
Behdad Esfahbod 1e8fdd285f Remove HAVE_OT
We never tested compiling without it.  Just kill it.  We always build
our own shaper.
2018-10-10 16:32:35 -04:00
Behdad Esfahbod 7727e73756 [kerx] Actually hook up, and fix crash 2018-10-10 13:24:51 -04:00
Behdad Esfahbod b3390990f5 Add per-subtable set-digests
This speeds up Roboto shaping by ~10%.  I was hoping for more.
Still, good defense against lookups with many subtables.
2018-10-10 12:13:25 -04:00
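
To illustrate why a per-subtable digest helps with lookups that have many subtables, here is a deliberately simplified sketch. `GlyphDigest` is hypothetical and much cruder than whatever HarfBuzz uses: it keeps one bit per `glyph % 64`. A `false` answer means the glyph is definitely not covered, so the subtable can be skipped cheaply; a `true` answer only means "possibly present" and the real coverage lookup still has to run.

```cpp
#include <cassert>
#include <cstdint>

// Simplified, hypothetical digest: one bit per (glyph % 64).
struct GlyphDigest {
  std::uint64_t mask = 0;

  // Record that a subtable covers this glyph.
  void add(unsigned int glyph) {
    mask |= std::uint64_t{1} << (glyph % 64);
  }

  // false: definitely absent. true: possibly present (false positives occur).
  bool may_have(unsigned int glyph) const {
    return (mask & (std::uint64_t{1} << (glyph % 64))) != 0;
  }
};
```

After adding glyphs 5 and 70, the digest reports glyph 6 as possibly present (70 and 6 share a bit), which is the kind of false positive such a filter trades for speed.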