Commit Graph

57 Commits

Author SHA1 Message Date
Behdad Esfahbod c4e4f1d387 [indic-generator] Move SMVD position overrides to generator 2022-06-09 11:58:37 -06:00
Behdad Esfahbod 2963154c15 [indic-generator] Add a couple comments 2022-06-09 11:53:24 -06:00
Behdad Esfahbod 91d6f45bc9 [indic-generator] Move some position overrides to the generator 2022-06-09 11:52:56 -06:00
Behdad Esfahbod 0ec4dcb93d [indic-generator] Ouch
Not sure how this was passing tests still.
2022-06-09 11:52:25 -06:00
Behdad Esfahbod f0269e0f1b [indic-generator] Move Ra handling to the generator 2022-06-09 11:52:03 -06:00
Behdad Esfahbod 419d2146c2 [indic-generator] Cap off what categories have positions
This was left off of the commit moving Indic categories to the generator.
It didn't fail any tests, but adding it back because it has implications
possibly.
2022-06-09 11:51:38 -06:00
Behdad Esfahbod e1d965d527 [indic-generator] Move position mapping to generator 2022-06-09 11:51:15 -06:00
Behdad Esfahbod 4907514026 [indic-generator] Move category overrides to generator 2022-06-09 11:50:30 -06:00
Behdad Esfahbod 58eeb3a180 [indic-generator] Move category mapping to generator 2022-06-09 11:49:57 -06:00
Behdad Esfahbod 5bfb0b721c Rename s/shape-complex/shaper/g 2022-06-03 10:30:34 +01:00
Behdad Esfahbod 676d1e6adf [indic] Spell out INDIC_TABLE_ELEMENT_TYPE 2021-02-01 11:30:39 -08:00
Ebrahim Byagowi 6937092a66 [py] apply lgtm.com python suggestions 2020-07-13 23:37:52 +04:30
Ebrahim Byagowi 82c6ddb986 [py] remove not needed imports 2020-07-03 15:51:13 +04:30
Ebrahim Byagowi ad87155fd0 minor, use py3's open(encoding=) 2020-05-29 00:11:19 +04:30
Ebrahim Byagowi 7554f618ec minor, use sys.exit print shorthand 2020-05-28 23:34:37 +04:30
Ebrahim Byagowi 08f1d95a50 minor, move scripts manuals to __doc__ 2020-05-28 15:13:12 +04:30
David Corbett fd748fac41 Update to Unicode 13.0.0 2020-04-29 17:17:03 -04:00
Ebrahim Byagowi 8d19907704 Remove python2 support from tests/utils scripts 2020-02-19 16:17:45 +03:30
Ebrahim Byagowi 6a390df8af [tools] Print unicode links on gen-* tools output
As Behdad's review
2020-02-10 17:20:09 +03:30
Evgeniy Reizner 4dc87365d7 Add links to files used by python scripts.
Closes #2150
2020-02-09 20:52:49 +03:30
Adrian Wong b66076812d Adjustments to the generated Indic table output (#1936)
* Add empty parentheses after print call

* Minor: newlines. Move #pragma pop down one; #endif up one

* Adjust #define ISC/IMC output

* Regenerate Indic table
2019-08-28 04:31:27 -07:00
Behdad Esfahbod 7aad53657e [config] Add HB_NO_OT_SHAPE / HB_NO_OT
Part of https://github.com/harfbuzz/harfbuzz/issues/1652
2019-06-26 13:21:03 -07:00
David Corbett 8c42f03215 Remove obsolete overrides from Indic/USE scripts 2019-03-11 16:07:52 -07:00
Behdad Esfahbod 8874eef8ff Add pragma GCC diagnostic ignored "-Wunused-macros" 2019-01-17 15:04:44 -05:00
Behdad Esfahbod c77ae40852 Rename hb-*private.hh to hb-*.hh
Sorry for the noise, downstream custom builders.  Please adjust.
2018-08-25 22:36:36 -07:00
Ebrahim Byagowi 80395f14e8 Make gen-* scripts LC_ALL=C compatible (#942) 2018-03-29 22:00:41 +04:30
Ebrahim Byagowi 26e0cbd834 Actual py3 compatibility making on gen-* scripts (#941) 2018-03-29 21:22:47 +04:30
Ebrahim Byagowi cab2c2c08c Make more gen-* scripts py3 compatible (#940) 2018-03-29 12:48:47 +04:30
Behdad Esfahbod 308f419215 [use] Fix Brahmi Number Joiner 1107F
Fixes https://github.com/harfbuzz/harfbuzz/pull/660
2018-01-03 14:22:07 +00:00
Behdad Esfahbod 216b003c91 [use] Fix shaping of U+AA29 CHAM VOWEL SIGN AA
Part of https://github.com/behdad/harfbuzz/issues/376
Also see https://github.com/roozbehp/unicode-data/issues/6

Test added, using NotoSansCham built from Noto Phase III sources.
2017-07-14 16:38:51 +01:00
Behdad Esfahbod 30e6e29f0f [indic/use] Move Javanese from Indic shaper to USE
Fixes https://github.com/behdad/harfbuzz/issues/243

With javatext.ttf, the reordering medial Ra gets its advance width
zeroed in the Uniscribe implementation, and the font adds the advance
back.  Our Indic shaper does not do that, but USE does.  So, route
Javanese through USE.  That's what Microsoft does anyway.  Test:

  U+A9A5,U+A9BA

This also seems to fix the following sequence, and variations thereof:

  U+A99F,U+A9C0,U+A9A2,U+A9BF
2016-05-06 15:52:27 +01:00
Behdad Esfahbod 01a30a6aa9 [indic] Remove data for scripts that don't go through this shaper 2016-05-06 12:10:07 +01:00
Behdad Esfahbod f718fe370e Minor 2016-05-06 12:10:00 +01:00
Behdad Esfahbod 2813e3049a [indic] Update data tables to Unicode 8.0
Test stats remain unchanged, except for Malayalam, which we investigate:

BENGALI: 353725 out of 354188 tests passed. 463 failed (0.130722%)
DEVANAGARI: 707307 out of 707394 tests passed. 87 failed (0.0122987%)
GUJARATI: 366349 out of 366457 tests passed. 108 failed (0.0294714%)
GURMUKHI: 60732 out of 60747 tests passed. 15 failed (0.0246926%)
KANNADA: 951190 out of 951913 tests passed. 723 failed (0.0759523%)
KHMER: 299070 out of 299124 tests passed. 54 failed (0.0180527%)
MALAYALAM: 1047584 out of 1048334 tests passed. 750 failed (0.0715421%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271662 out of 271847 tests passed. 185 failed (0.068053%)
TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%)
TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)

Myanmar, compared to Windows 10 mmrtext.ttf:

MYANMAR: 1123865 out of 1123883 tests passed. 18 failed (0.00160159%)
2015-12-18 11:05:11 +00:00
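
The failure percentages quoted in the Unicode 8.0 commit above can be rechecked from the raw counts. This is a minimal sketch that assumes only the line format shown in that message; the names `LINE_RE` and `failure_percent` are hypothetical and are not part of HarfBuzz's test tooling.

```python
import re

# Matches lines such as
# "BENGALI: 353725 out of 354188 tests passed. 463 failed (0.130722%)".
# The pattern is an assumption based on the lines quoted in the commit message.
LINE_RE = re.compile(r"^(\w+): (\d+) out of (\d+) tests passed\. (\d+) failed")


def failure_percent(line):
    """Return (script, failure percentage) for one test-summary line."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError("unrecognized summary line: %r" % line)
    script = match.group(1)
    passed, total, failed = (int(match.group(i)) for i in (2, 3, 4))
    # Sanity check: passed and failed counts must add up to the total.
    if passed + failed != total:
        raise ValueError("counts do not add up for %s" % script)
    return script, 100.0 * failed / total


if __name__ == "__main__":
    sample = "BENGALI: 353725 out of 354188 tests passed. 463 failed (0.130722%)"
    script, percent = failure_percent(sample)
    # %g prints six significant digits, the precision used in the log above.
    print("%s: %g%%" % (script, percent))
```

Running each quoted line through `failure_percent` and comparing against the percentage in parentheses is a quick way to spot a transcription error in the counts.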
Behdad Esfahbod 1aaa7d6799 [indic] Fix out-of-bounds access 2015-01-17 20:16:56 -08:00
Behdad Esfahbod c09a607a84 Use hb_in_range() for arabic and indic tables
Though, looks like gcc was smart enough to produce the same code
before...
2014-07-11 16:22:13 -04:00
Behdad Esfahbod d743ce78e1 [indic-table] Update to Unicode 7.0 data
Touch code just enough to preserve previous syllable structure
and functionality as closely as possible.  Many further cleanups
coming later.
2014-06-30 15:24:45 -04:00
Behdad Esfahbod 5fa21b3ab7 [indic-table] Fix category frequency counts in comments 2014-06-30 14:30:54 -04:00
Behdad Esfahbod 89e4946929 Add new IndicSyllabicCategory short forms for Unicode 7.0 2014-06-22 11:32:13 -06:00
Behdad Esfahbod dcee838e89 Minor 2014-06-22 11:29:59 -06:00
Behdad Esfahbod f2ad86e605 [indic-table-gen] Minor 2014-06-21 15:31:10 -06:00
Behdad Esfahbod a133e6067a [indic-table] Minor 2014-06-20 18:01:34 -04:00
Behdad Esfahbod c2e1134046 [indic-table] Make output stable 2014-06-20 17:57:03 -04:00
Behdad Esfahbod 55abfbd2ac [indic-table] Minor
No output change.
2014-06-20 16:47:43 -04:00
Behdad Esfahbod 171f970e4f [indic-table] Black-list Thai, Lao, and Tibetan
We don't need Indic table for those.
2014-06-20 15:30:29 -04:00
Behdad Esfahbod 65ac2dae4f [indic-table] Speed up lookup 2014-06-20 15:29:38 -04:00
Behdad Esfahbod 64442a3f4c [indic-table] Fix compiler warning 2014-06-20 15:29:21 -04:00
Behdad Esfahbod 0436e1d505 [indic-table] Make table more compact by not covering full blocks
-#define indic_offset_total 4416
+#define indic_offset_total 3816

-}; /* Table occupancy: 60% */
+}; /* Table occupancy: 69% */
2014-06-20 15:28:38 -04:00
Behdad Esfahbod 190a251479 [indic-table] Remove block range from data table
No functional change.
2014-06-20 14:42:03 -04:00
Behdad Esfahbod 3a83d33ec0 Add South-East Asian shaper
Handles Tai Tham, Cham, and New Tai Lue for now.
2013-02-12 12:14:10 -05:00