Commit Graph

137 Commits

Author SHA1 Message Date
Behdad Esfahbod d3d0cbd278 Typo 2018-10-03 15:31:33 +02:00
Behdad Esfahbod d992982d23 [indic] Add some confusable sequences from Unicode Standard 2018-10-02 17:31:11 +02:00
Behdad Esfahbod 2d6edc9008 [test] Add Khmer test texts from recent bugs 2018-07-31 13:56:55 -07:00
Behdad Esfahbod df26a32c8f [test] Move things around for shaper updates 2018-07-31 13:55:53 -07:00
Behdad Esfahbod 44c65eee28 [test] Reorganize test suite
In anticipation of importing more test suites.
2018-01-10 02:50:49 +01:00
Behdad Esfahbod 30e6e29f0f [indic/use] Move Javanese from Indic shaper to USE
Fixes https://github.com/behdad/harfbuzz/issues/243

With javatext.ttf, the reodering medial Ra gets its advance width
zero'ed in Uniscribe implementation, and the font adds the advance
back.  Our Indic shaper does not do that, but USE does.  So, route
Javanese through USE.  That's what Microsoft does anyway.  Test:

  U+A9A5,U+A9BA

This also seems to fix the following sequence, and variations thereof:

  U+A99F,U+A9C0,U+A9A2,U+A9BF
2016-05-06 15:52:27 +01:00
Behdad Esfahbod c6ee5f5f06 Add Javanese sample text 2016-05-06 15:39:02 +01:00
Behdad Esfahbod f68390f196 [test] Add text for Tibetan shorthand contractions
From http://www.babelstone.co.uk/Tibetan/Contractions.html
2016-04-27 02:44:35 -07:00
Behdad Esfahbod f8ee7906d0 Remove MANIFEST files
They are unused currently.  We can add later if we hook them up
to anything useful.
2016-02-23 13:45:38 +09:00
Behdad Esfahbod 99d3495576 [test] Add test text for Kaithi 2016-01-06 12:21:54 +00:00
Behdad Esfahbod 59821ab8b4 [arabic] Don't stretch over cased letters
Addresses
6e6f82b6f3 (commitcomment-14248516)
2015-11-06 16:27:44 -08:00
Behdad Esfahbod 6e6f82b6f3 Implement SYRIAC ABBREVIATION MARK with 'stch' feature
The feature is enabled for any character in the Arabic shaper.
We should experiment with using it for Arabic subtending marks.
Though, that has a directionality problem as well, since those
are used with digits...

Fixes https://github.com/behdad/harfbuzz/issues/141
2015-11-05 17:46:34 -08:00
Behdad Esfahbod 786ba45847 [test] Encode Kharoshti text
Ouch!
2015-07-23 13:04:34 +01:00
Behdad Esfahbod b423125503 [test] Add Batak and Buginese test texts 2015-07-23 13:01:55 +01:00
Behdad Esfahbod b8c159ffcc [test] Remove shaper-sea texts under shaper-use 2015-07-23 13:00:44 +01:00
Behdad Esfahbod 67ba7320cc [test] Remove New Tai Lue texts
New Tai Lue changed encoding to visual, boring, model.
2015-07-23 12:58:21 +01:00
Behdad Esfahbod 14b12f92a9 [USE] Add Kharoshti test data from Unicode proposal 2015-07-20 11:57:44 +01:00
Behdad Esfahbod fc0daafab0 [indic] Handle old-spec Malayalam reordering with final Halant
See comment.

Micro-tests added.
2014-07-23 16:53:03 -04:00
Behdad Esfahbod ed29b15f5d [test] Add more Mongolian variation selector tests
From
https://code.google.com/p/chromium/issues/detail?id=393896
2014-07-18 14:37:49 -04:00
Behdad Esfahbod 7977ca17aa [indic] Allow decimal and Brahmi digits as placeholders
Tests: U+0967,0951 U+0031,093F
2014-05-29 15:34:26 -04:00
Behdad Esfahbod e8b5d64039 [indic] Do NOT allow reph formation on placeholders
Only allow it on DOTTED CIRCLE.  No effect on test numbers.

Test: U+0930,094D,00A0
2014-05-29 15:20:15 -04:00
Behdad Esfahbod 0a017ce169 Add tests for Myanmar Asat+MedialYa and MedialYa+Asat sequences
One of them currently produces dotted-circle.  Fix and detailed
message coming.
2014-05-14 16:44:16 -06:00
Behdad Esfahbod 659cd3c5b4 [test] Add test case for Tibetan sign PADMA
Currently fails.
2014-04-28 12:44:14 -07:00
Behdad Esfahbod ee703bc3ef Reshuffle test data 2014-04-28 12:44:14 -07:00
Behdad Esfahbod 897c7b804d Add Khmer test for U+17DD 2014-04-10 16:27:13 -07:00
Behdad Esfahbod 2a473338da Add Myanmar test case from OpenType Myanmar spec 2014-03-10 15:04:46 -07:00
Behdad Esfahbod 1589859089 Minor 2014-03-10 14:57:55 -07:00
Behdad Esfahbod 9af91ca8ff Add more Myanmar test cases
All three are broken right now according to Roozbeh.

https://bugs.freedesktop.org/show_bug.cgi?id=71947
https://bugs.freedesktop.org/show_bug.cgi?id=71948
https://bugs.freedesktop.org/show_bug.cgi?id=71949
2013-11-25 17:47:19 -05:00
Behdad Esfahbod 5c558877da [indic] Allow up to two syllable modifiers
Bug 70509 - Candrabindu+Visarga doesn't work in Devanagari
https://bugs.freedesktop.org/show_bug.cgi?id=70509

We categorize both bindus and visarga as syllable-modifiers.
OT spec doesn't actually say what characters go in the syllable
modifier category, and allows one.  We just allow up to two now.

Test case: U+0930,U+0941,U+0901,U+0903

Uniscribe currently doesn't support that and produces a
dotted circle.
2013-10-16 11:18:09 +02:00
Behdad Esfahbod 65a929b1c0 [indic] If Malayalam dot-reph formed a ligature, don't move it
Rachana-0.6 implements dot-reph by ligation, so we shouldn't move it.
Uniscribe doesn't either.  Test case:

  U+0D4E,U+0D1A,U+0D4D,U+0D1A,U+0D4D
2013-10-15 18:21:32 +02:00
Behdad Esfahbod 30145272a7 [indic] Don't apply presentation features across syllables
More like Uniscribe...  We still allow user-defined features to
work across syllables, but not pres,blws,abs,psts,etc.

This "regressed" Sinhala numbers by 11.  These are cases were
there's Consonant followed by Ra,Halant,ZWJ at the of text.
The Ra,Halant,ZWJ ends up forming reph, which is wrong...
But before we were also ligating that reph with the previous
consonant.  That's even more wrong.  That's also what Uniscribe
does.

Current numbers:

BENGALI: 353732 out of 354188 tests passed. 456 failed (0.128745%)
DEVANAGARI: 707307 out of 707394 tests passed. 87 failed (0.0122987%)
GUJARATI: 366349 out of 366457 tests passed. 108 failed (0.0294714%)
GURMUKHI: 60732 out of 60747 tests passed. 15 failed (0.0246926%)
KANNADA: 951030 out of 951913 tests passed. 883 failed (0.0927606%)
KHMER: 299070 out of 299124 tests passed. 54 failed (0.0180527%)
MALAYALAM: 1048140 out of 1048334 tests passed. 194 failed (0.0185056%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271655 out of 271847 tests passed. 192 failed (0.070628%)
TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%)
TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)
2013-10-15 18:20:59 +02:00
Behdad Esfahbod 3c7b3641cf [indic] Handle Avagraha
It can come either at the end(ish!) of the syllable, or independently.
When independent, it accepts a few bits and pieces.
2013-10-15 13:14:31 +02:00
Behdad Esfahbod cc50bf5b13 Remove Hangul filler characters from Default_Ignorable chars
See discussion on mailing list.
2013-03-19 07:00:41 -04:00
Behdad Esfahbod a8cf7b43fa [Indic] Futher adjust ZWJ handling in Indic-like shapers
After the Ngapi hackfest work, we were assuming that fonts
won't use presentation features to choose specific forms
(eg. conjuncts).  As such, we were using auto-joiner behavior
for such features.  It proved to be troublesome as many fonts
used presentation forms ('pres') for example to form conjuncts,
which need to be disabled when a ZWJ is inserted.

Two examples:

	U+0D2F,U+200D,U+0D4D,U+0D2F with kartika.ttf
	U+0995,U+09CD,U+200D,U+09B7 with vrinda.ttf

What we do now is to never do magic to ZWJ during GSUB's main input
match for Indic-style shapers.  Note that backtrack/lookahead are still
matched liberally, as is GPOS.  This seems to be an acceptable
compromise.

As to the bug that initially started this work, that one needs to
be fixed differently:

  Bug 58714 - Kannada u+0cb0 u+200d u+0ccd u+0c95 u+0cbe does not
  provide same results as Windows8
  https://bugs.freedesktop.org/show_bug.cgi?id=58714

New numbers:

BENGALI: 353689 out of 354188 tests passed. 499 failed (0.140886%)
DEVANAGARI: 707305 out of 707394 tests passed. 89 failed (0.0125814%)
GUJARATI: 366349 out of 366457 tests passed. 108 failed (0.0294714%)
GURMUKHI: 60706 out of 60747 tests passed. 41 failed (0.067493%)
KANNADA: 951030 out of 951913 tests passed. 883 failed (0.0927606%)
KHMER: 299070 out of 299124 tests passed. 54 failed (0.0180527%)
LAO: 53611 out of 53644 tests passed. 33 failed (0.0615167%)
MALAYALAM: 1048102 out of 1048334 tests passed. 232 failed (0.0221304%)
ORIYA: 42320 out of 42329 tests passed. 9 failed (0.021262%)
SINHALA: 271666 out of 271847 tests passed. 181 failed (0.0665816%)
TAMIL: 1091753 out of 1091754 tests passed. 1 failed (9.15957e-05%)
TELUGU: 970555 out of 970573 tests passed. 18 failed (0.00185457%)
TIBETAN: 208469 out of 208469 tests passed. 0 failed (0%)
2013-03-19 06:22:06 -04:00
Behdad Esfahbod 6d69a2cec1 [tests] Add Malayalam tests frim cibu 2013-02-26 21:11:05 -05:00
Behdad Esfahbod e0486fc1af [tests] Add Myanmar torture tests from Martin Hosken 2013-02-19 00:58:10 -05:00
Behdad Esfahbod a3df9a7bf8 Minor
Moving files around
2013-02-19 00:50:46 -05:00
Behdad Esfahbod b1f4407591 [SEA] Fix order of pre-base reordering Ra and left matras
The code was confused because it was expecting left matra to have
POS_PRE_M, like we do in the Myanmar shaper, but that is not what
we were doing in this shaper.  Rewrite to rely on category only.

Test case: U+AA06,U+AA34,U+AA2F
2013-02-17 12:12:37 -05:00
Behdad Esfahbod 05ac87813d [tests] Add Syriac Alaph shaping test cases 2013-02-15 09:26:41 -05:00
Behdad Esfahbod 126f39cd16 Add more dot-reph tests 2013-02-13 08:29:21 -05:00
Behdad Esfahbod f22b7e7778 [Indic] Track base position when reordering things
Ouch, how did things ever work without this?!  The added test that has a
dot-reph as well as a pre-base reordering Ra perfectly demonstrates the
bug (tested with Nirmala font from Win8 for example).  Testing suggests
that Win8 shaper has the *exact* same bug / behavior that we used to
have.  Odd.
2013-02-13 07:32:46 -05:00
Behdad Esfahbod cc5f24cde0 [tests] Add tests for Devanagary Eyelash Ra
Currently broken with Sanskrit 2003 font.
2013-02-12 18:17:12 -05:00
Behdad Esfahbod f60793e854 [tests] Add Cham sample 2013-02-12 15:45:59 -05:00
Behdad Esfahbod 3a83d33ec0 Add South-East Asian shaper
Handles Tai Tham, Cham, and New Tai Lue for now.
2013-02-12 12:14:10 -05:00
Behdad Esfahbod fb96021206 Minor test reshufflings 2013-02-12 10:33:58 -05:00
Behdad Esfahbod 5676d5d527 [Indic] Make sure New Tai Lue works! 2013-02-12 10:31:14 -05:00
Behdad Esfahbod bed687f886 Shuffle test data around 2013-02-11 14:24:03 -05:00
Behdad Esfahbod ecd454b3cd [Indic] In old-spec shaping, don't move viramas around if seq ends with one
For example: u0c9a u0ccd u0c9a u0ccd with Lohit.  See:

https://bugs.freedesktop.org/show_bug.cgi?id=59118
2013-01-08 18:09:46 -06:00
Behdad Esfahbod 8b217f5ac5 [Indic] Reorder Malayalam dot-reph to after base
Test sequence is simple: U+0D4E,U+0D15.  The doth-reph should be
reordered to after the Ka.

https://bugzilla.redhat.com/show_bug.cgi?id=799565
2012-12-21 15:49:26 -05:00
Behdad Esfahbod 624933f676 Add Persian test cases from Mehran Mehr 2012-11-30 11:46:35 +02:00