If pre-base reordering Ra is NOT formed (or formed and then
broken up), we should consider that Ra as base. This is
observable when there's a left matra or dotreph that positions
before base.
Now, it might be that we shouldn't do this if the Ra happend
to form a below form. We can't quite deduce that right now...
Micro test added. Also at:
https://code.google.com/a/google.com/p/noto-alpha/issues/detail?id=186#c29
Sometimes font designers form half/pref/etc consonant forms
unconditionally and then undo that conditionally. Try to
recover the OT_H classification in those cases.
No test number changes expected.
Normally if you want to, say, conditionally prevent a 'pref', you
would use blocking contextual matching. Some designers instead
form the 'pref' form, then undo it in context. To detect that
we now also remember glyphs that went through MultipleSubst.
In the only place that this is used, Uniscribe seems to only care
about the "last" transformation between Ligature and Multiple
substitions. Ie. if you ligate, expand, and ligate again, it
moves the pref, but if you ligate and expand it doesn't. That's
why we clear the MULTIPLIED bit when setting LIGATED.
Micro-test added. Test: U+0D2F,0D4D,0D30 with font from:
[1]
https://code.google.com/a/google.com/p/noto-alpha/issues/detail?id=186#c29
Roboto was hitting this. FreeType also has pretty much the
same code for this, in ttcmap.c:tt_cmap4_validate():
/* in certain fonts, the `length' field is invalid and goes */
/* out of bound. We try to correct this here... */
if ( table + length > valid->limit )
{
if ( valid->level >= FT_VALIDATE_TIGHT )
FT_INVALID_TOO_SHORT;
length = (FT_UInt)( valid->limit - table );
}
Sinhala and Telugu use "explicit" reph. That is, the reph is formed by
a Ra,H,ZWJ sequence. Previously, upon detecting this sequence, we were
checking checking whether the 'rphf' feature applies to the first two
glyphs of the sequence. This is how the Microsoft fonts are designed.
However, testing with Noto shows that apparently Uniscribe also forms
the reph if the lookup ligates all three glyphs. So, try both
sequences.
Doesn't affect test results for Sinhala or Telugu.
https://code.google.com/a/google.com/p/noto-alpha/issues/detail?id=232
The grammar in the OT spec, and the existing Windows implementation
seem to be confused around where to allow Asat around the medial
consonants.
The previous grammar for medial group was allowing an Asat after
the medial group only if there was a medial Wa or Ha, but not if
there was only a medial Ya. This doesn't make sense to me and
sounds reversed, as both medial Wa and Ha are below marks while
Asat is an above mark. An Asat can come before the medial group
already (in fact, multiple ones can. Why?!). The medial Ya
however is a spacing mark and according to Roozbeh it's valid
to want an Asat on the medial Ya instead of the base, so it looks
to me like we want to allow an Asat after the medial group if
there *was* a Ya but not if there wasn't any. Not wanting to
produce dotted-circle where Windows is not, this commit changes
the grammar to allow one Asat after the medial group no matter
what comes in the group.
Test: U+1002,103A,103B vs U+1002,103B,103A