See thread "Issue with cursive attachment" started by Khaled.
Turned out fixing this wasn't as bad as I had assumed. I like the
new code better; we now have a theoretical model of cursive
connections that is easier to reason about.
This makes a lot of code safer. We only try modifying the object in one
place, after making sure it's safe to do so. So, do a const_cast<> in
that one place...
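
A minimal sketch of the "cast in one place" idea, not the actual code from
this change. The names Object, get_writable and set_value are hypothetical.
It assumes the object was not originally declared const, since writing
through const_cast<> to a truly const object is undefined behavior.

```cpp
// Hypothetical object that can be frozen (made immutable) at some point.
struct Object {
  bool immutable = false;
  int value = 0;
};

// The single place where constness is cast away, guarded by a safety check.
// Returns nullptr when the object must not be modified.
// Assumption: callers only pass objects that were created non-const.
static Object *get_writable(const Object *obj) {
  if (!obj || obj->immutable) return nullptr;
  return const_cast<Object *>(obj);
}

// Every mutator goes through get_writable() instead of casting on its own.
static bool set_value(const Object *obj, int v) {
  Object *writable = get_writable(obj);
  if (!writable) return false;
  writable->value = v;
  return true;
}
```
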
Currently:
- Initializing skippy is very expensive.
- Our lookup accelerator (using set-digests) can be very ineffective.
As such, we often end up initializing skippy and then failing the
coverage check. Reordering fixes that.
When, later, we fix our accelerator to have a truly small false-positive
rate (for example by using frozen-sets), we might want to reorder these
checks so that we don't calculate the coverage number if skippy is going
to fail.
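
The ordering can be sketched roughly as follows. This is an illustration
only: Digest, Coverage, Skippy and apply_lookup are hypothetical stand-ins,
and the 64-bit digest is a deliberately crude filter chosen so that false
positives are easy to produce.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

static const int NOT_COVERED = -1;

// Exact (but slower) membership test over a sorted glyph list.
struct Coverage {
  std::vector<uint32_t> glyphs;  // sorted glyph ids
  int index_of(uint32_t g) const {
    auto it = std::lower_bound(glyphs.begin(), glyphs.end(), g);
    return (it != glyphs.end() && *it == g) ? int(it - glyphs.begin())
                                            : NOT_COVERED;
  }
};

// Cheap filter: no false negatives, but many false positives.
struct Digest {
  uint64_t bits = 0;
  void add(uint32_t g) { bits |= uint64_t(1) << (g & 63); }
  bool may_have(uint32_t g) const { return (bits >> (g & 63)) & 1; }
};

// Stand-in for the expensive iterator; counts how often it is set up.
static int skippy_init_count = 0;
struct Skippy {
  Skippy() { ++skippy_init_count; }
};

static bool apply_lookup(const Digest &digest, const Coverage &coverage,
                         uint32_t glyph) {
  if (!digest.may_have(glyph)) return false;                  // cheapest check
  if (coverage.index_of(glyph) == NOT_COVERED) return false;  // exact check
  Skippy skippy;  // expensive setup happens only after both checks pass
  (void) skippy;
  return true;
}
```

With glyphs 10 and 20 covered, glyph 74 passes the digest (74 & 63 == 10)
but fails coverage, so the expensive setup is never reached for it.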
This shows a 5% speedup with Roboto already.
Roboto has glyphs (like 'F') that have 200 kerning pairs.
Add a handcoded bsearch instead of previous linear search.
This doesn't show much speedup though; apparently we spend the
bulk of the time somewhere before here.
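
For illustration, a hand-coded binary search over one glyph's pair records
might look like the sketch below. PairRecord and find_kerning are
hypothetical names, and the layout (records sorted by second glyph id) is an
assumption rather than the real table structure.

```cpp
#include <cstdint>

// Hypothetical record: kerning value for one (first, second) pair, stored
// in the list that belongs to the first glyph.
struct PairRecord {
  uint16_t second_glyph;
  int16_t value;
};

// Binary search over records sorted by second_glyph.
// Returns the kerning value, or 0 when the pair is not present.
static int find_kerning(const PairRecord *records, unsigned count,
                        uint16_t second_glyph) {
  int lo = 0, hi = int(count) - 1;
  while (lo <= hi) {
    int mid = lo + (hi - lo) / 2;
    uint16_t g = records[mid].second_glyph;
    if (second_glyph < g)
      hi = mid - 1;
    else if (second_glyph > g)
      lo = mid + 1;
    else
      return records[mid].value;
  }
  return 0;
}
```

For a glyph with 200 pairs this takes at most 8 probes, versus up to 200
comparisons for a linear scan.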
Before we were just relying on the compiler inlining them and not
leaving a trace in our public API. Try to fix. Hopefully not
breaking anyone's build.
When matching lookups, be smart about default-ignorable characters.
In particular:
Do nothing specific about ZWNJ, but for the other default-ignorables:
If the lookup in question uses the ignorable character in a sequence,
then match it as we used to do. However, if the sequence match will
fail because the default-ignorable blocked it, try skipping the
ignorable character and continue.
The most immediate thing it means is that if Lam-Alef forms a ligature,
then Lam-ZWJ-Alef will do so too. Finally!
One exception: when matching for GPOS, or for backtrack/lookahead of
GSUB, we ignore ZWNJ too. That's the right thing to do.
It is certainly possible to build fonts for which this feature will
result in undesirable glyphs, but it's hard to think of a real-world
case where that would happen.
This *does* break Indic shaping right now, since Indic Unicode has
specific rules for what ZWJ/ZWNJ mean, and skipping ZWJ is breaking
those rules. That will be fixed in upcoming commits.
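
The matching rule for default-ignorables described above can be sketched as
follows. This is a simplified illustration, not the shaper's actual matching
code: Item, MatchKind and match_sequence are hypothetical, glyph ids are
compared directly, and lookup flags and Indic-specific rules are left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical buffer item: a glyph id plus two flags from its character.
struct Item {
  uint32_t glyph;
  bool default_ignorable;  // true for ZWJ, ZWNJ, etc.
  bool zwnj;               // true only for ZWNJ
};

enum class MatchKind {
  GsubInput,          // input sequence of a GSUB lookup
  GsubContextOrGpos,  // GSUB backtrack/lookahead, or any GPOS matching
};

// Tries to match `seq` starting at buf[start]. On success stores the index
// one past the last matched item in *end and returns true.
static bool match_sequence(const std::vector<Item> &buf, size_t start,
                           const std::vector<uint32_t> &seq, MatchKind kind,
                           size_t *end) {
  size_t pos = start;
  for (size_t i = 0; i < seq.size(); ++i) {
    for (;;) {
      if (pos >= buf.size()) return false;
      const Item &item = buf[pos];
      // If the lookup itself names this glyph (ignorable or not), match it.
      if (item.glyph == seq[i]) { ++pos; break; }
      // Otherwise skip a blocking default-ignorable and retry. ZWNJ is only
      // skipped for GPOS and GSUB context; the first component never skips.
      bool may_skip = i > 0 && item.default_ignorable &&
                      (!item.zwnj || kind == MatchKind::GsubContextOrGpos);
      if (!may_skip) return false;
      ++pos;
    }
  }
  *end = pos;
  return true;
}
```

Under these assumptions, a Lam-Alef ligature rule also matches Lam-ZWJ-Alef,
while Lam-ZWNJ-Alef still fails to match as GSUB input.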