Protect cache against future expansions of FcLangSet (adding new
orth files). Previously, doing so could change the size of
that struct. Indeed, that happened between 2.6.0 and 2.7.3, causing
crashes. Unfortunately, sizeof(FcLangSet) was not checked in fcarch.c.
This changes FcLangSet code to be able to cope with struct size changes.
And change cache format, hence bumping from 2 to 3.
Before a NULL config was passed down adn essentially FcFileScan was
equivalent to FcFreeTypeQuery. Now fc-scan tool correctly applies
the configuration to the scanned patterns.
The East Asian double-byte codepages have characters with backslash as
the second byte, so we must use _mbsrchr() instead of strrchr() when
looking at pathnames in the system codepage.
Must not call FcStrFree() on a value returned by
FcStrBufDoneStatic(). In the Windows code don't bother with dynamic
allocation, just use a local buffer.
Fontconfig assigns an index number to each language it knows about.
The index is used to index a bit in FcLangSet language map. The bit
map is stored in the cache.
Previously fc-lang simply sorted the list of languages and assigned
them an index starting from zero. Net effect is that whenever new
orth files were added, all the FcLangSet info in the cache files would
become invalid. This was causing weird bugs like this one:
https://bugzilla.redhat.com/show_bug.cgi?id=490888
With this commit we fix the index assigned to each language. The index
will be based on the order the orth files are passed to fc-lang. As a
result all orth files are explicitly listed in Makefile.am now, and
new additions should be made to the end of the list. The list is made
to reflect the sorted list of orthographies from 2.6.0 released followed
by new additions since.
This fixes the stability problem. Needless to say, recreating caches
is necessary before any new orthography is recognized in existing fonts,
but at least the existing caches are still valid and don't cause bugs
like the above.
The format '%{[]family,familylang{expr}}' expands expr once for the first
value of family and familylang, then for the second, etc, until both lists
are exhausted.
The '%{=unparse}' format expands to the FcNameUnparse() result on the
pattern. Need to add '%{=verbose}' for FcPatternPrint() output but
need to change that function to output to a string first.
Also added the '%{=fclist}' and '%{=fcmatch}' which format like the
default format of fc-list and fc-match respectively.
The format '%{family|delete( )}' expands to family values with space removed.
The format '%{family|translate( ,-)}' expands to family values with space
replaced by dash. Multiple chars are supported, like tr(1).
The format '%{family|escape(\\ )}' expands to family values with space
escaped using backslash.
The format '%{family|downcase}' for example prints the lowercase of
the family element. Three converters are defined right now:
'downcase', 'basename', and 'dirname'.
The conditional '%{?elt1,elt2,!elt3{expr1}{expr2}}' will evaluate
expr1 if elt1 and elt2 exist in pattern and elt3 doesn't exist, and
expr2 otherwise. The '{expr2}' part is optional.
The filtering, '%{+elt1,elt2,elt3{subexpr}}' will evaluate subexpr
with a pattern only having the listed elements from the surrounding
pattern.
The deletion, '%{-elt1,elt2,elt3{subexpr}}' will evaluate subexpr
with a the surrounding pattern sans the listed elements.
Diego Santa Cruz pointed out that we are using that API wrongly.
The forth argument is a pointer to a pointer. Turns out we don't
need that arugment and it accepts NULL, so just pass that.
To only work on writable charsets. Also, return a bool indicating whether
the merge changed the charset.
Also changes the implementation of FcCharSetMerge and FcCharSetIsSubset
Previously an index j was added to element score to prefer matches earlier
in the value list to the later ones. This index started from 0, meaning
that the score zero could be generated for the first element. By starting
j from one, scores for when the element exists in both pattern and font
can never be zero. The score zero is reserved for when the element is
NOT available in both font and pattern. We will use this property later.
This shouldn't change matching much. The only difference I can think of
is that if a font family exists both as a bitmap font and a scalable
version, and when requesting it at the size of the bitmap version,
previously the font returned was nondeterministic. Now the scalable
version will always be preferred.
Previously the matcher multiplied comparison results by 100 and added
index value to it. With long lists of families (lots of aliases),
reaching 100 is not that hard. That could result in a non-match early
in the list to be preferred over a match late in the list. Changing
the multiplier from 100 to 1000 should fix that.
To keep things relatively in order, the lang multiplier is changed
from 1000 to 10000.
Previously fc-match "xxx,nazli" matched Nazli, but "xxx, nazli" didn't.
This was because of a bug in FcCompareFamily's short-circuit check
that forgot to ignore spaces.
I can't understand why the special case is needed. Indeed, removing it
does not make any difference in the "fc-match --verbose" output, and
that's the only time fc-match uses FcPatternPrint.
Two changes:
- after mkdir(), we immediately chmod(), such that we are not affected
by stupid umask's.
- if a directory we want to use is not writable but exists, we try a
chmod on it. This is to recover from stupid umask's having affected
us with older versions.
The current behaviour of FcSortWalk() is to create a new FcCharSet on
each iteration that is the union of the previous iteration with the next
FcCharSet in the font set. This causes the existing FcCharSet to be
reproduced in its entirety and then allocates fresh leaves for the new
FcCharSet. In essence the number of allocations is quadratic wrt the
number of fonts required.
By introducing a new method for merging a new FcCharSet with an existing
one we can change the behaviour to be effectively linear with the number
of fonts - allocating no more leaves than necessary to cover all the
fonts in the set.
For example, profiling 'gedit UTF-8-demo.txt'
Allocator nAllocs nBytes
Before:
FcCharSetFindLeafCreate 62886 2012352
FcCharSetPutLeaf 9361 11441108
After:
FcCharSetFindLeafCreate 1940 62080
FcCharSetPutLeaf 281 190336
The savings are even more significant for applications like firefox-3.0b5
which need to switch between large number of fonts.
Before:
FcCharSetFindLeafCreate 4461192 142758144
FcCharSetPutLeaf 1124536 451574172
After:
FcCharSetFindLeafCreate 80359 2571488
FcCharSetPutLeaf 18940 9720522
Out of interest, the next most frequent allocations are
FcPatternObjectAddWithBinding 526029 10520580
tt_face_load_eblc 42103 2529892
Note that this also fixes a bug with FcFontList() where previously
it was NOT checking whether the config is up-to-date. May want to
keep the old behavior and document that ScanInterval is essentially
unused internally (FcFontSetList uses it, but we can remove that
too).