The syntax to add any characters to the charset table looks like:
<match target="scan">
<test name="family">
<string>Buggy Sans</string>
</test>
<edit name="charset" mode="assign">
<plus>
<name>charset</name>
<charset>
<int>0x3220</int> <!-- PARENTHESIZED IDEOGRAPH ONE -->
</charset>
</plus>
</edit>
</match>
To remove any characters from the charset table:
<match target="scan">
<test name="family">
<string>Buggy Sans</string>
</test>
<edit name="charset" mode="assign">
<minus>
<name>charset</name>
<charset>
<int>0x06CC</int> <!-- ARABIC LETTER FARSI YEH -->
<int>0x06D2</int> <!-- ARABIC LETTER YEH BARREE -->
<int>0x06D3</int> <!-- ARABIC LETTER YEH BARREE WITH HAMZA ABOVE -->
</charset>
</minus>
</edit>
</match>
You could also use the range element for convenience:
...
<charset>
<int>0x06CC</int> <!-- ARABIC LETTER FARSI YEH -->
<range>
<int>0x06D2</int> <!-- ARABIC LETTER YEH BARREE -->
<int>0x06D3</int> <!-- ARABIC LETTER YEH BARREE WITH HAMZA ABOVE -->
</range>
</charset>
...
The OT spec says:
"When building a Unicode font for Windows, the platform ID should be 3 and the
encoding ID should be 1. When building a symbol font for Windows, the platform
ID should be 3 and the encoding ID should be 0."
We were ignoring the SYMBOL_CS entry before. It's UTF-16/UCS-2 like the
UNICODE_CS.
Also, always use UTF-16BE instead of UCS-2BE. The conversion was doing
UTF-16BE anyway.
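To illustrate why UTF-16BE is the safer label, here is a minimal sketch of a
big-endian UTF-16 decoder. For BMP values it behaves exactly like UCS-2; the
only difference is the surrogate-pair branch. The name utf16be_first and the
UTF16_INVALID sentinel are made up for this example and are not fontconfig's
own code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UTF16_INVALID 0xFFFFFFFFu   /* hypothetical error sentinel */

/* Decode the first code point of a big-endian UTF-16 byte string.
 * Returns UTF16_INVALID on truncated or malformed input. */
uint32_t utf16be_first(const unsigned char *src, size_t len)
{
    uint32_t hi, lo;

    assert(src != NULL);
    if (len < 2)
        return UTF16_INVALID;
    hi = ((uint32_t) src[0] << 8) | src[1];
    if (hi < 0xD800 || hi > 0xDFFF)
        return hi;                  /* BMP value: same result as UCS-2 */
    if (hi > 0xDBFF || len < 4)
        return UTF16_INVALID;       /* lone low surrogate or truncated pair */
    lo = ((uint32_t) src[2] << 8) | src[3];
    if (lo < 0xDC00 || lo > 0xDFFF)
        return UTF16_INVALID;       /* high surrogate without a low one */
    return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
}
```

A pure UCS-2 reader would stop at the first branch and hand back the raw
surrogate values instead of the combined code point.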
Last night in between my dreams I also noticed that we support Unicode
values up to 0x01000000 and not 0x00100000 which I thought before.
This covers the entire Unicode range.
Protect cache against future expansions of FcLangSet (adding new
orth files). Previously, doing so could change the size of
that struct. Indeed, that happened between 2.6.0 and 2.7.3, causing
crashes. Unfortunately, sizeof(FcLangSet) was not checked in fcarch.c.
This changes FcLangSet code to be able to cope with struct size changes.
And change cache format, hence bumping from 2 to 3.
Previously a NULL config was passed down, so FcFileScan was essentially
equivalent to FcFreeTypeQuery. Now the fc-scan tool correctly applies
the configuration to the scanned patterns.
The East Asian double-byte codepages have characters with backslash as
the second byte, so we must use _mbsrchr() instead of strrchr() when
looking at pathnames in the system codepage.
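A portable sketch of the problem, for readers without a Windows build at hand.
It assumes code page 932 (Shift-JIS), where lead bytes are 0x81-0x9F and
0xE0-0xFC and the trail byte may be 0x5C. The function name is hypothetical;
the real fix simply calls _mbsrchr().

```c
#include <assert.h>
#include <stddef.h>

/* Lead-byte test for code page 932 only; other double-byte code pages
 * use different ranges (assumption made for this sketch). */
static int is_cp932_lead_byte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Return the byte offset of the last backslash that is a real path
 * separator, or -1 if there is none. A plain strrchr() would also
 * accept a 0x5C that is the second byte of a double-byte character. */
long last_separator_offset_cp932(const char *path)
{
    const unsigned char *p = (const unsigned char *) path;
    long last = -1;

    assert(path != NULL);
    while (*p) {
        if (is_cp932_lead_byte(*p) && p[1] != '\0') {
            p += 2;                 /* skip lead and trail byte together */
            continue;
        }
        if (*p == '\\')
            last = (long) (p - (const unsigned char *) path);
        p++;
    }
    return last;
}
```

For the bytes "C:\fonts\" 0x95 0x5C ".ttf" the separator is at offset 8,
while a byte-wise search from the end would stop at the trail byte at
offset 10.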
Must not call FcStrFree() on a value returned by
FcStrBufDoneStatic(). In the Windows code don't bother with dynamic
allocation, just use a local buffer.
Fontconfig assigns an index number to each language it knows about.
The index is used to index a bit in FcLangSet language map. The bit
map is stored in the cache.
Previously fc-lang simply sorted the list of languages and assigned
them an index starting from zero. Net effect is that whenever new
orth files were added, all the FcLangSet info in the cache files would
become invalid. This was causing weird bugs like this one:
https://bugzilla.redhat.com/show_bug.cgi?id=490888
With this commit we fix the index assigned to each language. The index
will be based on the order the orth files are passed to fc-lang. As a
result all orth files are explicitly listed in Makefile.am now, and
new additions should be made to the end of the list. The list is made
to reflect the sorted list of orthographies from the 2.6.0 release, followed
by new additions since.
This fixes the stability problem. Needless to say, recreating caches
is necessary before any new orthography is recognized in existing fonts,
but at least the existing caches are still valid and don't cause bugs
like the above.
The format '%{[]family,familylang{expr}}' expands expr once for the first
value of family and familylang, then for the second, etc, until both lists
are exhausted.
The '%{=unparse}' format expands to the FcNameUnparse() result on the
pattern. Need to add '%{=verbose}' for FcPatternPrint() output but
need to change that function to output to a string first.
Also added the '%{=fclist}' and '%{=fcmatch}' which format like the
default format of fc-list and fc-match respectively.
The format '%{family|delete( )}' expands to family values with space removed.
The format '%{family|translate( ,-)}' expands to family values with space
replaced by dash. Multiple chars are supported, like tr(1).
The format '%{family|escape(\\ )}' expands to family values with space
escaped using backslash.
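The three character converters can be modelled with plain string helpers.
This is only a sketch of the described behaviour, not the code behind the
format parser: the function names are invented, and two details are
assumptions: a short "to" list in translate is extended by repeating its
last character, and escape uses the first character of its argument as the
escape character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* delete(set): copy src to dst, dropping every character listed in set. */
char *chars_delete(char *dst, size_t size, const char *src, const char *set)
{
    size_t n = 0;

    assert(dst != NULL && size > 0);
    for (; *src && n + 1 < size; src++)
        if (!strchr(set, *src))
            dst[n++] = *src;
    dst[n] = '\0';
    return dst;
}

/* translate(from,to): replace from[i] with to[i]; a short "to" is
 * extended with its last character (assumption, as in GNU tr). */
char *chars_translate(char *dst, size_t size, const char *src,
                      const char *from, const char *to)
{
    size_t n = 0, to_len = strlen(to);

    assert(dst != NULL && size > 0 && to_len > 0);
    for (; *src && n + 1 < size; src++) {
        const char *hit = strchr(from, *src);
        if (hit) {
            size_t i = (size_t) (hit - from);
            dst[n++] = to[i < to_len ? i : to_len - 1];
        } else {
            dst[n++] = *src;
        }
    }
    dst[n] = '\0';
    return dst;
}

/* escape(chars): prefix every character listed in chars with chars[0]. */
char *chars_escape(char *dst, size_t size, const char *src, const char *chars)
{
    size_t n = 0;

    assert(dst != NULL && size > 0 && chars[0] != '\0');
    for (; *src && n + 2 < size; src++) {
        if (strchr(chars, *src))
            dst[n++] = chars[0];
        dst[n++] = *src;
    }
    dst[n] = '\0';
    return dst;
}
```

With the family "DejaVu Sans Mono", these mirror '%{family|delete( )}',
'%{family|translate( ,-)}' and '%{family|escape(\\ )}' respectively.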
The format '%{family|downcase}' for example prints the lowercase of
the family element. Three converters are defined right now:
'downcase', 'basename', and 'dirname'.
The conditional '%{?elt1,elt2,!elt3{expr1}{expr2}}' will evaluate
expr1 if elt1 and elt2 exist in pattern and elt3 doesn't exist, and
expr2 otherwise. The '{expr2}' part is optional.
The filtering, '%{+elt1,elt2,elt3{subexpr}}' will evaluate subexpr
with a pattern only having the listed elements from the surrounding
pattern.
The deletion, '%{-elt1,elt2,elt3{subexpr}}' will evaluate subexpr
with the surrounding pattern sans the listed elements.
Diego Santa Cruz pointed out that we are using that API wrongly.
The fourth argument is a pointer to a pointer. Turns out we don't
need that argument and it accepts NULL, so just pass that.
To only work on writable charsets. Also, return a bool indicating whether
the merge changed the charset.
Also changes the implementation of FcCharSetMerge and FcCharSetIsSubset.
Previously an index j was added to element score to prefer matches earlier
in the value list to the later ones. This index started from 0, meaning
that the score zero could be generated for the first element. By starting
j from one, scores for when the element exists in both pattern and font
can never be zero. The score zero is reserved for when the element is
NOT available in both font and pattern. We will use this property later.
This shouldn't change matching much. The only difference I can think of
is that if a font family exists both as a bitmap font and a scalable
version, and when requesting it at the size of the bitmap version,
previously the font returned was nondeterministic. Now the scalable
version will always be preferred.
Previously the matcher multiplied comparison results by 100 and added
index value to it. With long lists of families (lots of aliases),
reaching 100 is not that hard. That could result in a non-match early
in the list to be preferred over a match late in the list. Changing
the multiplier from 100 to 1000 should fix that.
To keep things relatively in order, the lang multiplier is changed
from 1000 to 10000.
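A worked example of the overflow, assuming a simplified score of
"comparison result times multiplier plus list index" where a lower score
wins and a string comparison yields 0 for a match and 1 for a non-match.
value_score is a hypothetical name used only for this illustration.

```c
#include <assert.h>

/* Simplified per-value score: cmp is the comparison result (0 = match),
 * j is the position of the value in the pattern's list. */
double value_score(double cmp, int multiplier, int j)
{
    assert(cmp >= 0.0 && multiplier > 0 && j >= 0);
    return cmp * multiplier + j;
}
```

With a multiplier of 100, a non-match at index 0 scores 100 and beats a
match at index 150, which scores 150. With a multiplier of 1000 the
non-match scores 1000 and the late match wins, as intended.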
Previously fc-match "xxx,nazli" matched Nazli, but "xxx, nazli" didn't.
This was because of a bug in FcCompareFamily's short-circuit check
that forgot to ignore spaces.
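A minimal sketch of the idea, not the actual FcCompareFamily code: both the
cheap first-character pre-check and the full comparison have to skip spaces
in the same way, otherwise " nazli" is rejected before the real comparison
runs. All names below are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Skip spaces, then return the next character lowercased and advance
 * past it; returns 0 at the end of the string. */
static int next_significant(const char **s)
{
    while (**s == ' ')
        (*s)++;
    if (**s == '\0')
        return 0;
    return tolower((unsigned char) *(*s)++);
}

/* Cheap pre-check: compare only the first significant characters. */
int family_first_char_equal(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return next_significant(&a) == next_significant(&b);
}

/* Full comparison, ignoring spaces and ASCII case. */
int family_equal(const char *a, const char *b)
{
    int ca, cb;

    assert(a != NULL && b != NULL);
    do {
        ca = next_significant(&a);
        cb = next_significant(&b);
    } while (ca != 0 && ca == cb);
    return ca == cb;
}
```

A pre-check that compared a[0] with b[0] directly would see 'n' versus ' '
and bail out, which matches the symptom described above.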
I can't understand why the special case is needed. Indeed, removing it
does not make any difference in the "fc-match --verbose" output, and
that's the only time fc-match uses FcPatternPrint.
Two changes:
- after mkdir(), we immediately chmod(), such that we are not affected
by stupid umasks.
- if a directory we want to use is not writable but exists, we try a
chmod on it. This is to recover from stupid umasks having affected
us with older versions.
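The two changes can be sketched with POSIX calls as follows. This is an
illustration under assumptions, not the patched function: make_cache_dir is
a made-up name and the error handling is reduced to a 0 / -1 result.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create dir with the given mode, or repair an existing one.
 * Returns 0 if the directory exists and is writable afterwards,
 * -1 otherwise. */
int make_cache_dir(const char *dir, mode_t mode)
{
    assert(dir != NULL);
    if (mkdir(dir, mode) == 0)
        return chmod(dir, mode);    /* undo whatever the umask stripped */
    if (errno != EEXIST)
        return -1;                  /* real failure, e.g. missing parent */
    if (access(dir, W_OK) == 0)
        return 0;                   /* exists and is already usable */
    /* Exists but not writable: try to recover with chmod. */
    if (chmod(dir, mode) != 0)
        return -1;
    return access(dir, W_OK) == 0 ? 0 : -1;
}
```

Calling chmod() right after mkdir() matters because mkdir() applies the
process umask to the requested mode, while chmod() does not.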
The current behaviour of FcSortWalk() is to create a new FcCharSet on
each iteration that is the union of the previous iteration with the next
FcCharSet in the font set. This causes the existing FcCharSet to be
reproduced in its entirety and then allocates fresh leaves for the new
FcCharSet. In essence the number of allocations is quadratic wrt the
number of fonts required.
By introducing a new method for merging a new FcCharSet with an existing
one we can change the behaviour to be effectively linear with the number
of fonts - allocating no more leaves than necessary to cover all the
fonts in the set.
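The difference can be shown on a toy model. The sketch below is not the
FcCharSet implementation: it assumes a tiny fixed universe of 16 pages with
one 256-bit leaf per page, and all names are hypothetical. The point is that
an in-place merge allocates a leaf only for pages the destination did not
have yet, instead of copying every existing leaf into a fresh union.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGES 16    /* toy universe: 16 pages of 256 code points */
#define WORDS 8     /* 8 x 32 bits = 256 bits per leaf */

typedef struct { uint32_t map[WORDS]; } Leaf;
typedef struct { Leaf *leaf[PAGES]; } ToyCharSet;

/* Set one code point; returns 1 on success, 0 if out of range or OOM. */
int toy_charset_add(ToyCharSet *cs, unsigned ucs)
{
    unsigned p = ucs >> 8;

    assert(cs != NULL);
    if (p >= PAGES)
        return 0;
    if (!cs->leaf[p] && !(cs->leaf[p] = calloc(1, sizeof(Leaf))))
        return 0;
    cs->leaf[p]->map[(ucs & 0xff) >> 5] |= (uint32_t) 1 << (ucs & 0x1f);
    return 1;
}

/* Merge src into dst in place. A leaf is allocated only for pages that
 * src has and dst lacks. Returns 1 if dst gained any code point, 0 if
 * it did not, -1 on allocation failure. */
int toy_charset_merge(ToyCharSet *dst, const ToyCharSet *src)
{
    int changed = 0;

    assert(dst != NULL && src != NULL);
    for (int p = 0; p < PAGES; p++) {
        if (!src->leaf[p])
            continue;
        if (!dst->leaf[p] && !(dst->leaf[p] = calloc(1, sizeof(Leaf))))
            return -1;
        for (int w = 0; w < WORDS; w++) {
            uint32_t merged = dst->leaf[p]->map[w] | src->leaf[p]->map[w];
            if (merged != dst->leaf[p]->map[w]) {
                dst->leaf[p]->map[w] = merged;
                changed = 1;
            }
        }
    }
    return changed;
}
```

Walking N fonts with this merge touches each destination leaf at most once
per font and never reallocates leaves that already exist.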
For example, profiling 'gedit UTF-8-demo.txt'
Allocator                       nAllocs     nBytes
Before:
FcCharSetFindLeafCreate           62886    2012352
FcCharSetPutLeaf                   9361   11441108
After:
FcCharSetFindLeafCreate            1940      62080
FcCharSetPutLeaf                    281     190336
The savings are even more significant for applications like firefox-3.0b5
which need to switch between large number of fonts.
Before:
FcCharSetFindLeafCreate         4461192  142758144
FcCharSetPutLeaf                1124536  451574172
After:
FcCharSetFindLeafCreate           80359    2571488
FcCharSetPutLeaf                  18940    9720522
Out of interest, the next most frequent allocations are
FcPatternObjectAddWithBinding    526029   10520580
tt_face_load_eblc                 42103    2529892
Note that this also fixes a bug with FcFontList() where previously
it was NOT checking whether the config is up-to-date. May want to
keep the old behavior and document that ScanInterval is essentially
unused internally (FcFontSetList uses it, but we can remove that
too).
A private FcObjectGetSet() is implemented that provides an
FcObjectSet of all registered elements. FcFontSetList() is
then modified to use the object set from FcObjectGetSet() if
provided object-set is NULL.
Alternatively FcObjectGetSet() can be made public. In that
case fc-list can use that as a base if --verbose is included,
and also add any elements provided by the user (though that has
no effect, as all elements from the cache are already registered).
Currently fc-list ignores user-provided elements if --verbose
is specified.
The fact that we now drop final slashes from all filenames without
checking that the file name represents a directory may surprise some,
but it doesn't bother me really.
At OLPC, we came across a bug where the Browse activity (based on xulrunner)
took 100% CPU after an upgrade. It turns out that Mozilla uses
FcConfigUptoDate() to check whether new fonts have been added to the system,
and this function was always returning FcFalse because the mtimes of some
font directories were set in the future. The attached patch makes
FcConfigUptoDate() print a warning and return FcTrue if the mtime of a
directory is in the future.
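The decision can be sketched as a pure function of three timestamps. This is
a simplified model rather than the patched FcConfigUptoDate(): the name
dir_is_up_to_date and its parameters are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Returns 1 if a directory with modification time dir_mtime needs no
 * rescan for a configuration loaded at config_loaded. A timestamp
 * later than "now" is reported and then ignored, so it cannot force a
 * rescan on every call. */
int dir_is_up_to_date(time_t dir_mtime, time_t config_loaded, time_t now)
{
    assert(config_loaded <= now);
    if (dir_mtime > now) {
        fprintf(stderr, "warning: directory mtime is in the future\n");
        return 1;
    }
    return dir_mtime <= config_loaded;
}
```

Without the first branch, a directory stamped in the future compares newer
than any load time, so every check reports stale and the caller rescans in
a loop.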
It seems indices in _FcMatchers array are slightly mixed up, MATCH_DECORATIVE
should be 10, not 11.
And MATCH_RASTERIZER_INDEX should be 13, not 12, right?
Libtool-2.2 introduces new restrictions. It no longer allows LT_*
variables, because it includes the macro:
m4_pattern_forbid([^_?LT_[A-Z_]+$])
Rename the LT_ variables to LIBT_ to work around this restriction.
Building 2.5.91 on Solaris with the native make(1) yields
...
Making all in src
make: Fatal error in reader: Makefile, line 313: Unexpected end of line seen
Current working directory /tmp/fontconfig-2.5.91/src
*** Error code 1
This is due to the following line (src/Makefile.am:143):
CLEANFILES := $(ALIAS_FILES)
Changing that to a standard assignment ("=") fixes the problem.
I believe the ":=" is a typo. ALIAS_FILES is just a statically assigned
variable; it's not like evaluating it more than once would be a problem.
If the /usr/bin/head program is missing or unusable, or if an unusable head
program is listed first in the PATH, fontconfig fails to build.
Using "sed -n 1p" instead of "head -1" would be a suitable workaround.
Since fontconfig didn't have special handling for paths in static Windows
libraries, I've created a patch which should fix this.
Basically it does this:
fccfg.c:
If fontconfig_path was uninitialised it tries to get the directory the exe is
in and uses a fonts/ dir inside that.
fcxml.c:
In case the fonts.conf lists a <dir>CUSTOMFONTDIR</dir>, it searches for a
fonts/ directory where the exe is located.
David Turner has modified FreeType to be able to render sub-pixel decimated
glyphs using different methods of filtering. Fontconfig needs new
configurables to support selecting these new filtering options. A patch
follows that would correspond to one available for Cairo in bug 10301.
Bitmap-only TrueType fonts without a glyf table will not load a glyph when
FT_LOAD_NO_SCALE is set. Work around this by identifying TrueType fonts that
have no glyphs and selecting a single strike to measure the glyph map with.