Understanding and debugging fontconfig font fallback on macOS

Recently, I used an app on macOS that relies on fontconfig. That alone filled me with an appropriate sense of dread, and sure enough I ran into an issue that forced me to debug its font selection process.

Unfortunately, the app could not display certain symbols I was using (⚭, ⚯) even though my system had fonts providing those glyphs. Using FontBook.app, I found out that the two glyphs are contained in Apple Symbols.

In order to debug the issue, I used fc-match as follows.

$ fc-match -s :charset=26ad # ⚭
LastResort.otf: ".LastResort" "Regular"
$ fc-match -s :charset=26af # ⚯
LastResort.otf: ".LastResort" "Regular"

This result shows that fontconfig doesn't find a font for the requested glyphs and falls back to the LastResort font, which instead provides replacement glyphs with the Unicode character code inside them.

Knowing that a font was actually available for those glyphs, I started debugging the issue:

$ FC_DEBUG=4 fc-match -s :charset=26ad
[…]
FcConfigSubstitute donePattern has 5 elts (size 16)
        family: "Arial"(s) […]
        hintstyle: 1(i)(w)
        charset:
        0026: 00000000 00000000 00000000 00000000 00000000 00002000 00000000 00000000(s)
        lang: "en"(w)
        prgname: "fc-match"(s)

[…]

Setting FC_DEBUG=4 selects the »match/test/edit execution« debug output.

This shows the process by which fontconfig builds up the pattern that it uses to search for a font. The string donePattern marks the final pattern, which is the one actually used to find a font. The output lists the font families that are considered and the remaining criteria, specifically the hintstyle, the required charset and the language.
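To make sense of the charset line: as far as I can tell from the dump, fontconfig splits the code space into pages of 256 code points and prints each page as its page number followed by eight 32-bit words, one bit per code point. The following Python sketch rebuilds such a line under that assumption. The function name charset_line is my own and not part of fontconfig.

```python
def charset_line(code_point: int) -> str:
    """Format a dump line for a charset that contains only code_point.

    Assumed layout (inferred from the FC_DEBUG output, not from an API):
    256 code points per page, eight 32-bit words per page, lowest bit first.
    """
    page = code_point >> 8        # upper bits select the page, e.g. 0x26
    offset = code_point & 0xFF    # position of the code point inside the page
    words = [0] * 8
    words[offset // 32] |= 1 << (offset % 32)
    return f"{page:04x}: " + " ".join(f"{word:08x}" for word in words)


print(charset_line(0x26AD))
# 0026: 00000000 00000000 00000000 00000000 00000000 00002000 00000000 00000000
```

Apart from the trailing (s), this is exactly the charset line in the pattern above, so the request really is "any font that has U+26AD".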

Digging deeper, I wanted to find out what information fontconfig has about the Apple Symbols font.

$ FC_DEBUG=2 fc-match -s :charset=26ad
[…]
Font 7 Pattern has 22 elts (size 22)
        family: "Apple Symbols"(w)
        familylang: "en"(w)
        […]
        charset:
        […]
        0026: ffffffff ffffffff ffffffff ffffffff 1fffffff 0007ffff 00000000 00000000
        […]
        lang: el|fj|ho|ia|ie|io|nr|om|sm|so|ss|st|sw|to|ts|uz|xh|zu|kj|kwm|ms|ng|rn|rw|sn|za(w)
[…]

Here, FC_DEBUG=2 requests »extensive font matching information«.

This snippet shows that the font does indeed contain the requested glyph. However, for some reason fontconfig has not included English (en) in the list of languages with which Apple Symbols can be used. I assume that this list is computed by some algorithm based on the characters contained in the font. According to FontBook.app, the font itself definitely claims to support English, among many other languages:

Asu, Bemba(bem), Bena(bez), Chiga, Cornish(kw), English(en), Greek(el), Gusii(guz), Indonesian(id), Kalenjin(kln), Kinyarwanda(rw), Luo(luo), Luyia(luy), Machame, Makhuwa-Meetto(mgh), Makonde(kde), Malay(ms), Morisyen, North Ndebele(nd), Nyankole(nyn), Oromo(om), Rombo, Rundi(rn), Rwa, Samburu(saq), Sangu, Shambala(ksb), Shona(sn), Soga(xog), Somali(so), Swahili(sw), Taita(dav), Teso(teo), Uzbek(uz), Vunjo(vun), Zulu(zu)
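As an aside, the claim that the font contains the glyph can be checked mechanically against the 0026 line of the Apple Symbols dump. The sketch below assumes the same layout as the dump suggests (one page of 256 code points per line, eight 32-bit words, one bit per code point). The helper name covers is hypothetical and only exists for this illustration.

```python
def covers(dump_line: str, code_point: int) -> bool:
    """Return True if a charset dump line has the bit for code_point set."""
    page_text, words_text = dump_line.split(":", 1)
    if int(page_text, 16) != code_point >> 8:
        return False              # the line describes a different page
    words = [int(word, 16) for word in words_text.split()]
    offset = code_point & 0xFF    # position inside the 256-code-point page
    return bool((words[offset // 32] >> (offset % 32)) & 1)


# Page 0x26 of Apple Symbols, copied from the FC_DEBUG=2 output.
APPLE_SYMBOLS_PAGE_26 = (
    "0026: ffffffff ffffffff ffffffff ffffffff "
    "1fffffff 0007ffff 00000000 00000000"
)

for cp in (0x26AD, 0x26AF):
    print(f"U+{cp:04X}", covers(APPLE_SYMBOLS_PAGE_26, cp))
# U+26AD True
# U+26AF True
```

Both symbols are covered, so the charset is not what keeps Apple Symbols out of the result.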

In order to fix this problem, I added a section to my ~/.config/fontconfig/fonts.conf which assigns the correct language codes to the Apple Symbols font when it is written to the fontconfig cache.

<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
    <!-- Apple Symbols loses some language tags and just ends up with
         el, fj, ho, ia, ie, io, nr, om, sm, so, ss, st, sw, to, ts, uz,
         xh, zu, kj, kwm, ms, ng, rn, rw, sn, za.

         Full list taken from FontBook.app:

         Asu, Bemba(bem), Bena(bez), Chiga, Cornish(kw), English(en),
         Greek(el), Gusii(guz), Indonesian(id), Kalenjin(kln),
         Kinyarwanda(rw), Luo(luo), Luyia(luy), Machame,
         Makhuwa-Meetto(mgh), Makonde(kde), Malay(ms), Morisyen, North
         Ndebele(nd), Nyankole(nyn), Oromo(om), Rombo, Rundi(rn), Rwa,
         Samburu(saq), Sangu, Shambala(ksb), Shona(sn), Soga(xog),
         Somali(so), Swahili(sw), Taita(dav), Teso(teo), Uzbek(uz),
         Vunjo(vun), Zulu(zu)
     -->
    <match target="scan">
        <test compare="eq" name="family">
            <string>Apple Symbols</string>
        </test>
        <edit binding="same" mode="assign_replace" name="lang">
            <langset>
                <string>bem</string>
                <string>bez</string>
                <string>dav</string>
                <string>el</string>
                <string>en</string>
                <string>fj</string>
                <string>guz</string>
                <string>ho</string>
                <string>ia</string>
                <string>id</string>
                <string>ie</string>
                <string>io</string>
                <string>kde</string>
                <string>kj</string>
                <string>kln</string>
                <string>ksb</string>
                <string>kw</string>
                <string>kwm</string>
                <string>luo</string>
                <string>luy</string>
                <string>mgh</string>
                <string>ms</string>
                <string>nd</string>
                <string>ng</string>
                <string>nr</string>
                <string>nyn</string>
                <string>om</string>
                <string>rn</string>
                <string>rw</string>
                <string>saq</string>
                <string>sm</string>
                <string>sn</string>
                <string>so</string>
                <string>ss</string>
                <string>st</string>
                <string>sw</string>
                <string>teo</string>
                <string>to</string>
                <string>ts</string>
                <string>uz</string>
                <string>vun</string>
                <string>xh</string>
                <string>xog</string>
                <string>za</string>
                <string>zu</string>
            </langset>
        </edit>
    </match>
</fontconfig>
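The langset above appears to be the union of the codes fontconfig detected and the codes FontBook.app reports. Rather than merging the two lists by hand, a few lines of Python can produce the string elements. This is only a sketch: the variable names are mine, and FontBook entries that come without a code in parentheses (Asu, Chiga, Machame and so on) are simply left out, as in the configuration above.

```python
# Languages fontconfig detected for Apple Symbols (from the FC_DEBUG=2 dump).
detected = (
    "el|fj|ho|ia|ie|io|nr|om|sm|so|ss|st|sw|to|ts|uz|xh|zu"
    "|kj|kwm|ms|ng|rn|rw|sn|za"
).split("|")

# Codes that FontBook.app shows in parentheses for the same font.
fontbook = """
bem bez kw en el guz id kln rw luo luy mgh kde ms nd nyn om rn
saq ksb sn xog so sw dav teo uz vun zu
""".split()

# The set union removes duplicates; sorting keeps the output stable.
merged = sorted(set(detected) | set(fontbook))

for code in merged:
    print(f"<string>{code}</string>")
```

This prints 45 entries, from bem to zu, in the same order as in the configuration above. Add indentation to taste before pasting them into the langset element.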

Note that properties of type langset do not support appending, so the full resulting set needs to be specified. Also, the replacement needs to take place during »scanning« when the cache is built in order to affect the information stored for the font. As a consequence, the fontconfig cache must be force-rebuilt because fc-cache doesn't consider changes to the configuration when it decides whether to rebuild a cache.

$ fc-cache -f

Now, fc-match should give the following output:

$ fc-match -s :charset=26ad
Apple Symbols.ttf: "Apple Symbols" "Regular"
LastResort.otf: ".LastResort" "Regular"

I also added the following section to my fonts.conf, which adds Apple Symbols as a fallback font for the sans-serif family. This slightly increases the likelihood of it being picked. The fact that it was already picked up before this change shows that when fontconfig is really desperate to find a glyph, it considers all the fonts it knows about and not just the ones with a matching family.

    <!-- Consider glyphs from Apple Symbols, e.g. U+26ad, U+26af. -->
    <alias>
        <family>sans-serif</family>
        <default>
            <family>Apple Symbols</family>
        </default>
    </alias>

In the end, my app still did not work: it turned out that it wasn't actually using Pango Cairo's fontconfig backend but the CoreText backend instead. After switching the backend by setting PANGOCAIRO_BACKEND=fontconfig, I was finally able to display the glyphs I needed.