Background
Apple has restricted access to non-system local fonts in WebKit (this includes both font names in `font-family` as well as the `local()` function in `@font-face`) and does not wish to go back on this. While this is effective for curbing font-based fingerprinting (see #4055), it has raised several concerns:
- i18n: Users of certain languages are dependent on local fonts to access website content in their language. These fonts are too large to be included as web fonts.
- This unfairly privileges certain scripts with few characters and effectively restricts web design in certain locales to system fonts, as web fonts are not a tractable solution for many languages, including several East Asian languages. Double-keyed caching has made that even less of a viable option.
- In many cases, web fonts are not a valid solution due to font licensing (if I have a legal copy of Adobe Caslon Pro, I should be able to view content using it without the website author or me needing a license that permits web font use).
- Even for cases where web fonts are a viable solution, the carbon footprint of all these pointless repeated downloads is not negligible. In many developing countries bandwidth is also very expensive, so this further privileges Western locales.
To sum up, the current solution goes against the following Ethical Web Principles:
I’m not suggesting we should just accept font-based fingerprinting, but I'm hoping we can find a better balance of tradeoffs than throwing out the baby with the bathwater.
Proposal: Limiting local font access per origin
What if, instead of entirely cutting off access to local fonts, we only allowed N fonts per origin (with a reasonably small N, e.g. 8)? It seems that this would be sufficient for most use cases, while still minimizing or even eliminating font-based fingerprinting.
It is often argued that for certain fonts (the ones necessary for the i18n cases) simply the presence of a single font can narrow down the user quite a lot. But it seems to me that in that case, the user's locale and HTTP headers would narrow things down just as much.
Should misses count towards the limit?
Font access would only "register" when the font is actually used, e.g. if I specify a font stack like `Adobe Caslon Pro, Adobe Garamond, Hoefler Text, serif;` this doesn't use up 3 fonts from that origin, but only one (the one actually applied).
It could be argued that then we have more bits of entropy, because if we can detect that it's Hoefler Text that is being applied we also know that Adobe Caslon Pro and Adobe Garamond are not installed. But given that the set of fonts that exist is huge, and most fingerprinting depends on which fonts you do have, it may be an acceptable tradeoff. If misses also count towards the limit, we'd need a larger limit (around 3x larger I'd suggest).
System fonts would not count towards the limit since they do not add bits of entropy anyway. Also, this would count families, not faces. I.e. using ten weights from a font does not use up 10 fonts from the limit. This is important for websites to display properly, and only uses up a small bit of additional entropy.
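To make the accounting rules above concrete, here is a minimal sketch of how a UA might charge fonts against a per-origin limit. Everything in it is hypothetical and for illustration only: the `FontBudget` class, the `SYSTEM_FONTS` set and the `resolve` method are not part of any browser or spec. It also assumes that once an origin's budget is exhausted, further non-system local fonts are simply treated as not installed, which the proposal does not spell out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

# Hypothetical stand-in for the platform's system font list.
SYSTEM_FONTS = frozenset({"Helvetica", "Times New Roman"})


@dataclass
class FontBudget:
    limit: int = 8  # the "N" from the proposal
    # origin -> non-system local families that origin has actually used
    granted: Dict[str, Set[str]] = field(default_factory=dict)

    def resolve(self, origin: str, stack: List[str], installed: Set[str]) -> Optional[str]:
        """Return the first family in `stack` the origin may use, or None for the generic fallback."""
        used = self.granted.setdefault(origin, set())
        for family in stack:
            if family not in installed:
                continue  # a miss costs nothing in this variant
            if family in SYSTEM_FONTS or family in used:
                return family  # system fonts and already-granted families are free
            if len(used) < self.limit:
                used.add(family)  # charged once per family, however many weights are used
                return family
            # Assumption: budget exhausted, so treat the font as not installed.
        return None


budget = FontBudget()
installed = {"Hoefler Text", "Helvetica"}
stack = ["Adobe Caslon Pro", "Adobe Garamond", "Hoefler Text"]
print(budget.resolve("https://example.com", stack, installed))  # Hoefler Text
print(len(budget.granted["https://example.com"]))  # 1
```

If misses were to count as well, the `continue` branch would also have to record the missing family, which is why that variant would need a larger limit.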
Who manages this data and for how long?
The UA would manage which local fonts have been accessed for each origin, and this would expire after a certain period (a month? a year? more research needed about what's the shortest period that still curbs fingerprinting) so that websites don't have to be locked in to their original font choices until the end of time. It is also cleared when the user clears local data.
What about incognito mode? Same as other private data: the browser can still enforce the limit, and just delete the data after the session.
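The bookkeeping described in this section could look roughly like the sketch below. The `GrantStore` class and its methods are hypothetical names, the one-month default is just one of the periods floated above, and the sketch assumes expiry is measured from the first time a font is granted rather than from its last use, which the proposal leaves open.

```python
import time
from typing import Dict, Optional, Set

GRANT_TTL_SECONDS = 30 * 24 * 60 * 60  # assumption: one month


class GrantStore:
    """Per-origin record of accessed local font families, with expiry."""

    def __init__(self, ttl: float = GRANT_TTL_SECONDS) -> None:
        self.ttl = ttl
        self._grants: Dict[str, Dict[str, float]] = {}  # origin -> family -> grant time

    def record(self, origin: str, family: str, now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        # setdefault keeps the original grant time if the family is already recorded
        self._grants.setdefault(origin, {}).setdefault(family, now)

    def active(self, origin: str, now: Optional[float] = None) -> Set[str]:
        """Return the families still counted against `origin`, dropping expired ones."""
        now = time.time() if now is None else now
        families = self._grants.get(origin, {})
        for family in [f for f, t in families.items() if now - t >= self.ttl]:
            del families[family]  # expired grants free up a slot
        return set(families)

    def clear(self) -> None:
        """Called when the user clears local data."""
        self._grants.clear()


persistent = GrantStore()
private_session = GrantStore()  # enforced the same way, discarded with the session
```

Under this model a private browsing session simply gets its own store, which is thrown away when the session ends.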
Contexts allowed to obtain local font access
What about iframes? With a naive implementation of this proposal, fingerprinting could just be split across multiple collaborating origins. There could even be a fingerprinting-as-a-service SaaS: a service that sets up hundreds of origins, each detecting 8 additional fonts, and combines the results to identify users. Websites would then just include a page from this company in an iframe, and that would in turn iframe all other origins.
To prevent this scenario, I propose differentiating between when an origin can obtain access to a local font and when it can only use local fonts it has already accessed, without accessing new ones. A website would need to be accessed at the top level to obtain access to a local font. Iframes can only use local fonts the origin has already obtained access to. We may want to restrict this further, making additional contexts use-only (canvas? dynamically inserted CSS?).
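The obtain versus use-only distinction can be sketched as a single check. The function name `may_use_local_font` and its parameters are hypothetical, and `origin` here means the origin of the document requesting the font (the iframe's own origin when embedded), which is my reading of the proposal.

```python
from typing import Dict, Set


def may_use_local_font(
    granted: Dict[str, Set[str]],
    origin: str,
    family: str,
    is_top_level: bool,
    limit: int = 8,
) -> bool:
    """Decide whether `origin` may render with the non-system local font `family`."""
    used = granted.setdefault(origin, set())
    if family in used:
        return True  # already obtained: usable in any context, including iframes
    if not is_top_level:
        return False  # embedded contexts are use-only and cannot obtain new fonts
    if len(used) >= limit:
        return False  # per-origin budget exhausted
    used.add(family)  # obtained while the origin is the top-level document
    return True


granted: Dict[str, Set[str]] = {}
# An origin that is only ever embedded never obtains anything:
print(may_use_local_font(granted, "https://tracker.example", "Futura", is_top_level=False))  # False
```

With this rule, the fingerprinting-as-a-service setup only works for origins the user has actually visited at the top level, since origins that are only ever framed never get past the second check.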