Csswg-drafts: [css-text-3] Privacy Review - fingerprintability of the dictionaries

Created on 16 Oct 2020  ·  7 Comments  ·  Source: w3c/csswg-drafts

Hey everyone 👋 Together with @dharb we conducted a privacy review of the CSS Text Module Level 3 and presented it at the last PING meeting (minutes).

Two issues that we noted were:

A. The amount of detail left up to the UA can help uniquely identify the browser vendor and possibly even individual browser versions (this was noted in https://github.com/w3c/csswg-drafts/issues/5574). We had a brief discussion about this with the group and concluded that the concern is minor, as at the moment those details are still revealed by the user agent string.

B. A website can detect installed dictionaries by, e.g., testing for language-specific hyphenation. This is much more concerning, assuming that users can have a unique combination or versions of dictionaries installed. That being said, we didn't have enough knowledge about how those dictionaries are installed to fully assess the risk, so we decided to follow up with some questions:

  1. Are browsers shipping with built-in dictionaries or are they using system dictionaries?
  2. Are browsers shipping with all dictionaries or maybe only a dictionary matching the browser language?
  3. Are those dictionaries used for anything else in the browsers today that's already known to make them detectable?

I realize that those questions are asking about individual implementations and not the spec, but we are trying to assess the risk in the wild. Any help answering them would be much appreciated 🙇‍♂️
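
For readers assessing the risk in item B, the detection could be sketched roughly as follows. This is a minimal illustration, not something taken from the review or the spec: it assumes that a long word in a narrow box wraps onto extra lines only when the UA has a hyphenation dictionary for the element's `lang`, and every name in it (`Measure`, `detectHyphenationLanguages`, `measureInBrowser`, the box width) is hypothetical.

```typescript
// Hypothetical sketch: all names below are illustrative assumptions.

/** Lays out `word` in a narrow box and returns the box height in pixels. */
type Measure = (word: string, lang: string, hyphens: "auto" | "manual") => number;

/**
 * Returns the language tags for which `hyphens: auto` made the probe word
 * taller than `hyphens: manual`, i.e. where automatic hyphenation kicked in.
 * `probes` maps a language tag to a long word in that language.
 */
function detectHyphenationLanguages(
  probes: Record<string, string>,
  measure: Measure,
): string[] {
  const detected: string[] = [];
  for (const lang of Object.keys(probes)) {
    const word = probes[lang];
    const baseline = measure(word, lang, "manual"); // no automatic breaks
    const automatic = measure(word, lang, "auto"); // breaks if a dictionary exists
    if (automatic > baseline) {
      detected.push(lang);
    }
  }
  return detected;
}

/** Browser-only measurement; assumes a DOM is available on `globalThis`. */
function measureInBrowser(
  word: string,
  lang: string,
  hyphens: "auto" | "manual",
): number {
  const doc = (globalThis as any).document;
  const box = doc.createElement("div");
  box.lang = lang;
  box.textContent = word;
  // Narrow, off-flow, invisible box so the probe does not disturb the page.
  box.style.cssText =
    `width: 4em; position: absolute; visibility: hidden; ` +
    `-webkit-hyphens: ${hyphens}; hyphens: ${hyphens};`;
  doc.body.appendChild(box);
  const height = box.getBoundingClientRect().height;
  box.remove();
  return height;
}
```

In a browser, calling `detectHyphenationLanguages(probes, measureInBrowser)` with one long probe word per language tag would yield the set of tags that appear to have hyphenation support. Whether that set varies between users is exactly what the questions above try to establish.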

Closed as Question Answered  ·  Commenter Response Pending  ·  Testing Unnecessary  ·  css-text-3

All 7 comments

Maybe @jfkthame / @litherum / @kojiishi can each answer these?

Firefox currently ships a standard collection of dictionaries for all users. I think it's possible in theory for a user (maybe via an add-on) to add others but don't know if anyone is actually doing this. Also not sure if some Linux distros might be customizing what they include?

Two notes:

  1. I believe all the above questions are relevant to hyphenation dictionaries, but also spell checking / grammar checking facilities.
  2. There's a missing question about whether user action can affect the dictionaries (like if they can intentionally teach the system about a new rule)

@litherum Spell checking and grammar checking a) aren't part of css-text-3 and b) don't affect layout so can't be detected in the same way.

For Blink:

  1. Are browsers shipping with built-in dictionaries or are they using system dictionaries?

We are shipping with built-in dictionaries on Windows, Linux, and ChromeOS. We are using system dictionaries on Android and Mac.

  2. Are browsers shipping with all dictionaries or maybe only a dictionary matching the browser language?

All dictionaries.

  3. Are those dictionaries used for anything else in the browsers today that's already known to make them detectable?

They are only for hyphenation.

Regarding hyphenation in the macOS and iOS ports of WebKit:

  1. WebKit is a system framework so the distinction between a browser dictionary and a system dictionary isn’t really relevant. We use CoreFoundation, a shared system framework, to do hyphenation.
  2. I don’t understand the question. Our browser doesn’t ship with dictionaries. The whole system does.
  3. Not that I know of.

@kdzwinel @dharb Is there anything else you want from this thread, or can I close the issue?
