User.js: Not An Issue; Spoof HTTP headers without getting more unique

Created on 9 Aug 2019 · 9 comments · Source: arkenfox/user.js

As we know, in attempting to protect yourself from web tracking, it's very easy to go too far and get the opposite result, where your browser's fingerprint becomes even more unique. In my daily routine, I use a modified Tor Browser, doing my best to keep its fingerprint unchanged so as not to stand out from the crowd. Unfortunately, my knowledge is not always enough.

Since one of the tracking methods used by web companies is HTTP headers, there are a lot of resistance tools working on this principle, like the famous Privacy Possum or the multipurpose Chameleon, for example. But as a person far from web development, I can't stop wondering: can websites detect these tools by recognizing the HTTP header manipulation on my side? And if that turns out to be true, are there any exceptions, and is it possible to maintain a balance by spoofing only certain headers without the risk of being discovered?

All 9 comments

I assume you mean HTTP headers. Headers are a subset of fingerprinting vectors, and most of them don't tell much on their own.

The header most commonly spoofed is UA (user-agent), since it's a relatively high-entropy vector that gives away your browser + its version + your OS, and can even give away your CPU architecture (x86/x64). Spoofing that header to make sites believe you're using another browser and/or OS is only effective as long as sites don't try to figure out your browser and/or OS via one of the many other available fingerprinting vectors. A bunch of those vectors don't even require javascript, and you can't spoof most of them, so spoofing UA to defend against fingerprinting can be counter-productive.
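To make the point concrete, here is a minimal sketch of the kind of information a single User-Agent string gives away. The regexes and the example string are purely illustrative (not taken from any real fingerprinting library), but they show how browser, version, OS and CPU architecture all fall out of one header:

```python
import re

def parse_ua(ua: str) -> dict:
    """Extract browser, version, OS and CPU hints from a User-Agent header."""
    info = {}
    m = re.search(r"Firefox/(\d+)", ua)
    if m:
        info["browser"] = "Firefox"
        info["version"] = int(m.group(1))
    if "Windows NT 10.0" in ua:
        info["os"] = "Windows 10/11"
    elif "X11; Linux" in ua:
        info["os"] = "Linux"
    # Win64 / x86_64 tokens leak the CPU architecture as well
    info["arch"] = "x64" if ("Win64" in ua or "x86_64" in ua) else "unknown"
    return info

ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Firefox/68.0"
print(parse_ua(ua))
# {'browser': 'Firefox', 'version': 68, 'os': 'Windows 10/11', 'arch': 'x64'}
```

A server only needs a handful of substring checks like these; nothing here requires javascript, which is why the header leaks even with scripts disabled.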

To elaborate, spoofing UA can be useful when you want some particular site to behave as it would if you were using a different browser, but it's not a good defense against fingerprinting, because your overall fingerprint comprises a lot more than just headers. Even when sites seem to behave as they would if you were using a different browser, that does not necessarily mean they won't find out what your actual browser/OS is, because fingerprinting is a passive technique, and all the data they gather about you can eventually be analyzed by a human (or a smart AI) that can tell apart the spoofable vectors from the rest and figure out what parts of your fingerprint are total bullshit.

As an anti-fingerprinting measure, spoofing UA is generally fine as long as you don't try to pretend you're using another browser/OS. Spoofing version numbers only (which is what privacy.resistfingerprinting does) is a good measure because telling apart two different Firefox versions is a lot more difficult/ineffective than telling apart two different browsers.

Other headers are low-entropy by themselves, but can make you stand out if you have a weird combination of them (which is more likely to happen if you actively try to spoof headers). Some headers can be used for tracking/de-anonymizing on their own (like cookies, and cache-related headers), but that does not qualify as fingerprinting, because the strategy there is instead to store unique data on your computer that they can later use for those purposes. Fingerprinting is what adversaries resort to when such simpler tracking/de-anonymizing methods are not sufficient for them.
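The "weird combination" effect can be put in numbers. The sketch below uses invented population frequencies (the real ones vary by site audience) to show how a few individually common header values add up to an identifying combination, using the standard self-information formula:

```python
import math

def surprisal_bits(frequency: float) -> float:
    """Self-information of a trait shared by `frequency` of the population."""
    return -math.log2(frequency)

# Each header value alone is common, so it carries few bits...
accept_language = surprisal_bits(0.25)   # shared by 1 in 4 users  -> 2 bits
accept_encoding = surprisal_bits(0.5)    # shared by 1 in 2 users  -> 1 bit
dnt             = surprisal_bits(0.125)  # shared by 1 in 8 users  -> 3 bits

# ...but if the values are roughly independent, the bits add up:
combined = accept_language + accept_encoding + dnt
print(combined)  # 6.0 bits -> as identifying as a 1-in-64 trait
```

Spoofing headers into a combination nobody else has is how you end up paying full entropy for every one of them.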

There are also headers like Referer and Origin that aren't as simple to exploit for de-anonymizing users, but can still leak potentially important information about your browsing habits that can compromise your privacy.

So... my answers to your questions:

Can websites detect these tools by recognizing the HTTP header manipulation on my side?

It can be tricky for them to deduce which particular tools you're using, but it is entirely possible. Just deducing that you're spoofing certain values is not as tricky, though.

are there any exceptions and is it possible to maintain a balance by spoofing only certain headers without risk of being discovered?

There is no perfect security. You can stay away from electronics and hide 50 feet underground, but if someone is determined to spy on you, they will still have chances. The best you can do is to reduce those risks as much as possible, at the cost of your time/convenience.

Ultimately, risks are circumstantial. The problem with fingerprinting is you will never know how much you are actually exposing yourself to it, because it's a passive technique.

I hope that helps.

I assume you mean HTTP headers.

Oh yes, my mistake! Now corrected.

You certainly helped me when you pointed out that UA and Referer belong to the headers group. Previously, I thought those two were separate things, and so my question might have seemed too obvious (of course, spoofing UA or Referer affects the fingerprint). However, I wanted to ask about less trivial types of HTTP headers, like ETag, Origin, Via, Authorization, etc. Do all the header types that can be used for tracking also affect the browser's fingerprint while changed/spoofed on the user's side?

Do all the header types that can be used for tracking also affect the browser's fingerprint while changed/spoofed on the user's side?

TL;DR: Yes, to a greater or lesser extent they all do.

Your browser's fingerprint comprises every single bit of information servers can gather about it, and the maximum complexity of the samples is practically up to the servers to define. If straightforward data that can be gathered in simple one-time visits (like HTTP headers) is not enough for them, they can even start looking for patterns in your browsing habits by linking whatever constants they can find in your requests. That's why I said that if someone is determined to spy on you, they always have chances (if they're resourceful enough). Our best bet in dealing with this nightmarish reality is to make things as impractical for our adversaries as we can, but we make things less practical for ourselves in the process.

Some headers may be worth tampering with regardless, not because of the risk they pose as fingerprinting vectors, but because they can be very easily used for tracking and/or de-anonymizing (as previously stated), which means that raising fingerprinting entropy by removing them can be a reasonable trade-off (depending on your threat model and willingness to sacrifice their benefits). Those are cookies and some cache-related headers, namely:

Request headers for making conditional requests:

  • If-Modified-Since
  • If-None-Match

Corresponding response headers:

  • Last-Modified
  • ETag

Note that:

  • Cache-related headers (those 4 above) and cookies are not much of a tracking vector if you use first-party isolation and/or containers. However, they can be used to de-anonymize you.
  • There are other arguably more convenient ways to deal with the cache. Those headers are added as metadata to cached files, so they don't represent a risk as long as that metadata does not stay in your computer for long. You can use the browser's built-in features for wiping the cache, but if you wanted to get creative you could instead wipe that metadata externally (not saying you should). The same can be said of cookies, in fact there are even more ways to deal with cookies.
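To illustrate why those four headers matter, here is a minimal sketch of ETag-based tracking (a "cookieless cookie") against a toy in-memory server. The class and method names are invented for illustration, not from any real framework:

```python
import secrets

class TrackingServer:
    """Toy server that abuses the ETag cache validator as a user ID."""

    def __init__(self):
        self.seen = {}  # etag -> visit count

    def respond(self, if_none_match=None):
        if if_none_match in self.seen:
            # The browser echoed back the unique ETag we handed out earlier:
            # the "cache validator" just re-identified the user, no cookies needed.
            self.seen[if_none_match] += 1
            return 304, if_none_match
        etag = secrets.token_hex(8)  # unique per first-time visitor
        self.seen[etag] = 1
        return 200, etag

server = TrackingServer()
status, etag = server.respond(None)   # first visit: 200 + fresh unique ETag
status2, _ = server.respond(etag)     # revisit: browser sends If-None-Match
print(status, status2, server.seen[etag])  # 200 304 2
```

Wiping the cache (or the cached metadata) between sessions is exactly what breaks the `if_none_match in self.seen` branch here.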

As for the other headers... well, there is generally no reason to touch response headers (to defend against fingerprinting, that is) because those are only seen by your browser (they don't leak info). Via and X-Forwarded-For can be used to try to fool servers by making it look like you're connecting through a proxy server, but it is not worthwhile IMO because human operators can still figure out that's what's going on, and then your only accomplishment is that you raised your FP entropy for nothing.

Authorization is kind of an outdated standard, but some servers may still rely on it. IIRC, unless you tamper with that, Firefox should be sending that header only when necessary though. I wouldn't touch it.

Referer: refer to section 1600 in this user.js
Origin: you might want to strip those, but be aware that shit will break.

Okay. It seems better to give up Privacy Possum if I want to keep the original Tor Browser fingerprint unchanged. But regarding your comment that stripping Origin could break sites: there is one interesting tool called POOP designed to do this job gently. Do you think using it will make my browser look more unique too?

Yep, stripping origin headers also changes your fingerprint, but not as prominently as spoofing the UA or other stuff, since exploiting that header for fingerprinting is somewhat less straightforward. It's still a trade-off.

Let's put it this way: many changes you make (even by just configuring Firefox with user.js or just via the options) affect your fingerprint in one way or another. Content blockers affect your fingerprint. Blocklists affect your fingerprint. Certain user scripts affect your fingerprint. HTTPS Everywhere? affects your fingerprint. Decentraleyes? affects your fingerprint. [You name it] most likely affects your fingerprint in one way or another. The question is how, and if it's worth it. When a change affects your fingerprint in a not-super-prominent way and in return protects your privacy in a meaningful way, it may be worth it. Or not. It's your call.

My only suggestion is that you should pick a strategy and stick with it. Don't mix and match different measures against fingerprinting and hope for the best - that's not how it works. If you want to follow this repo's strategy, follow this repo's strategy only. If you want to follow Tor Browser's strategy, follow their strategy only. If you want to follow someone else's strategy, follow their strategy only. When it comes to anti-fingerprinting, mixing strategies is probably the worst possible route, because that's the easiest way to end up with a needlessly weird fingerprint.

There are about eight main methods for anti-FPing

  • block it

    • e.g. disable the API or parts of it

  • return a single plausible FP

    • plausible so there is little to no breakage or information paradoxes

    • and use the value of the largest actual set: i.e. most users aren't even lying

  • bucketize: return one of a set of plausible FPs based on some other criteria: one per criterion

    • a good example of a criterion here is your OS: e.g. for audioLatency, font whitelists, etc.

  • randomize

    • this could be per request, per tab, per domain, per session, per origin

  • randomize within buckets (e.g. stay plausible per OS)
  • limitation

    • e.g. some audio properties are not available unless a user gesture has happened

    • e.g. ^^ same with canvas: no prompt for user-gesture-less shit

  • heuristics

    • the possibilities are endless

    • e.g. OpenWPM is working on this: detection of common calls and how they're called, in what contexts

    • e.g. you could limit the number of fonts per document: this is just an example. In reality it probably wouldn't work (I can think of some workarounds) - Tor Browser already tried this in the past and eventually removed it

  • block the scripts: blacklists, heuristics
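The "randomize within buckets" idea above can be sketched in a few lines: draw a random value, but only from the set that is plausible for the user's real OS, so the spoofed fingerprint never contradicts unspoofable signals. The font lists and function names below are invented for illustration:

```python
import random

# Hypothetical per-OS buckets of plausible values (invented for this sketch).
PLAUSIBLE_FONTS = {
    "windows": ["Arial", "Segoe UI", "Calibri", "Tahoma"],
    "linux":   ["DejaVu Sans", "Liberation Sans", "Noto Sans"],
    "macos":   ["Helvetica", "Lucida Grande", "Geneva"],
}

def spoofed_font_list(real_os, k=2, seed=None):
    """Return a random font subset drawn only from the real OS's bucket.

    Seeding per session/origin/tab controls how often the answer changes.
    """
    rng = random.Random(seed)
    bucket = PLAUSIBLE_FONTS[real_os]
    return sorted(rng.sample(bucket, k))

fonts = spoofed_font_list("linux", k=2, seed=42)
print(fonts)  # random, but always Linux-plausible
```

The randomness raises the cost of linking visits, while the bucket keeps the lie consistent with whatever the OS leaks elsewhere.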

I would actually like to see Tor Browser use more randomization: I think the only thing at the moment that uses any randomness is the security measure against timing attacks: time jitter
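For reference, the time-jitter idea works roughly like this: clamp high-resolution timestamps to a coarse grid, then add bounded noise within one grid step, so timing side-channels get fuzzy. This is only a rough sketch of the concept; the constants are illustrative, not Firefox's or Tor Browser's actual values:

```python
import random

RESOLUTION_MS = 100.0  # coarse grid; the real resolution is configurable

def fuzz_timestamp(t_ms, rng=random.random):
    """Round t_ms down to the grid, then add jitter within one grid step."""
    clamped = (t_ms // RESOLUTION_MS) * RESOLUTION_MS
    return clamped + rng() * RESOLUTION_MS

t = fuzz_timestamp(1234.5678)
print(t)  # somewhere in [1200.0, 1300.0)
```

A real implementation also has to keep the fuzzed clock monotonic and deterministic per context, which this toy version ignores.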

My wife asked me why I spoke so softly in the house. I said I was afraid Mark Zuckerberg was listening! She laughed. I laughed. Alexa laughed. Siri laughed.

If we're not done here, feel free to re-open

