Privacybadger: Understanding the user data json file

Created on 18 Nov 2020 · 12 Comments · Source: EFForg/privacybadger

Hello!

I am a student working with Privacy Badger's downloadable user data for a project. I need to understand what the JSON file represents and couldn't find any information online. Could someone explain what action_map, settings_map, and snitch_map store?

documentation & specs question

All 12 comments

Hello!

I just updated our design document to explain action_map and snitch_map (under Further Details).

As for settings_map, it contains various Privacy Badger settings, both internal (seenComic stores whether the user interacted with the new user welcome page already) and user-facing (typically available on the options page).
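For a first look at an exported file, a minimal Python sketch along these lines may help. It only assumes that the export is a JSON object whose top-level keys are the three maps named above. The sample data, the inner values, and the function name are made up for illustration; real exports are much larger.

```python
import json

# Hypothetical, heavily trimmed export. Only the three top-level names and
# the seenComic setting come from this thread; the inner values are
# placeholders for illustration.
SAMPLE_EXPORT = """
{
  "action_map": {"tracker.example": {}},
  "settings_map": {"seenComic": true},
  "snitch_map": {"tracker.example": ["site-a.example", "site-b.example"]}
}
"""

def summarize(export_text):
    """Return {top-level key: number of entries} for an exported data file."""
    data = json.loads(export_text)
    return {key: len(value) for key, value in data.items()}

summary = summarize(SAMPLE_EXPORT)
for name, count in summary.items():
    print(f"{name}: {count} entries")
```

To run it against a real export, read the downloaded file into a string and pass that to summarize() instead of SAMPLE_EXPORT.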

Let me know if you have any questions.

Hi! Thank you for the update! I have a quick question: is the info stored in snitch_map based on the user's activity? It seems to store data about websites that I have never visited. For example, the domain value that occurs the most is informationweek.com, a website I haven't visited.

Privacy Badger includes pre-trained tracker data gathered by Badger Sett. If you enable learning to block trackers from your browsing in options, your Privacy Badger will then build on that pre-trained data as you browse the Web.

You can clear the pre-trained data under the Manage Data tab on the options page.

By the way, there are a few problems with using a stock Privacy Badger to answer questions like which websites have the most trackers.

For example, trackers often load other trackers dynamically. When a tracker that brings in other trackers is blocked, Privacy Badger never sees the trackers that would have been loaded otherwise.

Moreover, snitch_map records up to three sites per tracker only.

So if you want to use Privacy Badger to figure out which sites have the most trackers, which trackers appear on the most sites, etc., you should use a modified Privacy Badger: one that doesn't block requests (and possibly doesn't modify them in any way either), and one that doesn't cap snitch_map entries at three sites.
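Once such data has been collected, the two questions above could be answered with a short Python sketch like the following. It assumes snitch_map maps each tracker domain to the list of sites it was seen on. The domains and function names are hypothetical.

```python
from collections import Counter

# Hypothetical data, assuming snitch_map maps a tracker domain to the
# list of sites that tracker was observed on.
snitch_map = {
    "tracker-a.example": ["news.example", "shop.example", "blog.example"],
    "tracker-b.example": ["news.example"],
    "tracker-c.example": ["news.example", "shop.example"],
}

def trackers_per_site(snitch_map):
    """Count how many distinct trackers were recorded on each site."""
    counts = Counter()
    for sites in snitch_map.values():
        counts.update(set(sites))  # set() guards against duplicate sites
    return counts

def sites_per_tracker(snitch_map):
    """Count how many distinct sites each tracker was recorded on."""
    return {tracker: len(set(sites)) for tracker, sites in snitch_map.items()}

site_counts = trackers_per_site(snitch_map)
tracker_counts = sites_per_tracker(snitch_map)
print(site_counts.most_common(2))
print(tracker_counts)
```

With a stock Privacy Badger, sites_per_tracker can never report more than three for any tracker, which is why the cap has to go before these counts mean much.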

That is really helpful, thank you! I have been looking at the code to figure out how to remove the three sites cap on snitch_map entries, but couldn't really find how to do that. Do you have any suggestions?
Also, if blocking is entirely disabled, do you think the dynamically loaded trackers problem would be solved?

I think your best bet is to set TRACKING_THRESHOLD to a high number so that Privacy Badger effectively never decides to block anything. This should take care of both issues: snitch_map entries are capped at TRACKING_THRESHOLD, and Privacy Badger would no longer block requests ("red" slider), modify requests, or deny JS storage access ("yellow" slider).

Don't forget to enable local learning (under General Settings > Advanced) and clear all pre-trained data (under the Manage Data tab).

You may also want to disable sending "Global Privacy Control" and "Do Not Track" signals (under General Settings), as some websites do respect them and serve fewer or no trackers in response.

Gotcha! Thank you so much. From there, the process to install this modified version should just be to create an xpi file using the manifest.json? (I have no experience with extensions haha)
EDIT: Never mind, I found the develop doc! Thanks a lot for your help!

Hey! So it seems like raising the threshold doesn't actually do anything. PB still blocks trackers after 3 spots. I commented out lines 331-335 in heuristicblocking.js and nothing changed. Do you have any more pointers?

Did you update TRACKING_THRESHOLD to let's say 3000000 and then reload the unpacked extension? You have to reload the extension after every code change you make, or your changes do not get applied.

I did. I am using Chrome, and I load the src every time I change something. I can try Firefox to see if that is any different.

The browser shouldn't make a difference. I think there is something off with your workflow. Did you clear the pre-trained data, for example?

If you're still having trouble, post the exact steps you followed and we could go from there. For example:

  1. git clone https://github.com/EFForg/privacybadger.git
  2. Edited constants.js to set TRACKING_THRESHOLD to 300
  3. Loaded the unpacked extension into Chrome
  4. Clicked "Remove all" under Manage Data on the options page
    ...

This is probably the silliest mistake I have made. I had a packed release of PB installed on Chrome, which was overriding the unpacked one I installed. It works as intended now. Thank you, and sorry for the confusion!!
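For anyone hitting the same problem, one rough way to check that the modified build is the one actually learning is to export the user data after some browsing and look for trackers recorded on more than three sites, which is the stock cap mentioned earlier in the thread. The Python sketch below assumes snitch_map maps tracker domains to lists of sites. The sample data and function name are hypothetical.

```python
DEFAULT_CAP = 3  # stock per-tracker site cap mentioned earlier in the thread

def trackers_over_cap(snitch_map, cap=DEFAULT_CAP):
    """Return the trackers recorded on more distinct sites than the cap."""
    return sorted(
        tracker
        for tracker, sites in snitch_map.items()
        if len(set(sites)) > cap
    )

# Hypothetical snitch_map taken from an export after some browsing.
snitch_map = {
    "tracker-a.example": ["s1.example", "s2.example", "s3.example", "s4.example"],
    "tracker-b.example": ["s1.example", "s2.example"],
}

over = trackers_over_cap(snitch_map)
print(over)
```

A non-empty result suggests the raised threshold is in effect. An empty result is not conclusive, since it may only mean that no tracker has been seen on enough sites yet.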
