Could anything else train PB to be more aggressive? For example, if Firefox's canvas protection warnings or first-party isolation are triggered, assign a higher weight to blocking that domain. Or the use of certain web standards? Or if third-party scripts and fonts are used?
As is, Privacy Badger still feels like it takes a relaxed approach to blocking. For instance, I turned off all other tracking protection features in Firefox, disabled any ad/analytic blockers and opened my 80 or so bookmarks, then went to reddit and opened another 50 posted links to "prime the pump" of Privacy Badger.
* It would be nice to see total counts of reds/greens/yellows on the Tracking Domains tab.
* It would be nice to see a hit count for each site's entry on the Tracking Domains tab.
If you're going to stop recording non-tracking domains, it's going to list fewer things to block. If disabling the check of a web page against EFF's DNT policy boosts performance, what does that check actually do? Does it make some sort of network request to the EFF to check something? Can that be rolled into a local detection instead?
There are a bunch of things going on in this issue, which is fine, but I suggest filing targeted follow-up issues, a separate issue for each specific suggestion, after our conversation here.
Tweaking the way our heuristic works to detect and prevent tracking more quickly: Good idea, and something we should work on once we get existing heuristics to a more stable place. For example, we seem to have trouble learning to block Google Analytics (#367), the most common third-party tracker. I would say tweaks and improvements will have to come after serious bug fixes.
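To make the weighting idea concrete, here is a minimal sketch of what a weighted heuristic could look like. It is not how Privacy Badger is implemented: the signal names, the weights, and the `WeightedHeuristic` class are all hypothetical, and the threshold of 3 only mirrors the documented "seen tracking on three sites" rule of thumb.

```python
from collections import defaultdict

# Assumption: mirrors the "tracking seen on three different sites" rule of thumb.
BLOCK_THRESHOLD = 3.0

# Hypothetical weights for the kinds of signals suggested in this issue.
SIGNAL_WEIGHTS = {
    "cookie": 1.0,
    "canvas_fingerprinting": 2.0,
}


class WeightedHeuristic:
    """Scores third-party domains by the strongest signal seen on each first-party site."""

    def __init__(self):
        # third-party domain -> {first-party site -> strongest weight seen there}
        self._seen = defaultdict(dict)

    def record(self, third_party, first_party, signal):
        # Unknown signals fall back to a weight of 1.0.
        weight = SIGNAL_WEIGHTS.get(signal, 1.0)
        sites = self._seen[third_party]
        sites[first_party] = max(sites.get(first_party, 0.0), weight)

    def score(self, third_party):
        # Each first-party site contributes only once, with its strongest signal.
        return sum(self._seen.get(third_party, {}).values())

    def should_block(self, third_party):
        return self.score(third_party) >= BLOCK_THRESHOLD
```

With these made-up weights, a domain seen setting cookies on two sites is not blocked yet, but one canvas fingerprinting observation on either site pushes it over the threshold.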
Adding interesting statistics to the options page. Yes! Excellent idea.
Regarding #1795, no longer recording non-tracking domains will not change what gets shown in the popup nor the options page. Tracking Domains on the options page already doesn't list non-tracking domains. The popup will continue displaying what it displays now the way it displays it now.
Checking if domains comply with EFF's Do Not Track policy makes requests to check for presence of /.well-known/dnt-policy.txt. For more information, see EFF's Do Not Track (DNT) Policy guide.
Making these requests comes with overhead. We worked and will continue working on reducing this overhead. For example, #1795 will help by no longer issuing these requests to non-tracking domains.
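As a rough illustration of where the overhead comes from and how it can be trimmed, here is a sketch of the decision logic around such a check. The helper names, the recheck interval, and the idea of comparing a hash of the fetched file against a set of known policy hashes are assumptions made for this example, not a description of the extension's actual code. No network request is made here; only the URL and the bookkeeping are shown.

```python
import hashlib
import time

# Hypothetical recheck interval: one week.
RECHECK_SECONDS = 7 * 24 * 3600


def dnt_policy_url(domain):
    """Location checked for the policy file, per the path mentioned above."""
    return f"https://{domain}/.well-known/dnt-policy.txt"


def policy_is_known(body, known_hashes):
    """Assumption: a fetched policy is accepted if its hash is in a known set."""
    return hashlib.sha256(body).hexdigest() in known_hashes


def should_check(domain, tracking_domains, last_checked, now=None):
    """Skip non-tracking domains and domains that were checked recently."""
    now = time.time() if now is None else now
    if domain not in tracking_domains:
        return False
    return now - last_checked.get(domain, 0.0) >= RECHECK_SECONDS
```

The `should_check` gate is the part that corresponds to the #1795 change: domains that were never seen tracking never trigger a request at all.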
For what it's worth I just did a few tests comparing Firefox with different settings & extensions.
I loaded 25 sites at least 5 times each and cleared the cache between tests.
Privacy Badger with a trained profile:

Tracking Protection (basic list) built-in Firefox feature

Tracking Protection (strict list) built-in Firefox feature + disable dns prefetching=true, network.predictor.enabled=false

uBlock Origin with a few changes of which lists to use and cosmetic filtering disabled

uBlock setup same as above but in "medium mode", and I unbroke a few sites as I went. Medium mode disables 3rd-party scripts and frames, so I had to go back and allow a few common things to get media to show up.

Tested with 25 sites I knew or guessed would be turds. Notice the disconnects in the last shot. The only things in Firefox Lightbeam that linked a couple of sites were ads.twitter.com and trbas.com (LA Times and Chicago Tribune won't display images w/o trbas.com).
http://www.androidpolice.com/
https://www.aol.com/
http://www.avsforum.com/
http://www.chicagotribune.com/
http://www.cnn.com/
https://www.merriam-webster.com/
http://www.foxnews.com/
https://www.huffingtonpost.com/
http://www.imdb.com/
https://lifehacker.com/
http://www.latimes.com/
https://www.msn.com/
https://www.theguardian.com/us
https://www.pcworld.com/
https://www.cnet.com/
https://www.bible.com/
https://www.snopes.com/
https://sourceforge.net/
https://www.nytimes.com/
http://time.com/
http://www.tmz.com/
http://www.tomshardware.com/
https://www.usatoday.com/
https://www.vice.com/en_us
https://www.washingtonpost.com/
@ghostwords @bcyphers
I don't want to muddy up your https://github.com/EFForg/privacybadger/issues/2114 but I wanted to run a couple of utilities by you and ask a question.
Question first: would it be possible or meaningful to factor in SSL certificate information? I've been curious whether some CAs are more malware-friendly than others, or whether, say, once one domain gets blocked, all future domains belonging to the same organization could be assigned a higher weight in the heuristic.
If anything, it's just my general curiosity to see if SSL certs reveal anything about the sorts of tracking companies. Since most SSL certs come with a cost, I would think they don't invest much money in separate certs for each of their domains/subdomains, so it might be a way to establish equivalency amongst domains.
Oh, and the utilities. Have you seen PyFunceble and OpenWPM?
https://funilrys.github.io/PyFunceble/
https://github.com/funilrys/PyFunceble
https://webtap.princeton.edu
https://github.com/citp/OpenWPM
Thanks for the pointers as always! I opened https://github.com/EFForg/badger-sett/issues/21 to investigate using PyFunceble as an easy way to speed up our crawler. We are fans of OpenWPM and the research papers it helps produce.
SSL certs: Looks like there is a tlsInfo webRequest-extending API coming up in Firefox 62 (and Chrome?) that lets WebExtensions inspect certificate details. So we could see what useful/interesting information we could get from certificates at some point. My feeling is there is plenty of lower-hanging fruit elsewhere in terms of improving our detection techniques, but I dunno, it's worth checking out. Feel free to open a new issue!
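For anyone who wants to explore the certificate idea, one simple starting point would be grouping domains that were observed serving the same certificate. The sketch below assumes the (domain, certificate fingerprint) pairs have already been collected somehow; the function name and the sample values are hypothetical, and it says nothing about what the browser API actually returns.

```python
from collections import defaultdict


def group_by_certificate(observations):
    """Group domains by certificate fingerprint.

    `observations` is an iterable of (domain, fingerprint) pairs.
    Only fingerprints shared by more than one domain are returned.
    """
    groups = defaultdict(set)
    for domain, fingerprint in observations:
        groups[fingerprint].add(domain)
    return {fp: domains for fp, domains in groups.items() if len(domains) > 1}
```

Domains that end up in the same group could then be treated as candidates for belonging to the same organization, though a shared certificate can also just mean a shared CDN or hosting provider.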
lol I just like bending your ear when I have these brain farts. If there's a mailing list I'd be glad to throw my random thoughts in there :)