Consider adding The Quantum Alpha ad list. That list is created using an AI web crawler, and it seems more extensive than any ad list I've seen so far.
On the downside, since this list is created by AI, it could contain some false positives.
Hello! Thank you for opening your first issue in this repo. It’s people like you who make these host files better!
I would like to know what 'AI' logic is happening here. It seems to me that this project is another large list-of-lists except without any credit to the original list creators:
https://blocklist-tools.developerdan.com/entries/search?q=block-test.developerdan.com
Without much history yet, it's hard to say for sure which lists are being used. They even have a list that claims to block YouTube ads, which I'm fairly confident isn't possible to do (#1017).

'EnergizedProtection' is a list-of-lists project, as is the OISD list. I believe 'blackbook' is a legitimate source list as far as hosts lists go, but it does source from various services like URLhaus. Since that domain showed up in that list first, and then in the other lists the following day, it makes sense that "The Quantum Alpha" list is including one of those other lists as a source. Looking at the diff from blackbook and then the diff from The Quantum Alpha, you can see that the same domains were added (along with others). The one exception is bohler-edelstahl-at[dot]com, which was already in the list (perhaps directly from the URLhaus: Malicious URL blocklist project).
Again, without more history it's hard to say for sure what lists are being used, but it's pretty clear that this list is using other people's work without giving them credit, and violating their licenses. That isn't really a surprise: no one just creates an 800,000 domain list in a matter of weeks. It appears the 'AI' in this case is just a way to take credit for other people's hard work.
Ticket for tracking: https://gitlab.com/The_Quantum_Alpha/the-quantum-ad-list/-/issues/13
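For anyone who wants to repeat the diff comparison described above, here is a minimal sketch in Python. It assumes you already have two snapshots of each list as sets of domains; the function names `added` and `shared_additions` are made up for illustration and are not part of any tool mentioned in this thread.

```python
def added(old, new):
    """Domains present in the newer snapshot but not in the older one."""
    return set(new) - set(old)


def shared_additions(source_old, source_new, suspect_old, suspect_new):
    """Domains that were newly added to BOTH lists between their snapshots.

    A large overlap hints (but does not prove) that the suspect list
    pulls from the source list or from a common upstream.
    """
    return added(source_old, source_new) & added(suspect_old, suspect_new)
```

An overlap like this is only circumstantial evidence. Two lists can legitimately pick up the same domains from a shared upstream such as URLhaus.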
Oh. This is really sad to hear, considering that this project made it to the top page of HN today.
I would really hate it if this project was using other people's work without crediting them.
Just for fun...
$ ./ghosts -c https://gitlab.com/The_Quantum_Alpha/the-quantum-ad-list/-/raw/master/For%20hosts%20file/The_Quantum_Ad-List.txt
----------------------------------------
Base hosts file summary:
----------------------------------------
Location: https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
Domains: 58,787
Bytes: 1.8 MB
----------------------------------------
----------------------------------------
Compared hosts file summary:
----------------------------------------
Location: https://gitlab.com/The_Quantum_Alpha/the-quantum-ad-list/-/raw/master/For%20hosts%20file/The_Quantum_Ad-List.txt
Domains: 789,804
Bytes: 23 MB
----------------------------------------
Intersection: 58,751 domains
This is just stupid. We are most definitely not going to add this. Simple reason: there's no way anyone can curate 789,804 domains. We do things differently here. We review every diff from all our sources for each release.
There is no way adding 731k domains to our 58k list makes us a better list.
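If you don't have `ghosts` handy, the intersection figure above can be approximated with a few lines of Python. This is a rough sketch and not how `ghosts` itself works: `parse_hosts` and `compare` are hypothetical helpers, and the parsing assumes plain `0.0.0.0 example.com` style lines with `#` comments. It does not filter out entries like `localhost`, so counts may differ slightly from the tool's.

```python
def parse_hosts(text):
    """Return the set of lowercase hostnames found in hosts-format text."""
    domains = set()
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # Expect "<ip> <host> [<host> ...]"; skip anything shorter.
        if len(fields) < 2:
            continue
        for name in fields[1:]:
            domains.add(name.lower())
    return domains


def compare(base_text, other_text):
    """Count the domains in each list and in their intersection."""
    base = parse_hosts(base_text)
    other = parse_hosts(other_text)
    return {
        "base": len(base),
        "other": len(other),
        "intersection": len(base & other),
    }
```

Feed it the contents of the two files (read from disk or downloaded separately) and compare the `intersection` count with the size of the base list.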
Here's more. Comparing TLDs: our list first (truncated at a tally of 50), then the suggested list (truncated at a tally of 500).
$ ./ghosts -tld
----------------------------------------
Base hosts file summary:
----------------------------------------
Location: https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts
Domains: 58,787
Bytes: 1.8 MB
TLD tally:
com: 30,483
net: 7,116
pl: 5,989
eu: 1,204
live: 1,068
info: 843
jp: 834
org: 809
vn: 802
ru: 734
xyz: 513
de: 464
io: 392
uk: 358
nl: 323
cn: 310
at: 301
co: 297
online: 295
fr: 256
site: 256
biz: 229
in: 199
us: 194
me: 172
club: 171
tv: 164
mobi: 124
tk: 124
br: 123
it: 119
name: 114
top: 112
pro: 112
cz: 105
ca: 104
cc: 98
space: 84
pw: 83
be: 77
icu: 73
life: 71
ro: 68
ir: 66
hu: 64
asia: 62
es: 58
su: 56
website: 54
link: 53
fun: 52
$ ./ghosts -tld -m https://gitlab.com/The_Quantum_Alpha/the-quantum-ad-list/-/raw/master/For%20hosts%20file/The_Quantum_Ad-List.txt
----------------------------------------
Base hosts file summary:
----------------------------------------
Location: https://gitlab.com/The_Quantum_Alpha/the-quantum-ad-list/-/raw/master/For%20hosts%20file/The_Quantum_Ad-List.txt
Domains: 789,804
Bytes: 23 MB
TLD tally:
com: 356,741
net: 63,535
org: 37,830
stream: 27,210
ru: 19,258
tk: 14,436
pl: 13,898
icu: 13,503
info: 12,100
top: 11,720
br: 8,549
de: 8,226
xyz: 7,124
cn: 7,119
review: 7,009
uk: 6,795
bid: 6,554
download: 6,144
win: 5,776
cc: 5,017
us: 4,945
in: 4,590
club: 4,537
io: 4,313
pw: 4,226
eu: 3,948
fr: 3,825
jp: 3,703
au: 3,609
nl: 3,598
it: 3,411
co: 3,300
loan: 3,140
site: 2,989
biz: 2,930
vn: 2,661
gdn: 2,639
ca: 2,552
hu: 2,530
online: 2,512
live: 2,502
ml: 2,479
es: 2,048
trade: 2,038
me: 1,990
pro: 1,981
cf: 1,860
ga: 1,848
ua: 1,808
ir: 1,774
website: 1,733
tv: 1,692
ro: 1,689
id: 1,683
date: 1,679
za: 1,528
cz: 1,465
ltd: 1,454
cl: 1,397
ch: 1,383
at: 1,371
se: 1,370
be: 1,338
pt: 1,304
gq: 1,223
space: 1,208
mx: 1,200
tech: 1,087
tr: 1,078
ar: 1,058
dk: 1,042
kr: 1,004
mn: 912
su: 753
science: 750
ws: 740
life: 693
gr: 673
my: 634
fun: 581
xn--p1ai: 565
sk: 541
tw: 540
il: 527
pk: 524
mobi: 504
I've always thought we are under-listing the .cn and .ru TLDs, and that's a clear weakness.
But 27k .stream domains, really? More than .cn and .ru combined? How is this legit?
Not buying it.
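A TLD tally like the ones above is easy to reproduce for any list. Here is a minimal Python sketch, assuming you already have the domains as an iterable of strings. `tld_tally` and its `minimum` parameter are names invented for this example (the cutoff plays the role of the 50 and 500 truncation above), and "TLD" here simply means the last dot-separated label.

```python
from collections import Counter


def tld_tally(domains, minimum=1):
    """Count domains per final label, most common first.

    Only TLDs with at least `minimum` domains are returned.
    Names without a dot are ignored.
    """
    counts = Counter(
        d.rsplit(".", 1)[-1].lower() for d in domains if "." in d
    )
    return [(tld, n) for tld, n in counts.most_common() if n >= minimum]
```

Printing `f"{tld}: {n:,}"` for each pair gives output in the same shape as the tallies above.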
This is just a stupid list. Take Coinbase for example. I'm not a fan of Coinbase, but let's see.
$ cat ~/temp/z.txt | grep coinbase
0.0.0.0 airdrop-coinbase.com
0.0.0.0 api.coinbase.com <---------
0.0.0.0 api.exchange.coinbase.com
0.0.0.0 api.sandbox.coinbase.com
0.0.0.0 assets.coinbase.com
0.0.0.0 beta.coinbase.com
0.0.0.0 bittip.coinbase.com
0.0.0.0 buy.coinbase.com
0.0.0.0 coinbase-ca.com
0.0.0.0 coinbase-promo.info
0.0.0.0 coinbase-us1.info
0.0.0.0 coinbase.aa-gg.com
0.0.0.0 coinbase.com <-------
0.0.0.0 coinbase.com.eslogin.co
0.0.0.0 coinbase.gift
0.0.0.0 coinbaseboggether.tumblr.com
0.0.0.0 coinbasenews.co.uk
0.0.0.0 coinbasepro-giveaway.com
0.0.0.0 coinbasepromo.com
0.0.0.0 coinbasespromo.tumblr.com
0.0.0.0 coinbasewin.com
0.0.0.0 community.coinbase.com
0.0.0.0 custody.coinbase.com
0.0.0.0 developers.coinbase.com
0.0.0.0 docs.exchange.coinbase.com
0.0.0.0 eio-feed.exchange.coinbase.com
0.0.0.0 engineering.coinbase.com
0.0.0.0 ent-api.sandbox.coinbase.com
0.0.0.0 ex-notify.coinbase.com
0.0.0.0 exceptions.coinbase.com
0.0.0.0 exchange.coinbase.com
0.0.0.0 feed.exchange.coinbase.com
0.0.0.0 filetransfer.coinbase.com
0.0.0.0 fix.exchange.coinbase.com
0.0.0.0 icoinbase.com
0.0.0.0 images.coinbase.com
0.0.0.0 login.coinbase.com
0.0.0.0 promo-coinbase.com
0.0.0.0 public.sandbox.exchange.coinbase.com
0.0.0.0 sandbox.coinbase.com
0.0.0.0 sandbox.exchange.coinbase.com
0.0.0.0 staging.community.coinbase.com
0.0.0.0 status.coinbase.com
0.0.0.0 store.coinbase.com
0.0.0.0 support.coinbase.com
0.0.0.0 ws-feed.exchange.coinbase.com
0.0.0.0 ws.coinbase.com
0.0.0.0 ws.sandbox.coinbase.com
0.0.0.0 www.api.sandbox.coinbase.com
0.0.0.0 www.beta.coinbase.com
0.0.0.0 www.bittip.coinbase.com
0.0.0.0 www.blog.coinbase.com
0.0.0.0 www.coinbase-drop.com
0.0.0.0 www.coinbase-gift.com
0.0.0.0 www.coinbase.com <---------
0.0.0.0 www.community.coinbase.com
0.0.0.0 www.custody.coinbase.com
0.0.0.0 www.engineering.coinbase.com
0.0.0.0 www.ent-api.sandbox.coinbase.com
0.0.0.0 www.ex-notify.coinbase.com
0.0.0.0 www.filetransfer.coinbase.com
0.0.0.0 www.login.coinbase.com
0.0.0.0 www.public.sandbox.exchange.coinbase.com
0.0.0.0 www.sandbox.coinbase.com
0.0.0.0 www.sandbox.exchange.coinbase.com
0.0.0.0 www.store.coinbase.com
0.0.0.0 www.ws.sandbox.coinbase.com
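A plain `grep coinbase` mixes the real Coinbase hosts with phishing look-alikes such as `coinbase.com.eslogin.co`. To pull out only the entries that actually sit under a legitimate domain, something like the Python sketch below would do. `under_domain` is a hypothetical helper, and it assumes the blocklist has already been reduced to bare hostnames.

```python
def under_domain(entries, legit):
    """Return entries that are `legit` itself or a subdomain of it.

    Matching is on whole labels, so "icoinbase.com" and
    "coinbase.com.eslogin.co" do NOT match "coinbase.com".
    """
    legit = legit.lower().rstrip(".")
    suffix = "." + legit
    return [
        e for e in entries
        if e.lower() == legit or e.lower().endswith(suffix)
    ]
```

Run against this list with `legit="coinbase.com"`, it would keep hosts like `api.coinbase.com` and `www.coinbase.com` while dropping the look-alikes, which makes the false positives easier to see.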
Fantastic point of reference. Those domains were in a single version of the 'CoinBlockerLists' and a single version of your unified list. Somehow they got picked up by the poor quality 'Block List Project: Crypto' list. And now they are also showing up in this Quantum list. I think that makes the 'Block List Project' a highly likely source for this Quantum list, but who knows really.
Source: https://blocklist-tools.developerdan.com/entries/search?q=www.login.coinbase.com

Thanks Dan @lightswitch05, great sleuthing there.
Closing. Thanks for the suggestion Max @MAX10541 but I think I'll pass.