AdGuardHome does not apply rules from popular filter lists (EasyList, EasyPrivacy)

Created on 2 Jul 2019 · 10 Comments · Source: AdguardTeam/AdGuardHome


Steps to reproduce

  1. Add: https://easylist.to/easylist/easylist.txt
  2. Add: https://easylist.to/easylist/easyprivacy.txt

Expected behavior

Rules should be processed and domains should be blocked.

Actual behavior

No rules in these files are processed, as most of them carry a $third-party suffix. I'm assuming that AdGuardHome skips over these intentionally, but the user is given no notice of this. In fact, it's quite misleading: the 'rule count' that is displayed is the number of lines rather than the number of rules actually processed, which leads the user to believe that entries have been found and will be applied.

Extra

I'm aware that your 'AdGuard Simplified Domain Names filter' may well include the filters from these files, but is there no way for AdGuardHome to process them individually? I know that we can't determine whether a request is third-party or not, but is it not safe to assume that even rules listed as third-party could be blocked in the same way as ||test.com^?

Your environment

| Description | Value |
| -------------- | ------------ |
| Version of AdGuard Home server:| 0.96-hotfix
| How did you setup DNS configuration:| (Router)
| If it's a router or IoT, please write device model:| Raspberry Pi 3b
| Operating system and version:| Raspbian Buster

Medium feature request

All 10 comments

> but is it not safe to assume that even if they are listed as third-party, they could be blocked in the same way as ||test.com^

Tbh, it's hard to say what exact issues it will cause. It's easier to try and see how it goes.

I'll mark this as a feature request.

If anyone decides to help with this task, the changes need to be done here:
https://github.com/AdguardTeam/urlfilter/blob/master/dns_engine.go#L74

Request needs to be marked as "third-party".

> Tbh, it's hard to say what exact issues it will cause. It's easier to try and see how it goes.
> I'll mark this as a feature request.

Thanks! I appreciate your consideration :-)

I personally use a prepared hosts file, RuAdList + EasyList, from raletag's repo (http://cdn.raletag.gq/rueasyhosts.txt). I have no idea how the original filter list is converted into a hosts file, but it is definitely useful for me and works perfectly. I suppose there must be some script that converts EasyList records into hosts file records, dropping most of the "high-level" logic that DNS does not support.

@alexsannikov the problem with converting filters to the standard hosts format is that you won't block all of the subdomains. E.g. 2o7.net, which is listed in your hosts file, has plenty of blacklisted subdomains in other hosts files, but your hosts file lists only the top-level entry.

I believe, although I could be wrong, that uBlock Origin treats each standard hosts file entry as ||something.com^, which would enable this list to work properly with the browser extension. But obviously things are processed differently with AdGuard.
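
The difference between the two formats can be sketched in a few lines of Python. This is only an illustration of the semantics described above, not AdGuard Home's or uBlock Origin's actual matcher; the function names `hosts_matches` and `adblock_domain_matches` are hypothetical, and `stats.2o7.net` is a made-up subdomain used purely as an example.

```python
def hosts_matches(entry: str, hostname: str) -> bool:
    """Hosts-style entry: blocks only the exact hostname it names."""
    return hostname.lower() == entry.lower()


def adblock_domain_matches(domain: str, hostname: str) -> bool:
    """||domain^ style rule: blocks the domain itself and all of its subdomains."""
    hostname = hostname.lower()
    domain = domain.lower()
    return hostname == domain or hostname.endswith("." + domain)


if __name__ == "__main__":
    # "stats.2o7.net" is an illustrative subdomain, not taken from a real list.
    for host in ("2o7.net", "stats.2o7.net", "not2o7.net"):
        print(host, hosts_matches("2o7.net", host), adblock_domain_matches("2o7.net", host))
```

Under these assumptions, a single ||2o7.net^ rule covers every subdomain, whereas an exact-match hosts entry would need one line per subdomain.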

I've extracted what I believe are the appropriate filters (restrictive and whitelist) from various sources over here: https://github.com/mmotti/adguard-home-filters (filters.txt).

The easiest way to do it is (in Python, at least):

1) Run a regex str replace on the filter strings:

# Remove lines that don't start with @@|| or ||
'^(?!(?:@@)?\|\|).+$': '',
# Remove $third-party suffix
'\$third-party$': '',
# Remove IP addresses
'^\|\|(?:[0-9]{1,3}\.){3}[0-9]{1,3}\^$': '',
# Remove empty lines
'^[\t\s]*(?:\r?\n|\r)+': ''

2) Match remaining entries against:
(I realise there is little validation here of whether the domain itself is actually valid.)

filters = '^\|\|([a-z0-9-_.]+)\^(?:\$(?:third-party|document))?$'
wildcards = '^@@\|\|([a-z0-9-_.]+)\^(?:\||\$third-party)?$'

3) For each filter object: check whether test.com is in the whitelist, or .test.com is in the whitelist.
4) Reverse match against the whitelist for the partial string (.test.com) to identify the full domain that matched --> domain.test.com
5) Collect only the verified whitelist rules (those that conflict with a restrictive filter rule) into an array / set. Otherwise we are importing lots of unnecessary whitelist entries that don't apply to our standard wildcard blocks.
6) Convert filter strings: test.com --> ||test.com^
7) Convert whitelist strings: domain.test.com --> @@||domain.test.com^

Or at least that's how I had to do it in Python.
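
Put together, steps 1-7 might look roughly like the sketch below. It is not the script behind filters.txt: the function names (`parse_rules`, `relevant_exceptions`, `to_adguard_rules`) are hypothetical, the regexes are the ones quoted above with the hyphen moved to the end of the character class, and the "starts with || or @@||" filter is done with `str.startswith` instead of a regex replace.

```python
import re

# Step 1: clean-up patterns, applied line by line.
STRIP_THIRD_PARTY = re.compile(r"\$third-party$")
IP_RULE = re.compile(r"^\|\|(?:[0-9]{1,3}\.){3}[0-9]{1,3}\^$")

# Step 2: patterns for blocking rules and whitelist (exception) rules.
BLOCK_RULE = re.compile(r"^\|\|([a-z0-9_.-]+)\^(?:\$(?:third-party|document))?$")
ALLOW_RULE = re.compile(r"^@@\|\|([a-z0-9_.-]+)\^(?:\||\$third-party)?$")


def parse_rules(text):
    """Return (blocked, allowed) sets of domains found in an adblock-style list."""
    blocked, allowed = set(), set()
    for raw in text.splitlines():
        line = raw.strip()
        # Drop empty lines and anything that does not start with || or @@||.
        if not line.startswith(("||", "@@||")):
            continue
        line = STRIP_THIRD_PARTY.sub("", line)
        if IP_RULE.match(line):
            continue  # skip plain IP address rules
        block = BLOCK_RULE.match(line)
        allow = ALLOW_RULE.match(line)
        if block:
            blocked.add(block.group(1))
        elif allow:
            allowed.add(allow.group(1))
    return blocked, allowed


def relevant_exceptions(blocked, allowed):
    """Steps 3-5: keep only whitelist domains that conflict with a blocking rule."""
    verified = set()
    for domain in blocked:
        suffix = "." + domain
        for candidate in allowed:
            if candidate == domain or candidate.endswith(suffix):
                verified.add(candidate)
    return verified


def to_adguard_rules(text):
    """Steps 6-7: emit ||domain^ rules followed by the needed @@||domain^ rules."""
    blocked, allowed = parse_rules(text)
    exceptions = relevant_exceptions(blocked, allowed)
    rules = ["||%s^" % d for d in sorted(blocked)]
    rules += ["@@||%s^" % d for d in sorted(exceptions)]
    return rules
```

The nested loop in `relevant_exceptions` is quadratic, which is acceptable for a sketch but would probably want a suffix lookup for very large lists. Rules with any other modifier (for example `$third-party,script`) are dropped, because they do not match either pattern.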

I know this is probably blatantly obvious to you guys, but if it saves you any time at all, then it's worth me writing it down.

@mmotti You are right: without additional scripting, using plain hosts files is painful.
Your logic for preparing AdGuard Home filter lists from third-party hosts files is great. I personally use a similar approach: I sort the list and go through it line by line. If a parent domain is listed, I keep only that record and remove all the lines with its subdomains. Finally, I just add || at the beginning, and thus all the subdomains are blocked by default.
This logic makes sense, because if we block the root domain, all of its subdomains should be blocked too. I have never come across a requirement to block a root domain but unblock one of its subdomains.
Anyway, as you already explained, we can 'whitelist' some domains/subdomains with an additional "@@ $important" rule (don't forget about "important"!), which has higher priority than any other filter and allows the required domain.
And yes, ||something.com^ is enough to block the whole domain name.
P.S. I don't use pure hosts files. I collapse them (usually 2-3 times or even more), match the records across all of them to avoid duplicates, and then add the AdGuard Home rule syntax (||, @@, ^, $important etc.).
My post was about how to easily get a list of domains for blocking in a "pure" format. Otherwise, some extra logic must be applied to EasyList or any other filter list in AdBlock format to exclude any unsupported expressions. As I am not a programmer, I chose the simplest way :)
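
For anyone who does want to script the "collapse" step, here is a minimal sketch of the idea, assuming the input is already a plain list of domain names (one per entry, with no hosts-file IP prefix). The function names `collapse_domains` and `to_block_rules` are hypothetical, and instead of a line-by-line pass over a sorted file it processes shorter names first, so that a parent is always seen before its subdomains.

```python
def collapse_domains(domains):
    """Drop every domain that is already covered by a listed parent domain."""
    unique = set()
    for entry in domains:
        name = entry.strip().lower().rstrip(".")
        if name:
            unique.add(name)

    kept = set()
    # Fewest labels first, so parents are processed before their subdomains.
    for domain in sorted(unique, key=lambda d: d.count(".")):
        labels = domain.split(".")
        # Proper parents of a.b.example.com: b.example.com, example.com, com
        parents = (".".join(labels[i:]) for i in range(1, len(labels)))
        if not any(parent in kept for parent in parents):
            kept.add(domain)
    return sorted(kept)


def to_block_rules(domains):
    """Wrap each remaining domain as ||domain^ so its subdomains are covered too."""
    return ["||%s^" % d for d in collapse_domains(domains)]
```

Because duplicates are removed up front, the same function can be run over several merged hosts lists at once.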

Guys, if you're trying to convert adblock rules to the AG format, this might be useful:
https://github.com/AdguardTeam/AdGuardSDNSFilter/blob/master/Filters/parser.py

Basically, here's what you need:

  1. Fork https://github.com/AdguardTeam/AdGuardSDNSFilter/tree/master/Filters
  2. Clean exceptions.txt and exclusions.txt and rules.txt
  3. Edit filter.template and include the filter lists you'd like to have there
  4. Run parser.py

@ameshkov Wow, that's cool! Will look into that. Thanks!

Just to note with this:
I noticed that ||ospserver.net^$third-party from EasyPrivacy breaks Samsung phone updates.

It may be a one-off, but it could also mean that a significant whitelist update is required, or that users who aren't using the specially provided AdGuard filter will need to be more active with whitelisting.
