[Intentionally empty]
Regarding issue https://issues.adblockplus.org/ticket/2278:
@kzar, @ameshkov
Being able to have a token for regex-based filters would definitely help performance. However, trying to programmatically extract a token from a regex-based filter sounds scary to me: there is too much risk of extracting erroneous tokens.
Suggestion: create a new filter option, token=[...], which filter creators can use to assign a predefined token to the filter. The creator of a filter is best placed to figure out whether a token exists and which one will work for storing the filter internally.
For example, this filter in EasyList:
/\.filenuke\.com/.*[a-zA-Z0-9]{4}/$script
Could simply have been written by a filter creator:
/\.filenuke\.com/.*[a-zA-Z0-9]{4}/$script,token=filenuke
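To make the suggestion concrete, here is a minimal sketch of how a blocker might read the proposed token= option. The function name parseRegexFilter and the returned object shape are hypothetical and not taken from any existing blocker; the sketch only handles regex-type filters and assumes the options follow the last "/" of the filter.

```javascript
// Hypothetical helper: split a regex-type filter into pattern and options,
// and pull out the proposed `token=` option when present.
function parseRegexFilter(filterText) {
  // Greedy match: the pattern runs from the first "/" to the last "/",
  // optionally followed by "$" and a comma-separated option list.
  const m = /^(\/.*\/)(?:\$(.*))?$/.exec(filterText);
  if (m === null) {
    return null; // not a regex-type filter
  }
  const options = m[2] ? m[2].split(",") : [];
  const kept = [];
  let token = null;
  for (const opt of options) {
    if (opt.startsWith("token=")) {
      // An empty value is treated the same as a missing token.
      token = opt.slice("token=".length).toLowerCase() || null;
    } else {
      kept.push(opt);
    }
  }
  return { pattern: m[1], options: kept, token: token };
}

const parsed = parseRegexFilter(
  "/\\.filenuke\\.com/.*[a-zA-Z0-9]{4}/$script,token=filenuke"
);
console.log(parsed.token, parsed.options);
```

A filter without the option simply yields token: null, which corresponds to the current untokenized behavior.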
Hey guys! I was thinking about solving this issue a while ago. Even tried to implement a simple token-extracting algorithm. I will post my ideas a bit later though.
Meanwhile, here is a list of known regexp rules:
| Rule | Filter list | List URL |
| --- | --- | --- |
| /^(?![a-z]+\:\/+([^\/\:]+\.(il|com|net)|[\.0-9]+|([^\/\:\.]+\.)*(spot\.im|vine\.co|periscope\.tv|vid\.me|mako\.tools|minidom\.org|jquerymin\.org|logidea\.info|zoomanalytics\.co|firstimpression\.io))\.?([\/\:]|$))^[^\/\:\.]+\:\/+[^\/\:\.]/$third-party,domain=mako.co.il | EasyList Hebrew | https://github.com/AdBlockPlusIsrael/EasyListHebrew |
| /^(?![a-z]+\:\/+([^\/\:\.]+\.)*(google|icdn|auto|sport5|smartair|mysupermarket|blms|linicom)\.co\.il\.?([\/\:]|$))^[a-z]+\:\/+[^\/\:]+\.il\.?([\/\:]|$)/$third-party,domain=mako.co.il | EasyList Hebrew | https://github.com/AdBlockPlusIsrael/EasyListHebrew |
| /^[a-z]+\:\/+[\.0-9]+([\/\:]|$)/$image,media,object,script,stylesheet,subdocument,third-party,domain=mako.co.il | EasyList Hebrew | https://github.com/AdBlockPlusIsrael/EasyListHebrew |
| /^(?![a-z]+\:\/+([^\/\:\.]+\.)*(fbcdn|cloudfront|facebook|akamaihd|ctedgecdn|2mdn|uploaditnow|edgesuite|doubleclick|dmcdn|slideshare|advsnx)\.net\.?([\/\:]|$))^[a-z]+\:\/+[^\/\:]+\.net\.?([\/\:]|$)/$third-party,domain=mako.co.il | EasyList Hebrew | https://github.com/AdBlockPlusIsrael/EasyListHebrew |
| /^(?![a-z]+\:\/+([^\/\:\.]+\.)*(google|facebook|twitter|instagram|youtube|jquery|googleapis|vicomi|twimg|cdninstagram|pinterest|pinimg|giphy|playbuzz|outbrain|ytimg|amazonaws|cloudflare|gstatic|sniperm|dinovich|shortaudition|linkedin|opinionstage|vimeo|vimeocdn|dailymotion|flickr|staticflickr|tumblr|soundcloud|scribd|syteapi|addthis|addthisedge|reddit|disqus|disquscdn|apester|qmerce|taboola|taboolasyndication|google-analytics|googletagservices|googletagmanager|googleadservices|googlesyndication|h-cdn|scorecardresearch|serving-sys|bootstrapcdn|tiviclick|ruchlis|hotjar|flx1|mxpnl|themarker|adnxs|conduit|fourtips|makojs)\.com\.?([\/\:]|$))^[a-z]+\:\/+[^\/\:]+\.com\.?([\/\:]|$)/$third-party,domain=mako.co.il | EasyList Hebrew | https://github.com/AdBlockPlusIsrael/EasyListHebrew |
| /quang%20cao/ | ABPVN List | http://abpvn.com/ |
| /YanAds/ | ABPVN List | http://abpvn.com/ |
| /www/images/ | ABPVN List | http://abpvn.com/ |
| /ads-pic/ | Adblock-Persian list | http://ideone.com/K452p |
| /eshop-eca/ | Adblock-Persian list | http://ideone.com/K452p |
| /eshop98/ | Adblock-Persian list | http://ideone.com/K452p |
| /402x192/ | Adblock-Persian list | http://ideone.com/K452p |
| /^http://m\.autohome\.com\.cn\/[a-z0-9]{32}\//$domain=m.autohome.com.cn | ChinaList+EasyList | http://www.adtchrome.com/extension/adt-chinalist-easylist.html |
| /^http://www\.tt1069\.com\/(?!bbs)/$script,domain=tt1069.com | ChinaList+EasyList | http://www.adtchrome.com/extension/adt-chinalist-easylist.html |
| /^http://www\.iqiyi\.com\/common\/flashplayer\/[0-9]{8}/[0-9a-z]{32}.swf/$domain=iqiyi.com | ChinaList+EasyList | http://www.adtchrome.com/extension/adt-chinalist-easylist.html |
| /^http://www\.dnvod\.eu.*?\/[a-z0-9]{9,}\.swf/$domain=dnvod.eu | ChinaList+EasyList | http://www.adtchrome.com/extension/adt-chinalist-easylist.html |
| /NetInsight/text/$domain=~ads.pandora.tv|~opt.mgoon.com | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /omniture/ | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /NetInsight/html/ | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /cgi-bin/conad.fcgi/ | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /acecounter/$domain=~acecounter.com | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /adNdsoft/ | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /wisenut/ | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /ad-pay/ | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /wp-content/plugins/google-analyticator/ | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /realclick/ | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /max-banner-ads-pro/ | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /RealMedia/ | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /bannerManager/ | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /autoPage/ | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /overture/ | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /wiseAd/euckr/inc/$subdocument | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /NetInsight/js/ | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /scrap_logs/ | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /banner_event/ | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /images/adpresso/ | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /AdBanner/ | Korean Adblock List | https://github.com/gfmaster/adblock-korea-contrib |
| /cdsbData_gal/bannerFile/$image,domain=mybogo.net|zipbogo.net | List-KR | https://list-kr.github.io/ |
| /nad/media/ | List-KR | https://list-kr.github.io/ |
| /ajrotator/ | Filtros Nauscopicos | http://nauscopio.nireblog.com/cat/filtrado |
| /:\/\/(?!biuropodrozy)(?!liveblog)(?!relacje)(?!opinie)(?!zalacznik)(?!magazyn)(?!newsletter)(?!rodzinnawycieczka)(?!doladowania)(?!fantasyliga)(?!funduszeue)(?!imperiumstylu)(?!kodyrabatowe)(?!ogloszenia)(?!orangekinoletnie)(?!rekrutacja)(?!rycerzeiksiezniczki)(?!speedwaymanager)(?!sportowefakty)(?!sportowybar)(?!talesofmagic)(?!ubezpieczenia)(?!warofdragons)(?!wiadomosci)[a-zA-Z0-9]{10,}\.wp.pl\// | Adblock polskie reguły | http://certyficate.it/polski-filtr-adblock/ |
| /:\/\/(?!biuropodrozy)(?!liveblog)(?!relacje)(?!opinie)(?!zalacznik)(?!magazyn)(?!newsletter)(?!facet)(?!wyleczto)(?!kuchnia)(?!film)(?!moto)(?!gwiazdy)(?!teleshow)(?!finanse)(?!kobieta)(?!dom)(?!pogoda)(?!tech)(?!historia)(?!czat)(?!ksiazki)(?!gryonline)(?!hotele)(?!narty)(?!samoloty)(?!wycieczki)(?!hosting)(?!irlandia)(?!multikurs)(?!casino)(?!foto)(?!tech)(?!www)(?!stg)(?!doladowania)(?!fantasyliga)(?!funduszeue)(?!imperiumstylu)(?!kodyrabatowe)(?!alefolwark)(?!angielski)(?!arenamody)(?!beniamin)(?!bon)(?!bsg)(?!casino)(?!diety)(?!dlaprasy)(?!dlugi)(?!doladowania)(?!dom)(?!dysk)(?!ebiznes)(?!ebooki)(?!empire)(?!fantasyliga)(?!film)(?!fundusze)(?!ogloszenia)(?!orangekinoletnie)(?!rekrutacja)(?!rycerzeiksiezniczki)(?!speedwaymanager)(?!sportowefakty)(?!sportowybar)(?!talesofmagic)(?!ubezpieczenia)(?!warofdragons)(?!wiadomosci)(?!gazetki)(?!gry)(?!horoskop)(?!kalendarz)(?!katalog)(?!khanwars)(?!komiks)(?!konflikty)(?!kontakty)(?!korsarze)(?!kultura)(?!mini)(?!mmho)(?!mobilna)(?!morizon)(?!moto)(?!muzyka)(?!narty)(?!naryby)(?!onas)(?!orangekinoletnie)(?!piraci)(?!poczta)(?!pomoc)(?!praca)(?!profil)(?!programtv)(?!pytamy)(?!rekrutacja)(?!rss)(?!rtvagd)(?!rycerzeiksiezniczki)(?!smeet)(?!speedwaymanager)(?!szkola)(?!szukaj)(?!tech)(?!teleshow)(?!triviador)(?!turystyka)(?!twojeip)(?!ulubiency)(?!warodfragons)(?!wycieczki)(?!zdrowie)(?!zoomumba)(?!topnews)(?!erotyka)(?!dzieci)(?!fitness)(?!gielda)(?!finansomat)(?!biznes)(?!sport)[a-zA-Z0-9]{4,9}\.wp.pl\// | Adblock polskie reguły | http://certyficate.it/polski-filtr-adblock/ |
| /commoncfm/images/microsoftxboxone/$domain=buffed.de|gamesaktuell.de|gamezone.de|pcgames.de|videogameszone.de | German filter | http://adguard.com/filters.html#german |
| /[a-z0-9]{32,}/$third-party,domain=picshare.ru | Russian filter | http://adguard.com/filters.html#russian |
| /[a-zA-Z0-9]{35,}/$script,third-party,domain=bigtorrent.org|bigtorrents.ru|cashtube.ru|cmexota.ru|dreamprogs.net|dsvload.net|ecsebo.ru|enotbox.com|faspiic.ru|imagefile.org|imgpay.ru|kordonivkakino.net|mcdownloads.ru|mega-pic.org|odnopolchane.net|payforpic.ru|pic4cash.ru|pic4you.ru|picclick.ru|picforall.ru|pics-money.ru|pirat-pic.ru|planeta51.com|pronpic.org|prons.org|q32.ru|rustorrents.net|santikov.net|sharezones.biz|torrent-pirat.com|unionpeer.org|uraltrack.net|viewy.ru|xhamster-pic.com | Russian filter | http://adguard.com/filters.html#russian |
| /http:\/\/rustorka.com\/[a-z]+\.js/$domain=rustorka.com | Russian filter | http://adguard.com/filters.html#russian |
| /http:\/\/rustorka.com\/[a-z0-9]+\.(jpg|gif)/$image,domain=rustorka.com | Russian filter | http://adguard.com/filters.html#russian |
| /[a-zA-Z0-9]{35,}/$domain=anime-free.net|cyberpirate.me|imgbum.net|online-porno-hd.ru|tecnomectrani.com | Russian filter | http://adguard.com/filters.html#russian |
| /[a-z0-9]{30,}/$script,third-party,domain=free-torrent.org|free-torrents.org | Russian filter | http://adguard.com/filters.html#russian |
| /^http://[a-z0-9_]{15,}\.[a-z0-9-]+\.[a-z]{2,}\/.*[a-zA-Z0-9]{100,}/$object-subrequest,domain=wat.tv | Liste FR | http://adblock-listefr.com/ |
| /^http://[a-z0-9_-]{10,}\.[a-z0-9-]+\.[a-z]{2,}\/.*?\w{30,}/$~xmlhttprequest,domain=gentside.com|maxisciences.com|ohmymag.com | Liste FR | http://adblock-listefr.com/ |
| /content/stargate/$domain=hlamer.ru|kadu.ru|krasview.ru | RU AdList | https://code.google.com/p/ruadlist/ |
| /output/index/$third-party,script | RU AdList | https://code.google.com/p/ruadlist/ |
| /https?://(?!(mc\.yandex\.ru|www\.google-analytics\.com)/)/$third-party,script,subdocument,domain=massivmebel.by | RU AdList | https://code.google.com/p/ruadlist/ |
| /^https?://goodgame\.ru/[a-z0-9]+$/$subdocument,domain=goodgame.ru | RU AdList | https://code.google.com/p/ruadlist/ |
| /wp-content/plugins/popup-maker/$domain=info-life.in.ua|intermarium.com.ua|paragraf.net.ua|unn24.com.ua|varota.com.ua | RU AdList | https://code.google.com/p/ruadlist/ |
| /^https?://(?!static\.)([^.]+\.)+?fastpic\.ru[:/]/$script,domain=fastpic.ru | RU AdList | https://code.google.com/p/ruadlist/ |
| /images/brandings/$image,domain=sc2tv.ru | RU AdList | https://code.google.com/p/ruadlist/ |
| /default/vbanners/$domain=noi.md | RU AdList | https://code.google.com/p/ruadlist/ |
| /branding/$subdocument,domain=fanserials.tv|kino-filmi.net | RU AdList | https://code.google.com/p/ruadlist/ |
| /serial_adv_files/$image,domain=xn--80aacbuczbw9a6a.xn--p1ai|куражбамбей.рф | RU AdList | https://code.google.com/p/ruadlist/ |
| /^https?://(?!www\.)([^.]+\.)+?(kordonivkakino\.net|m(ac-torrent-download\.net|oviki\.ru))[:/]/$script | RU AdList | https://code.google.com/p/ruadlist/ |
| /popupclick/$popup | RU AdList | https://code.google.com/p/ruadlist/ |
| /http://[a-zA-Z0-9]+\.[a-z]+\/.*(?:[!"#$%&'()*+,:;<=>?@/\^_`{|}~-]).*[a-zA-Z0-9]+/$script,third-party,domain=keezmovies.com|redtube.com|tube8.com|tube8.es|tube8.fr|www.pornhub.com|youporn.com | EasyList | https://easylist.github.io/ |
| /\/[0-9].*\-.*\-[a-z0-9]{4}/$script,xmlhttprequest,domain=gaytube.com|keezmovies.com|spankwire.com|tube8.com|tube8.es|tube8.fr | EasyList | https://easylist.github.io/ |
| /\.sharesix\.com/.*[a-zA-Z0-9]{4}/$script | EasyList | https://easylist.github.io/ |
| /\.filenuke\.com/.*[a-zA-Z0-9]{4}/$script | EasyList | https://easylist.github.io/ |
| /^http://m\.autohome\.com\.cn\/[a-z0-9]{32}\//$domain=m.autohome.com.cn | EasyList China | http://abpchina.org/forum/ |
| /^http://www\.iqiyi\.com\/common\/flashplayer\/[0-9]{8}/[0-9a-z]{32}.swf/$domain=iqiyi.com | EasyList China | http://abpchina.org/forum/ |
| /^http://www\.dnvod\.eu.*?\/[a-z0-9]{9,}\.swf/$domain=dnvod.eu | EasyList China | http://abpchina.org/forum/ |
| /^http://www\.tt1069\.com\/(?!bbs)/$script,domain=tt1069.com | EasyList China | http://abpchina.org/forum/ |
| /ulightbox/$domain=hdkinomax.com|tvfru.net | RU AdList: BitBlock | https://code.google.com/p/ruadlist/ |
| /http://cdn[0-9]\.spiegel\.de/images/image-([^-]+)-[^-]+-[^-]+-(?!\1)[^-]+\.jpg/$image,domain=spiegel.de | EasyList Germany | https://easylist.github.io/ |
Please note the number of rules which are mistakenly made regexp-type.
@gorhill I've not been involved in that issue so far, so I've just done a quick bit of reading. I might get some things wrong.
While I agree that grabbing a keyword from the regexp seems scary, I'm not sure how the suggested token option would help. Take your filenuke example, there the automatic keyword would have been "filenuke" anyway.
Now if you think of a more advanced example which matches one of two possible domains, what would you put for the token option? If you chose to use part of one of the domains as a keyword you'd end up not matching the other domain. Instead you'd have to omit the token option, which would end up with the same result as the automatic approach. (Since they mention that those kinds of strings should be ignored.)
(I wonder if we could copy the content blocking approach of compiling all these regular expressions into a finite state machine? That could be a way to make matching regular expression filters faster without worrying about keywords.)
(I wonder if we could copy the content blocking approach of compiling all these regular expressions into a finite state machine? That could be a way to make matching regular expression filters faster without worrying about keywords.)
- This would be an overkill
- In order to do it they have restricted regular expressions support to a very limited subset.
Take your filenuke example
Yes, bad example. Here is another one found in EasyList:
/\/[0-9].*\-.*\-[a-z0-9]{4}/$script,xmlhttprequest,domain=gaytube.com|keezmovies.com|spankwire.com|tube8.com|tube8.es|tube8.fr
Not sure if a token was available for this one -- whoever created the filter knows. Mainly, my point is that a token= option would be an easy, low-tech approach, available immediately (easy implementation), with no need for a regex parser (which would fail anyway with the filter here). If no token is present for an untokenizable filter, then we just end up with the current behavior.
Let's first think about what issue we are trying to solve.
First of all, domain-restricted filters are not a problem as there is no influence on the overall performance.
I suppose that what we really need is to reduce the negative impact of the mistakes made by filter authors, for instance filters like /ajrotator/ and such. There is no problem extracting a token from a rule like this.
Here is just a dirty example of a token extracting function:
var extractToken = function(ruleText) {
    // Get the regexp text: everything between the first "/" and the last "/"
    // of the rule (an optional "$options" part may follow)
    var match = ruleText.match(/\/(.*)\/(\$.*)?/);
    if (!match) {
        // Not a regexp rule
        return null;
    }
    var reText = match[1];
    var specialCharacter = "...";
    if (reText.indexOf('(?') >= 0) {
        // Do not mess with complex expressions which use lookahead
        // ("(?=" and "(?!" both start with "(?")
        return null;
    }
    // (Dirty) prepend specialCharacter for the following replace calls to work properly
    reText = specialCharacter + reText;
    // Strip all types of brackets
    reText = reText.replace(/[^\\]\(.*[^\\]\)/, specialCharacter);
    reText = reText.replace(/[^\\]\[.*[^\\]\]/, specialCharacter);
    reText = reText.replace(/[^\\]\{.*[^\\]\}/, specialCharacter);
    // Strip some special characters
    reText = reText.replace(/[^\\]\\[a-zA-Z]/, specialCharacter);
    // Split by special characters
    var parts = reText.split(/[\\^$*+?.()|[\]{}]/);
    // Keep the longest remaining part as the token
    var token = "";
    var iParts = parts.length;
    while (iParts--) {
        var part = parts[iParts];
        if (part.length > token.length) {
            token = part;
        }
    }
    return token;
};
I've tried this function with the rules above and here is the result:
https://ameshkov.github.io/web/regex-tokens.html?1
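For context on why a token helps at all, here is a minimal sketch of a token-keyed lookup. The class TokenIndex and its methods are hypothetical names used only for illustration, not how any particular blocker stores filters. It assumes the token shows up in the URL as a whole alphanumeric run, which a substring regex such as /ajrotator/ does not strictly guarantee, so a real implementation would have to decide how to deal with that difference.

```javascript
// Hypothetical index: regex filters with a token are only tested against
// URLs that contain that token; the rest are tested against every URL.
class TokenIndex {
  constructor() {
    this.buckets = new Map(); // token -> array of RegExp
    this.untokenized = [];    // regexes with no usable token
  }

  add(regex, token) {
    if (!token) {
      this.untokenized.push(regex);
      return;
    }
    const key = token.toLowerCase();
    if (!this.buckets.has(key)) {
      this.buckets.set(key, []);
    }
    this.buckets.get(key).push(regex);
  }

  // Collect the regexes worth testing for this URL.
  candidates(url) {
    const found = this.untokenized.slice();
    const tokens = url.toLowerCase().match(/[a-z0-9%]+/g) || [];
    for (const t of new Set(tokens)) {
      const bucket = this.buckets.get(t);
      if (bucket) {
        found.push(...bucket);
      }
    }
    return found;
  }

  matches(url) {
    return this.candidates(url).some((re) => re.test(url));
  }
}

const index = new TokenIndex();
index.add(/ajrotator/, "ajrotator");
index.add(/[a-z0-9]{32,}/, null);
console.log(index.candidates("http://example.com/page").length);
```

Every untokenized regex ends up in the list that is run against all requests, which is the cost the discussion is trying to reduce.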
As for the token proposal, here are the downsides I see:
getadblock guys aren't invited to our party
They are using ABP's filtering engine since AdBlock v3.0. See https://github.com/kzar/watchadblock/releases/tag/3.0.
The other points still stand though:)
I wasn't aware of the many erroneous regex filters; it looks like these cases can be easily addressed with trivial code.
Mainly it was just to throw an idea out there, since these untokenizable filters have always bothered me[1], and I knew there was an issue like this opened on ABP issue tracker -- so I just threw the idea out there to have an easy fix, worth only if actually used by filter list maintainers.
Anyway, I will just use this issue here to throw ideas once in a while which I think might be good for all blockers[2], especially when it comes to make the life of filter list maintainers easier.
[1] I was looking to even skip testing for domain hit -- but this is an implementation-dependent detail I suppose
[2] I understand that when a filter syntax is not supported by ABP, EasyList et al. maintainers won't use it.
[2] I understand that when a filter syntax is not supported by ABP, EasyList et al. maintainers won't use it.
By the way, I'd like to raise a question about the non-standard syntax.
You have recently added a couple of pseudo-classes extending element hiding rules syntax. I am talking about :has(), :xpath(), :matches-css [1] and such.
The idea is really great and we will support some of these extended selectors as well (:has() and :contains() are currently in the beta testing stage, :matches-css() is coming).
However, there is one issue that bothers me. The syntax you use (pseudo-classes syntax) is not backward-compatible and it will break good old stylesheet-based ad blockers like Adguard and ABP.
/* browser will ignore the whole style due to the second selector */
#banner, #banner:has(.test) { display: none; }
I suggest introducing a backward-compatible syntax along with the modern pseudo-classes-based one.
Backward compatible synonym for :has(...) will be [-ext-has="..."]
Backward compatible synonym for :matches-css(...) will be [-ext-matches-css="..."]
Backward compatible synonym for :xpath(...) will be [-ext-xpath="..."]
[1] As I understand, there is a backward compatible :matches-css() option already: https://issues.adblockplus.org/ticket/2390
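To illustrate that the two syntaxes carry the same information, here is a minimal sketch that rewrites the attribute-style form into the pseudo-class form. The function name attrToPseudo is hypothetical, and the sketch assumes that inner quotes are backslash-escaped and that values contain no line breaks.

```javascript
// Hypothetical converter: [-ext-NAME="VALUE"] -> :NAME(VALUE).
// Nested attribute-style selectors inside VALUE are converted recursively.
function attrToPseudo(selector) {
  return selector.replace(
    /\[-ext-([a-z-]+)=(["'])((?:\\.|(?!\2).)*)\2\]/g,
    (whole, name, quote, value) => {
      // Undo backslash escapes such as \" before converting the inner part.
      const unescaped = value.replace(/\\(.)/g, "$1");
      return ":" + name + "(" + attrToPseudo(unescaped) + ")";
    }
  );
}

console.log(attrToPseudo('#banner[-ext-has=".test"]'));
// prints "#banner:has(.test)"
```

A blocker that does not know the extension still sees a syntactically valid attribute selector, so the rest of the stylesheet rule is not thrown away.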
You have recently added a couple of pseudo-classes extending element hiding rules syntax. I am talking about :has() ...
FWIW We are working towards adding the :has selectors too https://issues.adblockplus.org/ticket/3143
Anyway, I will just use this issue here to throw ideas once in a while which I think might be good for all blockers[2], especially when it comes to make the life of filter list maintainers easier.
:+1: Please do, I think collaboration benefits us all.
@kzar so, what do you think about the backward compatible syntax proposition?
@kzar regarding Lain's comment:
I think it's worth mentioning that :has() selector must work in combination with -abp-properties. So, filter like site.name##.block:has([-abp-properties="background: yellow"])
Using proposed syntax it could look like this:
##.block[-ext-has="*:matches-css(background: yellow)"]
@ameshkov Well I think the idea is that when browsers eventually support :has selectors those filters will be again using standard CSS selectors anyway. We only need to implement special logic for those filters in the meantime as a stop-gap. I guess it's true (and unfortunate) that the syntax will break filters for ad blockers which haven't added support yet, but I guess that's not too bad since uBlock, AdGuard and Adblock Plus all plan to support them. (Also because they are only planned to be something used as a last resort.)
As for the general point of using backward compatible syntax like you've suggested, I think it's a good idea. (We already do something like that for CSS property filters using the -abp-properties attribute.)
Well I think the idea is that when browsers eventually support :has selectors those filters will be again using standard CSS selectors anyway.
True. However, here is one more argument for that type of syntax. We all support a lot of different browsers (including mobile and such) and trying to use pseudo-classes syntax requires us to do it simultaneously for all the platforms. While backward-compatible syntax allows us to roll this feature out gradually.
As for the general point of using backward compatible syntax like you've suggested, I think it's a good idea. (We already do something like that for CSS property filters using the -abp-properties attribute.)
Yeah, I know, that's why I was surprised by the implementation proposed in issue 3143.
I suggest introducing a backward-compatible syntax along with the modern pseudo-classes-based one.
I will support the backward-compatible syntax where possible, but personally, internally I prefer using the :() syntax. I see these new operators as nodes in a processing graph, so being able to combine them easily and freely is something I see as a requirement for the future. Example[1]:
div.red:has(div.blue:matches-css(position: fixed;):contains(allo)):contains(publicité)
It does feel to me like a backward-compatible syntax would complicate writing such filters (especially the use of quotes):
div.red[-ext-has="div.blue[-ext-matches-css=\"position: fixed;\"][-ext-contains=\"allo\"]"][-ext-contains="publicité"]
Aren't you validating element hiding filters at load time (or else using an invalid CSS selector would break element hiding), so isn't it true that old versions will discard filters with this new syntax? (Element.matches('div:has(span)') would throw).
[1] Ok, the example is contrived, but it's just to illustrate easily combining such filters.
It does feel to me like a backward-compatible syntax would complicate writing such filters (especially the use of quotes):
Yeah, frankly, when I check something, I prefer to use the newer syntax as well.
However, it's not that bad, there's no need to support it inside of a composite filter.
Here, look at this example:
div.red[-ext-has="div.blue:matches-css(position: fixed):contains(allo):contains(publicité)"]
Aren't you validating element hiding filters at load time (or else using an invalid CSS selector would break element hiding), so isn't it true that old versions will discard filters with this new syntax? (Element.matches('div:has(span)') would throw).
Nope, in fact it was all of a sudden for us:) Also there's no way we could do it in desktop and mobile versions.
@gorhill one more thing regarding the :matches-css(). I propose using a bit different syntax for it.
Could you please read this issue description and tell me what you think about it?
https://github.com/AdguardTeam/ExtendedCss/issues/7
Q: Why additional pseudo-classes for matching before and after
I already support selector:after:style-properties(pattern), I just extract the :after before using the selector at setup time. But I would not mind selector:style-properties-before(pattern) -- it would just make the setup code a bit simpler.
Q: Why pattern-matching?
I agree with (optional) pattern matching. Pattern-matching is not something I implemented, but I don't see a problem supporting this. On the implementation side of such a filter, however, I would just want to be sure its semantics do not force a very specific implementation.[1]
I suppose that using this approach we could also cover existing abp-properties rules
Note that ABP's -abp-properties has been implemented with very different semantics in mind than something like :matches-css, namely to reverse-lookup CSS rules. Such filters shouldn't be used directly on a set of nodes for filtering purposes. The purpose of all the filters I have been adding lately is to reduce a set of nodes (starting with one as small as possible), so the suffix part is key: starting with the smallest set of nodes possible is key for performance.
For example, a filter such as wetter.com##[-abp-properties='margin-left: 24px'], given that it has no suffix selector, would have to be tested for all elements on a page, which would just kill performance.
[1] I see using cssText as a potentially high overhead approach, so I went with the dictionary approach, to test only for the enumerated properties. a) I suspect the cssText string is generated on the fly by the browser when "getted"; b) using cssText forces the use of a regex which will apply to a potentially large string.
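A minimal sketch of the dictionary approach described in the footnote above: only the properties named in the filter are looked up, and no cssText string is built or scanned. The function names parseDeclarations and matchesCss are hypothetical; the style argument is assumed to be any object exposing getPropertyValue(name), such as what window.getComputedStyle(element) returns in a browser. Values are compared as plain strings, so the filter would need to be written in the form the browser reports (for example, computed colors are typically reported as rgb(...)).

```javascript
// Parse "prop: value; prop2: value2" into a Map of property -> expected value.
function parseDeclarations(text) {
  const wanted = new Map();
  for (const decl of text.split(";")) {
    const colon = decl.indexOf(":");
    if (colon === -1) {
      continue; // empty chunk, e.g. after a trailing ";"
    }
    const name = decl.slice(0, colon).trim().toLowerCase();
    const value = decl.slice(colon + 1).trim();
    if (name) {
      wanted.set(name, value);
    }
  }
  return wanted;
}

// Test only the enumerated properties against the given style object.
function matchesCss(style, wanted) {
  for (const [name, value] of wanted) {
    if (style.getPropertyValue(name) !== value) {
      return false;
    }
  }
  return true;
}

// Stand-in for a computed style, so the sketch runs outside a browser.
const fakeStyle = {
  getPropertyValue: (name) =>
    ({ position: "fixed", "margin-left": "24px" })[name] || "",
};
console.log(matchesCss(fakeStyle, parseDeclarations("position: fixed;")));
```

Supporting optional pattern matching would mean swapping the strict comparison for a per-property pattern test, still without touching cssText.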
I already support selector:after:style-properties(pattern)
It may look pretty good, but it bothers me that :after in fact can't be part of a valid selector as pseudo-element cannot be selected. I suppose it could mislead a filter author.
[1] I see using cssText as a potentially high overhead approach, so I went with the dictionary approach, to test only for the enumerated properties. a) I suspect the cssText string is generated on the fly by the browser when "getted"; b) using cssText forces the use of a regex which will apply to a potentially large string.
Yep, I've run into a number of issues while implementing it. For now I've used a cross-browser function for extracting the cssText string:
https://github.com/AdguardTeam/ExtendedCss/blob/feature/issues/7/lib/style-property-matcher.js#L96
Also I agree with you on the enumerated properties approach. There's no need to build the cssText field, I will change the current implementation.
For example, a filter such as wetter.com##[-abp-properties='margin-left: 24px'],
Yeah, you're right. Also, now that I know how this type of rule works, I find it a bit misleading. At least I think Lain_13 does not understand how it works.
@kzar what do you think about implementing something more "straightforward"?
I guess if we use the properties approach and agree on *-before/after postfix, there is no need for me to use another name for that pseudo class. matches-css, matches-css-before and matches-css-after sounds good and describes the filter behaviour very well.
matches-css, matches-css-before and matches-css-after sounds good and describes the filter behaviour very well.
I agree with this. This new selector, combinable with :has(), is going to make filter list maintainers' lives easier.
I've updated the syntax description:
https://github.com/AdguardTeam/ExtendedCss/issues/7
Looking into this specific case this morning: https://github.com/uBlockOrigin/uAssets/issues/110.
This would be solvable without exception filters if it was possible to outright remove the targeted nodes from the DOM:
finanzen.net###bodyCenter > div[id]:has(:scope > #Ads_BA_Sky):remove()
The current implicit action to take on targeted nodes is to hide them. However, being able to re-style has made the job of working against anti-blocker mechanisms much easier (AdGuard supports this).
Additionally, being able to remove nodes from the DOM is something I have found would take care of many other cases as well (I do believe AdGuard supports this in some way, not sure). From my point of view, being forced to whitelist network requests from 3rd-party advertisers/trackers is always the worst option, and we should extend the capabilities of cosmetic filtering (element hiding) to avoid such whitelisting.
Oh, you have finally faced these german wunderwaffe-anti-adblock-solutions:)
I was impressed when I saw this particular script for the first time.
Currently the easiest way to circumvent it is to inject a script like this:
Object.defineProperty(window, `UABPtracked`, { get: function() { return true; }, set: function() {} })
Regarding the DOM nodes removal thing, I need some time to think about it.
Currently the easiest way to circumvent it is to inject a script like this
I didn't realize they were using the uabp thing, I already had a scriptlet to take care of these -- it was not injected on that site.
Though in the long term, scriptlets require more work and maintenance, and I would rather use generic cosmetic filter syntax where possible. In the current case, a node removal would work. It would also work for that case (edit: never mind, would not work for this case). Anyway, something to think about.
In the current case, a node removal would work
However, in this particular case node removal is not the best solution. This anti-adblock script is pretty ugly, it sets up a timer and redraws ads every 5 or so seconds. And with nodes removed it continues to do something with DOM.
Talking about anti-adblock scripts, I really do not see a good declarative solution which does not involve scripting.
Let's start with analysis. Most of the things we discuss are directly caused by the websites trying to circumvent ad blocking.
Basically, there are two approaches:
Point 1 can be solved by the new pseudo-elements (at least for now).
Point 2 can be solved by scripts (like reek's AAK for instance).
Btw, reek is the best anti-adblock scripts expert I know, let's ask his opinion.
@gorhill @ameshkov We are discussing WebSocket circumvention on the Adblock Plus issue tracker, but unfortunately we've had to make the issue confidential. (Guess why...) Anyway I'd like to copy you both in on the issue, as mapx pointed out it would be good to get your feedback there too.
Are you guys signed up on our issue tracker? If so what are your usernames?
@gorhill Also a possibly dumb question, doesn't a Content Security Policy like connect-src http:; frame-src http: also prevent https connections?
doesn't a Content Security Policy like connect-src http:; frame-src http: also prevent https connections?
Not according to spec:
The URL matching algorithm now treats insecure schemes and ports as matching their secure variants. That is, the source expression http://example.com:80 will match both http://example.com:80 and https://example.com:443.
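The scheme part of that rule can be sketched as follows. This is a simplified illustration of the behavior quoted above, not the full matching algorithm from the spec: host and port handling are left out, and the function name schemePartMatches is only a label chosen here.

```javascript
// Simplified sketch: does the scheme of a source expression (e.g. "http"
// from "connect-src http:") cover the scheme of the request URL?
function schemePartMatches(expressionScheme, urlScheme) {
  const a = expressionScheme.toLowerCase();
  const b = urlScheme.toLowerCase();
  if (a === b) {
    return true;
  }
  // An insecure scheme in the policy also covers its secure variant.
  if (a === "http" && b === "https") {
    return true;
  }
  if (a === "ws" && b === "wss") {
    return true;
  }
  return false;
}

console.log(schemePartMatches("http", "https"));
// prints true
```

The upgrade only goes one way: a policy that lists https: does not cover plain http requests.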
Guess why...
They will see it anyway:)
Are you guys signed up on our issue tracker? If so what are your usernames?
Just signed up, username is ameshkov
@gorhill Ahh, makes sense.
@ameshkov Cool, added you to the issue.
@gorhill, @ameshkov Heads up, we're going to consider WebSocket requests as the type "websocket" instead of "other" in the future. More details in this blog post: https://adblockplus.org/development-builds/new-filter-type-option-for-websockets
@kzar hey Dave, thanks for the heads up.
@gorhill @kzar
Btw, have you already seen the bleeding edge technology: loading ads code through RTCPeerConnection?
have you already seen the bleeding edge technology
Yes, first time I saw it on Merriam-Webster's site.
Any idea besides wrapping RTCPeerConnection?
So far, no -- aside from giving users the option of disabling WebRTC entirely.
@ameshkov No, I did not realise people already started abusing WebRTC. Man. :-1:
Do you guys have an URL for an example of a website using WebRTC for circumvention that I can take a look at?
Actually would you mind removing that comment here?
Done;)
So far, no -- aside from giving users the option of disabling WebRTC entirely.
Does it really work in Chrome? I thought it is a bit limited.
Do you guys have an URL for an example of a website using WebRTC for circumvention that I can take a look at?
Code example:
https://forum.adguard.com/index.php?threads/block-rtcpeerconnection.13808/#post-102128
I'd rather discuss our WebSocket plans in the issue on our tracker, since it's marked confidential
I understand not discussing ideas of workarounds for our own blocking solutions, but here I don't see the point, the websocket issue came about because it's already used out there.
@ameshkov Thanks!
@gorhill There's a new issue I'd like to involve you with but can't unless you have a user on our issue tracker. Mind creating one?
Is this about the /g00 thing?
Yup
Any idea on how do they detect dev tools?
Yup
Again, in this case I don't see the point of secrecy: the /g00 stuff is currently being used in production, so it's not like we are trying to prevent a workaround against blockers, they are already being worked around. That HTML/CSS/JS code is all open to scrutiny by anybody, so there is no privileged information to protect really.
Any idea on how do they detect dev tools?
Yes: https://www.reddit.com/r/firefox/comments/5gtedd/ublock_origin_developer_raymond_hill_on/dav4iiu/
Thumbs up, nice technique!
Proposed network filter syntax extension: **[...]{...} -- two-asterisk syntax.
The ** sequence tells the filter parser that a regex-valid character class follows -- and optionally a regex-valid quantifier.
The [...] part would be a regex-valid character class specifier.
The {...} is optional. If present, passed as is to the regex constructor, i.e. it would be a regex-valid quantifier. If not present, the "zero or more" * quantifier will be used.
What it solves: better matching accuracy without having to resort to inefficient regex-based filter -- thus no issue with tokenizing the filter. Benefits of both plain filter syntax and regex-based syntax without their liabilities.
Example, an abused exception filter found in EasyList:
@@||nowdownload.*/banner.php?$script,domain=~calcalist.co.il|~gaytube.com|~mako.co.il|~pornhub.com|~redtube.com|~redtube.com.br|~tube8.com|~tube8.es|~tube8.fr|~walla.co.il|~xtube.com|~ynet.co.il|~youjizz.com|~youporn.com|~youporngay.com
With the new syntax, it can't be abused anymore (showing more than one variation to highlight flexibility):
@@||nowdownload.**[a-z]/banner.php?$script
@@||nowdownload.**[a-z]{2}/banner.php?$script
@@||nowdownload.**[a-z]{2,3}/banner.php?$script
@@||nowdownload.**[^/?#]/banner.php?$script
This is a real-world example, and the proposed syntax means there would no longer be a need to maintain an exclusion list of sites where the filter should not apply (pornhub et al. have been abusing it, along with other similar filters).
Efficiency: the more specific a regex, the more efficiently it executes. The * syntax is commonly used and it means "matches anything, any number of times". Letting filter list maintainers describe more accurately what is expected can lead to more efficient regexes internally.
For example:
/site=*/size=*/viewid=
Let's say that the * were supposed to match some random sequence of digits. Of course whoever created the filter was not going to use a regex for such filter -- because they are rightfully frowned upon. On the other hand, the matching-anything-in-any-number semantic of the asterisk means that an inefficient regex must be used internally, one that will scan the whole URL to match. With the proposed syntax:
/site=**[\d]/size=**[\d]/viewid=
This is a much more efficient filter, as the regex execution will bail out of matching as soon as no digit is found at the placeholder locations. This gives the filter list maintainers the ability to be more accurate in describing what the filter should match, without resorting to full blown regex-based filters.
A nice-to-have side effect for filter list maintainers: the ability to specify a sequence of characters with no instances of specific characters in it, i.e. the [^...] regex syntax.
A filter parser would just need a special code path for filters found to contain a double asterisk: extract and validate the sequence **[...]{...} and translate it into the equivalent regex sequence, or, if it does not validate, fall back on the normal single-asterisk semantic for the sequence.
This does not mean all instances of * would need to be replaced, it's merely a new syntax which would become available to filter list maintainers to make their work easier/simpler.
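A rough sketch of the special code path described above, in JavaScript. It only covers the pattern part of a filter (no handling of `@@`, `||`, `^` or `$options`), the function names are made up for illustration, and leaving an invalid character class to the ordinary single-asterisk handling is just one possible reading of the fallback.

```javascript
// Escape regex metacharacters in the literal parts of a filter pattern.
function escapeLiteral(text) {
    return text.replace(/[.*+?^${}()|[\]\\\/]/g, '\\$&');
}

// Plain-syntax part: every run of asterisks means "anything, any number of times".
function plainToRegexSource(text) {
    return text.split(/\*+/).map(escapeLiteral).join('.*');
}

// `**`, then a character class, then an optional {n}, {n,} or {n,m} quantifier.
const reClassPlaceholder = /\*\*(\[\^?(?:\\.|[^\]\\])+\])(\{\d+(?:,\d*)?\})?/g;

// Hypothetical helper: turn a filter pattern into regex source text.
function patternToRegexSource(pattern) {
    let source = '';
    let last = 0;
    for (const match of pattern.matchAll(reClassPlaceholder)) {
        const charClass = match[1];
        const quantifier = match[2] || '*'; // default: "zero or more"
        try {
            new RegExp(charClass + quantifier); // validate class + quantifier
        } catch (e) {
            continue; // invalid: leave it to the plain single-asterisk path
        }
        source += plainToRegexSource(pattern.slice(last, match.index));
        source += charClass + quantifier;
        last = match.index + match[0].length;
    }
    return source + plainToRegexSource(pattern.slice(last));
}
```

With this sketch, `patternToRegexSource('/site=**[\\d]/size=**[\\d]/viewid=')` yields a regex that stops matching as soon as a non-digit shows up at a placeholder position, instead of scanning the rest of the URL.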
Sorry for the late reply, I've just got a free minute to think about it in silence:)
@gorhill you know, it looks very much like your token suggestion.
Comparing with the regular regexp rules:
As I see it, the problem is that filter maintainers generally don't care about performance. One more regex rule would do no harm, so they will continue to use them.
Frankly, I suppose we should make engines smarter instead of providing more and more syntax sugar to maintainers.
Examples:
||nowdownload.*/ - we can detect that nowdownload.* is the domain name, so we can compile a more effective regexp.
Regarding point 1, we did implement it in the latest version. I guess we need some time to see how it goes and whether there are any problems with token extraction.
We're supporting the $webrtc filter option / request type now as well, here's the blog post and here's the implementation. We also do this by wrapping RTCPeerConnection but our implementations have some differences, most notably that instances without any URLs are not listed. Sorry I meant to post here earlier but forgot!
Edit: Oh, and I also filed Chromium issue 707683 asking for proper support to be added for the blocking of WebRTC connections.
@kzar Thanks for the heads up. I do follow the issue tracker of ABP, so I was aware of this. I have been thinking about how I would implement this in uBO, but I will definitely avoid using a wrapper in uBO on my side, at least an unconditional one -- I consider this too risky, and in the event it causes an unforeseen issue, a user would be forced to disable the whole extension itself.
There is such a wrapper for uBO-Extra, and issues have been reported, see https://github.com/gorhill/uBO-Extra/issues -- you may want to use these as test cases. Having to disable a small companion extension is much less bad than having to disable the whole blocker.
Regarding https://bugs.chromium.org/p/chromium/issues/detail?id=707683, another approach is to have a content security policy directive for WebRTC connections; currently there is nothing to prevent these, a rather big hole in the CSP standard. See https://github.com/w3c/webappsec-csp/issues/92.
On the other hand, there's not much that can be done about WebRTC circumvention. Either wrap it or use the script injection approach, which ABP does not support.
but I will definitely avoid using a wrapper in uBO on my side, at least an unconditional one
Actually, there is a way to override WS/WebRTC "conditionally". We've found a way to execute dynamic script injections before the pages' code. Not a 100% guarantee yet, though, but the first tests show that it works.
Wait a bit, I'll link you an example.
The "fastest" way to perform a dynamic script injection is to use the onCommitted listener. Once it fires, send a message with the scripts directly to the frame and handle it in the content script. It appears, that the message is received by the content script before the page scripts are executed.
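A minimal sketch of the background-page side of this idea, in JavaScript. The `api` argument stands for the extension API object (`chrome` or `browser`) and is passed in only so the function can be exercised outside a browser; `getScriptsForUrl` and the `inject-scripts` message type are hypothetical names, not taken from any of the extensions discussed here.

```javascript
// Background page: as soon as a navigation is committed, push the scripts
// meant for that URL to the content script of the committed frame.
function installEarlyInjector(api, getScriptsForUrl) {
    api.webNavigation.onCommitted.addListener(details => {
        const scripts = getScriptsForUrl(details.url);
        if (scripts.length === 0) {
            return; // nothing to inject for this page
        }
        api.tabs.sendMessage(
            details.tabId,
            { type: 'inject-scripts', scripts },
            { frameId: details.frameId } // target only the committed frame
        );
    });
}

// Content script side (not executed here): listen for the message and add
// each script to the page, e.g.
//   chrome.runtime.onMessage.addListener(message => {
//       if (message.type !== 'inject-scripts') { return; }
//       for (const code of message.scripts) {
//           const el = document.createElement('script');
//           el.textContent = code;
//           (document.head || document.documentElement).appendChild(el);
//           el.remove();
//       }
//   });
```

Nothing in this sketch guarantees that the message arrives before the page's own scripts run; that ordering is the observed behavior described above, not something the code enforces.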
Problems:
@ameshkov That is an interesting idea. On my side I was planning to maybe use a cookie to conditionally execute such content script code:
Not pretty, and it potentially has its own edge cases -- aside from the added overhead.
The problem with executeScript and insertCSS, is that they are not well defined, and main chrome API statement...
Unless the doc says otherwise, methods in the chrome.* APIs are asynchronous: they return immediately, without waiting for the operation to finish
... really gets in the way of reliability for the current case.
This suggests that tabs.executeScript (and also tabs.insertCSS) cannot be reliably injected in a tab/frame. The fact that these methods accept a run_at option makes all this the more ambiguous -- what is the point of using document_start if there is no guarantee that the script or CSS will be injected before any CSS or script has been executed on the page?
From the documentation, one can even imagine that a script/CSS injected through these methods could potentially end up being injected in a completely different tab/frame than originally intended (what happens when there is a quick redirect?). I consider this a flaw in the API.
If the documentation could explicitly guarantee that your approach will always work (i.e. the script/CSS is injected in a synchronous manner and onCommitted can be blocking), that would be great -- it actually makes sense given when webNavigation.onCommitted is fired:
Fired when a navigation is committed. The document (and the resources it refers to, such as images and subframes) might still be downloading, but at least part of the document has been received
But then again, the way this is phrased, it's as if onCommitted could be fired after things are farther than just right after the document was created ("might").
@gorhill nice catch with a cookie, I like it! What's good is that this is a true cross-browser solution (taking into account the MS Edge limitations).
The problem with executeScript and insertCSS, is that they are not well defined, and main chrome API statement...
As I understand, tabs.executeScript is out of the question anyway, as it executes a content script, not an in-page script: https://developer.mozilla.org/en-US/Add-ons/WebExtensions/API/tabs/executeScript
edit: ignore it, obviously the content script can inject the in-page wrapper.
If the documentation could explicitly guarantee that your approach will always work (i.e. the script/CSS is injected in a synchronous manner and onCommitted can be blocking)
I don't think it can, we'll have to keep an eye on its behavior, and that's the problem indeed.
@gorhill @kzar
Whichever approach is used, I suppose we need a common rule syntax for disabling this kind of wrapper.
I suggest introducing these types of rules:
@@*$websocket,domain=example.org -- to disable websocket wrapper
@@*$webrtc,domain=example.org -- to disable RTC wrapper
Thoughts?
The idea of conditionally running content scripts is certainly interesting, and not something I considered. Copying in @snoack since he might be interested in your ideas there too.
I think in the case of these wrappers however we'll stick with executing them consistently. IMO it's better to "just" get the wrapper right in the first place, and multiple code paths make debugging harder. It's good for you guys I suppose, after the next Adblock Plus release lands we'll find out the hard way if there are any problems, if not you can use the code too :stuck_out_tongue:.
Guys, I've just stumbled upon a new circumvention practice.
I guess @kzar is aware of it, not sure about @gorhill:
if (window.document) {
if (window.adonisContext)
return window.adonisContext;
var e, n = document.createElement("iframe");
return n.src = "https://nop.xpanama.net/if.html?adflag=1&cb=" + i(),
n.setAttribute("style", "display: none;"),
document.body.appendChild(n),
e = n.contentWindow,
n.contentWindow.stop(),
window.adonisContext = e,
e
}
return window
It seems that n.contentWindow.stop(), prevents the content script from doing its job. Also, with a real URL in the src they are able to bypass CSP restrictions (http:). I guess the straightforward solution would be to override document.createElement and document.__proto__.createElement
See https://github.com/uBlockOrigin/uAssets/issues/190#issuecomment-300897354.
My understanding: they can call n.contentWindow.stop() because at that point the iframe is about:blank (this is what chrome://webrtc-internals/ shows), which is treated in a special way by the CSP engine (it always inherits the embedding document's CSP). I realized yesterday it was not obvious how to disallow about:blank specifically without disallowing too much. However, in the end, frame-src http://*/* https://*/* worked (because there is no path in about:blank).
I'm starting to think we need some kind of $csp modifier able to set a custom content security policy for a URL.
For instance:
||example.org^$csp="frame-src self"
It's crucial, though, that it should only strengthen the existing policy. That is easy to achieve if we add an additional CSP header.
Thoughts?
For instance:
||example.org^$csp="frame-src self"
This is pretty much what I implemented on my side (csp=..., not committed yet), except without the quotes (no need). This allows injecting whatever CSP policy we want, while never ever relaxing existing ones (when appended using ,).
This is pretty much what I implemented on my side (not committed yet), except without the quotes (no need)
We'd like to support it on our side as well, let's have a common syntax for this kind of rules then.
Why no quotes btw? It'll make it much harder to parse a rule with more than one modifier. For instance, subdocument and third-party might be used along with this one (in theory, but anyway).
Ok, it seems that I am wrong, CSP instructions cannot contain a comma, so it should be relatively easy to parse something like ||example.org^$csp=policy,subdocument,domain=example.com
Are you planning to support additional modifiers in this type of rules?
CSP instructions cannot contain a comma
Yes, that's the reason I chose to leave out quotes. The comma can be used to _combine_ CSP policies, but I'd rather never have them used in filter options -- fits nicely with existing use of comma to separate filter options.
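To illustrate why the quotes are not needed, here is a small JavaScript sketch of option parsing that simply splits on commas, relying on the point above that a csp= value never contains one. `parseFilterOptions` is a hypothetical helper, not code from uBO or Adguard.

```javascript
// Hypothetical parser for the part of a filter that follows `$`.
// Options are comma-separated; an option may carry a value after `=`.
function parseFilterOptions(optionsText) {
    const options = {};
    for (const part of optionsText.split(',')) {
        const eq = part.indexOf('=');
        if (eq === -1) {
            options[part] = true; // flag option, e.g. `subdocument` or a bare `csp`
        } else {
            options[part.slice(0, eq)] = part.slice(eq + 1); // e.g. `csp=...`, `domain=...`
        }
    }
    return options;
}

const exampleFilter = "||example.org^$csp=frame-src 'self',subdocument,domain=example.com";
const exampleOptions = parseFilterOptions(
    exampleFilter.slice(exampleFilter.lastIndexOf('$') + 1)
);
```

Here `exampleOptions.csp` holds the whole policy text including its spaces and single quotes, while `subdocument` and `domain` are picked up as ordinary options.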
The nice thing with comma to separate distinct sets of policies when combined with existing CSP policies is that it won't cause spurious CSP reports, each comma-separated set gets its own report policy.
Currently all the following modifiers are supported when used with csp=: third-party, domain=, important, badfilter.
Additionally, exception filters for csp= can be crafted two ways:
csp= match, i.e. @@||example.com/nice$csp=frame-src 'none' will cancel _only_ whatever filter tries to inject _exactly_ a csp=frame-src 'none' filter, but not a csp=frame-src 'self' filter; OR
@@...$csp will cancel all CSP injection for URLs which match the filter.
All this required refactoring on my side, as the semantic for csp= filters is that _all_ matching filters must be found (and furthermore applied according to important and @@), while for normal filters only the first hit is returned.
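A simplified JavaScript sketch of the two exception forms described above. It assumes the filters matching the URL have already been collected into plain objects with hypothetical `csp` and `exception` fields (an empty `csp` string standing for a bare `$csp` exception), and it deliberately leaves out the `important` handling.

```javascript
// Given every csp= filter that matched a URL, return the policies to inject.
function resolveCspPolicies(matchingFilters) {
    const exceptions = matchingFilters.filter(f => f.exception);
    // A bare `@@...$csp` exception cancels all CSP injection for the URL.
    if (exceptions.some(f => f.csp === '')) {
        return [];
    }
    // An `@@...$csp=<policy>` exception cancels only that exact policy.
    const cancelled = new Set(exceptions.map(f => f.csp));
    const policies = [];
    for (const f of matchingFilters) {
        if (f.exception || cancelled.has(f.csp) || policies.includes(f.csp)) {
            continue;
        }
        policies.push(f.csp);
    }
    return policies;
}
```

Unlike ordinary blocking, where the first hit settles the outcome, the function has to see the full list of matches before it can decide anything.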
The nice thing with comma to separate distinct sets of policies when combined with existing CSP policies is that it won't cause spurious CSP reports, each comma-separated set gets its own report policy.
Isn't it the same in the case of an additional CSP header?
@gorhill overall, I love the idea and the features you want it to have.
I've come up with a formal description of the csp modifier:
https://github.com/AdguardTeam/AdguardBrowserExtension/issues/685
Could you please check whether we are on the same page?
Isn't it the same in the case of an additional CSP header?
I guess it is the same, however I prefer to append to existing CSP header because of this passage in documentation:
A server MUST NOT send more than one HTTP header field named Content-Security-Policy with a given resource representation.
A server MAY send different Content-Security-Policy header field values with different representations of the same resource or with different resources.
I have a bit of a problem parsing the meaning of this, so I went with what I thought was the safest approach, which is to append and use comma as separator.
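The append-with-comma approach could look roughly like this in JavaScript, operating on a header list shaped like the `responseHeaders` array of `{ name, value }` objects that `webRequest.onHeadersReceived` provides. `appendCsp` is a hypothetical helper written for illustration, not the actual uBO code.

```javascript
// Return a copy of the response headers with the extra policies appended
// to the existing Content-Security-Policy header, separated by commas.
// If the response has no CSP header, one is added.
function appendCsp(responseHeaders, policies) {
    if (policies.length === 0) {
        return responseHeaders;
    }
    const extra = policies.join(', ');
    const headers = responseHeaders.map(h => ({ ...h }));
    const existing = headers.find(
        h => h.name.toLowerCase() === 'content-security-policy'
    );
    if (existing) {
        // Comma-separated policies are enforced independently, so the
        // site's own policy is never relaxed by what is appended.
        existing.value = existing.value ? existing.value + ', ' + extra : extra;
    } else {
        headers.push({ name: 'Content-Security-Policy', value: extra });
    }
    return headers;
}
```

If a site already sends several CSP headers, this sketch appends to the first one only; since every policy is enforced on its own, that is still enough to add the restriction.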
The $csp filter option is an interesting idea, I've opened issue 5241 to start a discussion with Wladimir and Sebastian about adding it to Adblock Plus as well.
@kzar 👍
I guess it is the same, however I prefer to append to existing CSP header because of this passage in documentation:
Huh, I wonder what this resource representation thing means. For instance, dropbox.com sends two CSP headers:
content-security-policy: script-src 'unsafe-eval' https://www.dropbox.com/static/compiled/js/ https://www.dropbox.com/static/javascript/ https://www.dropbox.com/static/api/ https://cfl.dropboxstatic.com/static/compiled/js/ https://www.dropboxstatic.com/static/compiled/js/ https://cfl.dropboxstatic.com/static/previews/ https://www.dropboxstatic.com/static/previews/ https://cfl.dropboxstatic.com/static/javascript/ https://www.dropboxstatic.com/static/javascript/ https://cfl.dropboxstatic.com/static/api/ https://www.dropboxstatic.com/static/api/ 'unsafe-inline' 'nonce-bKDizX/Kbtm0495WF9jC' ; default-src 'none' ; worker-src https://www.dropbox.com/static/serviceworker/ blob: ; style-src https://* 'unsafe-inline' 'unsafe-eval' ; connect-src https://* ws://127.0.0.1:*/ws ; child-src https://www.dropbox.com/static/serviceworker/ blob: ; form-action 'self' https://dl-web.dropbox.com/ https://photos.dropbox.com/ https://accounts.google.com/ https://api.login.yahoo.com/ https://login.yahoo.com/ ; base-uri 'self' api-stream.dropbox.com showbox-tr.dropbox.com ; img-src https://* data: blob: ; frame-src https://* carousel://* dbapi-6://* dbapi-7://* dbapi-8://* itms-apps://* itms-appss://* ; object-src https://cfl.dropboxstatic.com/static/ https://www.dropboxstatic.com/static/ 'self' https://flash.dropboxstatic.com https://swf.dropboxstatic.com https://dbxlocal.dropboxstatic.com ; media-src https://* blob: ; font-src https://* data:
content-security-policy: script-src 'nonce-bKDizX/Kbtm0495WF9jC' 'nonce-wTJ0bGY/hQQlxU0EVOpm' 'unsafe-eval' 'strict-dynamic'
edit: asked a question: https://github.com/w3c/webappsec-csp/issues/215
So, after all:
@mapx-, @kzar
About https://issues.adblockplus.org/ticket/5291, what I don't see being done in the comment is:
Yes, I have seen ABP grow to past 340 MB, and even leaving the browser on idle did not cause the memory to be garbage-collected. However, after accomplishing the above steps, there was memory being garbage-collected, and the memory snapped back to expected levels (keeping in mind fragmentation, js engine internals, etc.).
Any report of high memory usage should always be done _after_ the steps above, including the equivalent ones on Firefox.
Thanks that's a good tip, I will give it a try. It's just weird (to me at least) that garbage collection is not happening automatically. We can hardly expect users to trigger it manually :(
Why does the chrome store say it is corrupted now ? :(
@hemantgoyal That is a Chrome hash function bug. See https://github.com/gorhill/uBlock/issues/2720 (already fixed with a bit of padding).
Hey guys, I've recently stumbled upon an interesting adblocking circumvention technique used by Yandex.
The thing is that they use shadow DOM to circumvent element hiding rules:
https://uploads.adguard.com/up04_5pkb0_Yandeks.png
The open root is used in the screenshot so we can get inside with a /deep/, and when a closed root is used, /deep/ cannot help us. Anyway, all the shadow piercing selectors are deprecated and will be eventually removed so we have a problem here.
Possible solution (rather ugly though):
::shadow pseudo-selector to pierce inside open roots.
attachShadow and force all shadow roots to be open.
Have you already faced anything like that? Thoughts?
@ameshkov
I added you here: https://issues.adblockplus.org/ticket/5318
and
https://issues.adblockplus.org/ticket/5302
perhaps it's about the same yandex stuff
@mapx- thank you! Yeah, it's been a while since they began their crusade and both issues are relevant.
What bothers me is that the "closed shadow root" approach seems to be a universal way to avoid element hiding, and even user stylesheets won't help us defeat it once Chrome stops supporting /deep/ and ::shadow.
The /deep/ issue has been raised before. It's being deprecated as a valid CSS selector component in a CSS rule, but will still be valid as a CSS selector in querySelector[All] call. My understanding.
So currently, not an issue with Firefox I presume (does not support shadow stuff yet). An issue with Chrome, but can be worked around by manually hiding through querySelectorAll.
@gorhill
So currently, not an issue with Firefox I presume (does not support shadow stuff yet). An issue with Chrome, but can be worked around by manually hiding through querySelectorAll.
That's basically what I meant -- support either shadow or /deep/ "polyfills" just like we do with :has.
Good news is that /deep/ is able to pierce inside closed shadow roots, so the second point in my comment is redundant.
just like we do with :has.
I just realized we can probably already just use :has for filter with /deep/?
example.com##div.container:has(/deep/ .aq)
Would this work now?
I just realized we can probably already just use :has for filter with /deep/?
We don't yet support it (but we definitely will), but this is a partial solution anyway.
For instance, in the Yandex case their shadow root contains legit elements as well, so we need something like example.org##div /deep/ span:has(.banner)
@ameshkov
The "fastest" way to perform a dynamic script injection is to use the onCommitted listener
I'm experimenting with this and this works fine so far on both Chromium and Firefox. I see a 10-20 ms gain in how early the scriptlets are injected (using tabs.executeScript), though when I measure with the number of scripts already handled (document.scripts.length), I can't see any gain so far for the little I have tested.
Anyway, I want to ask why did you choose to go through messaging to inject the scriptlets rather than injecting directly using tabs.executeScript?
@gorhill
Anyway, I want to ask why did you choose to go through messaging to inject the scriptlets rather than injecting directly using tabs.executeScript?
As I recall, we compared both and didn't see any serious difference. Actually, in future updates, we'll migrate to tabs.insertCSS in light of the coming user-agent styles priority improvement.
@ameshkov regarding Shadow DOM, we have user style sheets in Chromium now (works on Canary) along with the cssOrigin option to tabs.insertCSS. I'm also working on allowing extensions to access closed shadow roots. We should be in a good place soon.
Hi @mjethani! I'm actually keeping an eye on all the pull requests you're pushing to Chromium, and you're doing a great job, thank you!
Hey guys, coming at you with a new modifier idea:
https://github.com/AdguardTeam/AdguardBrowserExtension/issues/961
I suppose it can benefit all the privacy-oriented subscriptions so we're planning to implement it in one of the future updates.
@ameshkov I don't see that much privacy value in dealing with cookies alone given that data can be stored in other local storage such as localStorage, indexedDB, etc.
Blocking 3rd-party cookies in browser settings should be the first step for any privacy conscious person -- this also takes care of all local storages.
Extending this modifier to handle localStorage sounds useful indeed, and it can be done without changing the modifier syntax.
However, I've never seen indexedDB used for tracking purposes.