User.js: Request Control Filters

Created on 21 Jun 2017 · 42 Comments · Source: arkenfox/user.js

This is Request Control | GitHub
This is our Request Control Filters Wiki

Post your Request Control Filter-fu here

Note: Examples below are for discussion / testing - use/test at your leisure, post feedback. As really cool filters emerge, we will put them in the wiki

Most helpful comment

example to skip youtu.be:
pattern: scheme: http/https host: youtu.be path: *
types: Document
action: Redirect
redirect to: https://www.youtube.com/watch?v={pathname:1}

example url: https://youtu.be/nqbUkThGlCo

Request Control supports direct access to named parameters
The named parameter pathname in the example link is /nqbUkThGlCo and {pathname:1} strips the slash using substring extraction
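To make the substring extraction concrete, here is a minimal Python sketch of what the rule above does to the example URL. It only emulates the rule and is not Request Control's own code; the function name `expand_youtu_be` is made up for illustration.

```python
from urllib.parse import urlsplit


def expand_youtu_be(url: str) -> str:
    """Emulate: redirect to https://www.youtube.com/watch?v={pathname:1}."""
    parts = urlsplit(url)
    if parts.hostname != "youtu.be":
        return url  # the rule only matches host youtu.be
    video_id = parts.path[1:]  # {pathname:1}: pathname without its leading "/"
    return "https://www.youtube.com/watch?v=" + video_id


print(expand_youtu_be("https://youtu.be/nqbUkThGlCo"))
# https://www.youtube.com/watch?v=nqbUkThGlCo
```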

All 42 comments

NOTE: ga_ stuff might need XHR type also
Pure URL extension replacement:

Pattern: Any URL
Types: Document, Embedded document (not sure for second one)
Action: Filter
Filter URL: Off
Trim URL Parameters: ref_, utm_source, utm_medium, utm_term, utm_content, utm_campaign, utm_reader, utm_place, ga_source, ga_medium, ga_term, ga_content, ga_campaign, ga_place, yclid, _openstat, fb_action_ids, fb_action_types, fb_ref, fb_source, action_object_map, action_type_map, action_ref_map, ws_ab_test, btsid, algo_expid, algo_pvid, sid, utm_name, utm_cid, utm_reader, utm_viz_id, utm_pubreferrer, utm_swu, icid, _hsenc, _hsmi, mkt_tok, sr_share, vero_conv, vero_id, nr_email_referer, ncid

Samples:
http://bigpicture.ru/?p=431513&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+bigpictures+%28%D0%9D%D0%9E%D0%92%D0%9E%D0%A1%D0%A2%D0%98+%D0%92+%D0%A4%D0%9E%D0%A2%D0%9E%D0%93%D0%A0%D0%90%D0%A4%D0%98%D0%AF%D0%A5%29

UPDATE:
Added @Atavic suggested "Trim URL Parameters"
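For anyone who wants to check what such a trim list would do to a URL before adding it, here is a rough Python sketch. It assumes parameters are dropped on an exact name match, which may not be how Request Control matches them, and it uses only a small subset of the list above. `trim_params` is a hypothetical helper, not part of any addon.

```python
from urllib.parse import urlsplit, urlunsplit

# Small subset of the trim list above, for illustration only.
TRIM = {"utm_source", "utm_medium", "utm_campaign", "ref_", "yclid"}


def trim_params(url: str, names=TRIM) -> str:
    """Drop query parameters whose name is in `names` (exact match assumed)."""
    parts = urlsplit(url)
    kept = [
        pair
        for pair in parts.query.split("&")
        if pair and pair.split("=", 1)[0] not in names
    ]
    # Pairs are kept as raw text, so the surviving values are not re-encoded.
    return urlunsplit(parts._replace(query="&".join(kept)))


print(trim_params("http://bigpicture.ru/?p=431513&utm_source=feedburner&utm_medium=feed"))
# http://bigpicture.ru/?p=431513
```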

Redirect to URL without REF tracking:

Pattern: Any URL
Types: Document, Embedded document (not sure for second one)
Action: Redirect
Redirect To: {href/(.*\/)ref=.*/$1} (updated; the earlier attempt was {href/(.?)ref=./$1})

Samples:
https://www.amazon.com/AmazonBasics-Type-C-USB-Male-Cable/dp/B01GGKYQ02/ref=sr_1_1?s=amazonbasics&srs=10112675011&ie=UTF8&qid=1489067885&sr=8-1&keywords=usb-c
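A quick way to see what the (.*\/)ref=.* pattern does is to run the same regex outside the addon. The Python sketch below assumes the {href/regex/replacement} syntax behaves like a plain regex substitution on the full URL; `strip_ref` is a made-up name.

```python
import re

# Same pattern as in the rule: keep everything up to the "/" before "ref=".
REF_RULE = re.compile(r"(.*/)ref=.*")


def strip_ref(href: str) -> str:
    """Replace the matched part of href with capture group 1."""
    return REF_RULE.sub(r"\1", href, count=1)


amazon = (
    "https://www.amazon.com/AmazonBasics-Type-C-USB-Male-Cable/dp/B01GGKYQ02/"
    "ref=sr_1_1?s=amazonbasics&sr=8-1&keywords=usb-c"
)
print(strip_ref(amazon))
# https://www.amazon.com/AmazonBasics-Type-C-USB-Male-Cable/dp/B01GGKYQ02/

# "href=" is not preceded by "/", so this URL is left alone.
print(strip_ref("http://foo.bar/redirect?href=bar.foo/index.html"))
```

Note that everything after /ref= is dropped, including parameters a page may actually need.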

Remove the crap and possible tracking over URL manipulation after images and CSS:

Pattern scheme: http/https
Pattern host: *
Pattern path: *.jpg?*, *.gif?*, *.png?*, *.svg?*, *.css?*
Types: Document, Embedded document (not sure for second one), Stylesheet, Image
Action: Filter
Filter URL: Off
Trim URL Parameters: Trim all

UPDATE:
Added Pattern path: *.css?*
Stylesheet breaks this page https://technet.microsoft.com/en-us/library/2009.07.cableguy.aspx

Added Types: Document, Embedded document

Samples:
http://www.24ur.com/

Sample that doesn't work, but it should... don't know why, yet: UPDATE: It works now.
https://thechive.files.wordpress.com/2017/06/dd0523324267789bb8beecc5d7914970.jpg?quality=85&strip=info&w=600

BRAINING: maybe there is same method to be added also for Font and Media types... will see.
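The effect of "Trim all" on image and CSS URLs can be approximated like this. It is a sketch under assumptions: the suffix check is case sensitive, and a non-empty query string stands in for the `?*` part of the path patterns. `trim_all_params` is a hypothetical name.

```python
from urllib.parse import urlsplit, urlunsplit

STATIC_SUFFIXES = (".jpg", ".gif", ".png", ".svg", ".css")


def trim_all_params(url: str) -> str:
    """Drop the whole query string from image/CSS URLs, leave others alone."""
    parts = urlsplit(url)
    if parts.query and parts.path.endswith(STATIC_SUFFIXES):
        return urlunsplit(parts._replace(query=""))
    return url


print(trim_all_params(
    "https://thechive.files.wordpress.com/2017/06/"
    "dd0523324267789bb8beecc5d7914970.jpg?quality=85&strip=info&w=600"
))
# https://thechive.files.wordpress.com/2017/06/dd0523324267789bb8beecc5d7914970.jpg
```

Sites that resize or version assets through the query string will break with a rule like this.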

UPDATE: Use "Skip Redirect" from AMO... it's way better.
Facebook redirect without visiting facebook:

Pattern scheme: http/https
Pattern host: l.facebook.com
Pattern path: *u=*
Types: Document, Embedded document (not sure for second one)
Action: Filter
Filter URL: On

Sample:
https://l.facebook.com/l.php?u=https%3A%2F%2Fwww.fsf.org%2Fcampaigns%2F&h=ATP1kf98S0FxqErjoW8VmdSllIp4veuH2_m1jl69sEEeLzUXbkNXrVnzRMp65r5vf21LJGTgJwR2b66m97zYJoXx951n-pr4ruS1osMvT2c9ITsplpPU37RlSqJsSgba&s=1
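Roughly, skipping the redirect means pulling the embedded target out of the URL and going there directly. The Python sketch below is a guess at that behaviour, not Request Control's actual algorithm: it takes the first embedded http(s) URL, cuts it at the next "&", and URL-decodes it once. The `h` value in the example is shortened.

```python
import re
from urllib.parse import unquote

# An embedded URL starts with "http:" / "https:", possibly with an encoded colon.
EMBEDDED = re.compile(r"https?(?::|%3A)", re.IGNORECASE)


def skip_redirect(url: str) -> str:
    """Return the first URL embedded in `url`, decoded once (assumed behaviour)."""
    match = EMBEDDED.search(url, 1)  # start at index 1 to skip the outer scheme
    if match is None:
        return url
    target = url[match.start():].split("&", 1)[0]  # drop trailing parameters
    return unquote(target)


print(skip_redirect(
    "https://l.facebook.com/l.php?u=https%3A%2F%2Fwww.fsf.org%2Fcampaigns%2F&h=abc&s=1"
))
# https://www.fsf.org/campaigns/
```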

Funny... I have to look into it too.
Anyway, I have updated for CSS too... see change up.
One site is particularly full of this crap to test... see updated post.

UPDATE: import/export is done.
The author is working now on import/export. ;)

@Thorin-Oakenpants - if you wish to clean the thread, you can delete my last 3 posts (including this one) and your "didn't work" too, since it's resolved.

Cheers

Here are some of Google's Urchin Tracking Module parameters that aren't listed in the Trim URL Parameters list:

utm_name
utm_cid
utm_reader
utm_viz_id
utm_pubreferrer
utm_swu

Also, url-tracking-stripper has a few more for other trackers:

ICID
icid
_hsenc
_hsmi
mkt_tok
sr_share
vero_conv
vero_id
nr_email_referer
ncid

Thanks @crssi !! :+1:

a couple nits if you don't mind...

  • the ga_ stuff probably needs to include XHR
  • {href/(.?)ref=./$1} doesn't seem right. I'm pretty sure this always results in a single h (from "http/https"), right? (.?)
    and wouldn't it (if fixed) break something like http://foo.bar/redirect?href=bar.foo/index.html

to give you an example, this is what I came up with for a bing-redirect:
{search/^\?.*&?q=(.+?)(&.*|#.*|$)/https://www.ixquick.eu/do/dsearch?query=$1&cat=web&pl=opensearch&language=english} - it covers everything I could think of, namely:

and even then, I'm sure I missed something - regexes are almost impossible to get right

another one from my small collection:

  • Pattern: Any URL, Types: CSP report, Action: Block

You are right. Need to rethink the REF one, it was fishy to me also and I don't like it at all.... will put a "Under construction" note on that post.

The ga_ stuff, should we separate into another filter extended with XHR or just enable the XHR on the current one. I will enable here now and see if I will get any breakage... will also put a note in that post.

I have another "nasty" one, which could replace the "Skip Redirect" extension, but it is also fishy... for now I am using Redirector, since I can use regex (not that I like it, but I can't see a better way).

I can place the nasty one in another post together with the Redirector approach for your critics?

Have you installed and looked at Redirector yet?

nope

Does Request Control offer anything that Redirector doesn't or vice versa?

IDK, you tell me :) I'll try it someday. My way too complex bing regex is probably not necessary. It looks like the addon strips additional parameters automatically. I'll try to come up with a simpler solution

I already edited my post dude - chillax mate :)
I'll probably end up using both

Redirector has some problems, when clicking URL from external program it doesn't always work as intended.
Another thingy is that Redirector parses the input with only one filter (just my observation) and doesn't pass the result through the filters again until nothing is left to filter (it happens that you want to filter out more than one thing).
Otherwise it has also exclusions which is great and better string manipulation over regex.
Both are very good.

@Thorin-Oakenpants do you have any sample for ga_ stuff, so I can dig in?
After tomorrow I will be offline for about 5 days.

Updated RedirectTo

does not cover @Thorin-Oakenpants sample: http://foo.bar/redirect?href=bar.foo/index.html

UPDATE: Use "Skip Redirect" from AMO... it's way better.
Now a nasty one... trying to mimic part of "Skip redirect"
NOTE: this one can and will make some login breakage... for tests and brainstorming only
Does not interfere with any of the rules posted on this page.

It consists of 2 filter rules.
First one does the skipping and the second one whitelists login pages to avoid breakage.

Pattern: Any URL
Types: Document
Action: Filter
Filter URL: On

Pattern scheme: http/https
Pattern host: *
Pattern path: *signin*, *signout*, *login*, *logout*, *logon*, *logoff*, *auth*, *account*, *eBayISAPI.dll*, *ServiceLogin*, *ServiceLogout*, *AccountChooser* (maybe you can add also *option*, but I don't like it)
Types: Document
Action: Whitelist

Samples:
Working redirect
https://outgoing.prod.mozaws.net/v1/b928e4237edbbdd2646a3971d2e6b514aee033c10f3f4c49415bf93096405f38/http%3A//www.google.com/chrome/%3Fi-would-rather-use-firefox=http%253A%252F%252Fwww.mozilla.org/

www.google.com/chrome/?i-would-rather-use-firefox=http%3A%2F%2Fwww.mozilla.org/

Working login pages
https://signin.ebay.de/ws/eBayISAPI.dll?SignIn&UsingSSL=1&pUserId=&co_partnerId=2&siteid=77&ru=http%3A%2F%2Fmy.ebay.de%2Fws%2FeBayISAPI.dll%3FMyEbayBeta%26MyEbay%3D%26gbh%3D1%26guest%3D1&pageType=3984

https://www.amazon.com/ap/signin?_encoding=UTF8&openid.assoc_handle=usflex&openid.claimed_id=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.identity=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.mode=checkid_setup&openid.ns=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0&openid.ns.pape=http%3A%2F%2Fspecs.openid.net%2Fextensions%2Fpape%2F1.0&openid.pape.max_auth_age=0&openid.return_to=https%3A%2F%2Fwww.amazon.com%2Fgp%2Fyourstore%2Fcard%3Fie%3DUTF8%26ref_%3Dcust_rec_intestitial_signin

https://manage.autodesk.com/service/authentication/login?returnUrl=https%3A%2F%2Fmanage.autodesk.com%2Fcep%2F

https://account.xiaomi.com/pass/serviceLogin?callback=http%3A%2F%2Fglobal.mi.com%2Fen%2Flogin%2Fcallback%3Ffollowup%3Dhttp%253A%252F%252Fwww.mi.com%252Fen%252F%26sign%3DZDk4YmY0MTRkMThmODExYTE0MDljYWRmYzczZmNjOGZjNDAzNzU0Mg%2C%2C&sid=mi_overseaen&_locale=en

https://login.live.com/login.srf?wa=wsignin1.0&rpsnv=13&ct=1495461484&rver=6.7.6643.0&wp=MBI_SSL_SHARED&wreply=https:%2F%2Fmail.live.com%2Fdefault.aspx%26lc%3D1033%26id%3D64855%26mkt%3Den-us%26cbcxt%3Dmai%26lc%3D1033%26id%3D64855%26mkt%3Den-us%26cbcxt%3Dmai%26lc%3D1033%26id%3D64855%26mkt%3Den-us%26cbcxt%3Dmai%26lc%3D1033%26id%3D64855%26mkt%3Den-us%26cbcxt%3Dmai&lc=1033&id=64855&mkt=en-us&cbcxt=mai

Currently broken login pages
https://login.leagueoflegends.com/?region=eune&lang=en_PL&redirect_uri=http%3A%2F%2Feune.leagueoflegends.com%2F
I can't figure out how to match this one since "login" is part of hostname and not path and Request Control doesn't cover this... here Redirector is better.

Crap... at least "path" definition is case sensitive. I have placed an issue about that.

UPDATE: Case sensitive "bug" is actually "deal breaker" until corrected. Shame, started to love this one. :(
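To see why the League of Legends page slips through and why case sensitivity hurts, here is a small Python sketch of the whitelist check. It assumes the path patterns are shell-style wildcards matched case sensitively against the path plus query string (the `*u=*` pattern used earlier suggests the query is included). `is_whitelisted` is a made-up name.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

LOGIN_PATTERNS = [
    "*signin*", "*signout*", "*login*", "*logout*", "*logon*", "*logoff*",
    "*auth*", "*account*", "*eBayISAPI.dll*", "*ServiceLogin*",
    "*ServiceLogout*", "*AccountChooser*",
]


def is_whitelisted(url: str) -> bool:
    """True if path + query matches any login pattern (host is ignored)."""
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    return any(fnmatchcase(target, pattern) for pattern in LOGIN_PATTERNS)


# "login" appears in the path, so this one is whitelisted.
print(is_whitelisted(
    "https://manage.autodesk.com/service/authentication/login"
    "?returnUrl=https%3A%2F%2Fmanage.autodesk.com%2Fcep%2F"
))
# "login" appears only in the hostname, so the rule misses it.
print(is_whitelisted(
    "https://login.leagueoflegends.com/?region=eune&lang=en_PL"
    "&redirect_uri=http%3A%2F%2Feune.leagueoflegends.com%2F"
))
# Hypothetical URL: a capitalised path does not match "*login*".
print(is_whitelisted("https://example.com/Login"))
```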

:-1: Redirector always loads the unmodified address.

Maybe we should add Request Control to Appendix B: Firefox-Add-ons?

^^ Ahh, this now perfectly explains my observation about Redirector and what I didn't like about it.
But how then do uBO, uMatrix and Request Control deal perfectly with those cases?

URL path definition is case sensitive

Per RFC 3986, only the scheme and host part (http://example.com) is not case sensitive.

The path that comes after (/data/) counts as case sensitive in the standard; whether differently cased paths end up at the same resource depends on web server settings, OS...

Everything beyond what the RFC specifies is up to the server.
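As a small illustration of the RFC 3986 rule that scheme and host compare case insensitively while the path does not, here is a Python sketch. It assumes URLs without a user:password@ part, since lowercasing the whole netloc would also touch that. `normalize_case` is a made-up helper.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_case(url: str) -> str:
    """Lowercase scheme and host only; the path is left untouched."""
    parts = urlsplit(url)
    return urlunsplit(
        parts._replace(scheme=parts.scheme.lower(), netloc=parts.netloc.lower())
    )


print(normalize_case("HTTP://Example.COM/Data/"))
# http://example.com/Data/

# The two paths below stay different after normalisation.
print(normalize_case("http://example.com/data/") == normalize_case("http://example.com/Data/"))
# False
```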

@crssi wrote:

do you have any sample for ga_ stuff

any site that uses GoogleAnalytics, fe ghacks. GA should already be blocked by uBO or uMatrix or whatnot and I'm not sure we really need a filter for that stuff.

earthlng wrote

My way too complex bing regex is probably not necessary. It looks like the addon strips additional parameters automatically

Yeah, it doesn't, at least not for "Action: Redirect". Redirector doesn't make it easier either.
But I realized I don't need to account for anchors because they are not part of the "search" parameter, which results in this slightly adjusted bing-redirect:

  • old:
    {search/^\?.*&?q=(.+?)(&.*|#.*|$)/https://www.ixquick.com/do/dsearch?query=$1&cat=web&pl=opensearch&language=english}
  • new:
    {search/(^\?|.*&)q=(.+?)(&.*|$)/https://www.ixquick.com/do/dsearch?query=$2&cat=web&pl=opensearch&language=english}
  • alternative format:
    https://www.ixquick.com/do/dsearch?query={search/(^\?|.*&)q=(.+?)(&.*|$)/$2}&cat=web&pl=opensearch&language=english

... pick whichever format you prefer.

edit: fuck! the ^\?.*&?q= part is flawed, for example: ?testq=wrong without a q=right, however it works for ?testq=disney&q=porn but not for ?q=porn&testq=disney, ie if you want porn instead of disney :)
got it! (^\?|.*&)q=

edit2:
for Redirector:
https?://www.bing.com/search(\?|.*&)q=(.+?)(&.*|#.*|$) - this one "needs" the anchor part again!
Redirect to: https://www.ixquick.com/do/dsearch?query=$2&cat=web&pl=opensearch&language=english

please try to break this one y'all ;)
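If you want to try breaking it without installing anything, the old and new patterns can be compared in plain Python. This only exercises the regexes against sample `search` strings; it assumes nothing about how Request Control or Redirector apply them, and the helper names are made up.

```python
import re

OLD = re.compile(r"^\?.*&?q=(.+?)(&.*|#.*|$)")
NEW = re.compile(r"(^\?|.*&)q=(.+?)(&.*|$)")


def old_query(search: str):
    """Value captured by the old pattern, or None if it does not match."""
    match = OLD.search(search)
    return match.group(1) if match else None


def new_query(search: str):
    """Value captured by the new pattern, or None if it does not match."""
    match = NEW.search(search)
    return match.group(2) if match else None


for search in ("?testq=wrong", "?testq=other&q=right", "?q=right&testq=other"):
    print(search, "| old:", old_query(search), "| new:", new_query(search))
# ?testq=wrong | old: wrong | new: None
# ?testq=other&q=right | old: right | new: right
# ?q=right&testq=other | old: other | new: right
```

The old pattern picks up any parameter whose name merely ends in q, while the new one requires q to follow either the leading ? or an &.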

We need some really generic ones, but I have no idea how bad the breakage could get.

I would stay away from generic ones if you care about efficiency and no breakage.

Does Request Control offer anything that Redirector doesn't or vice versa?

Redirector

  • pros

    • example result

    • control over processing -> "Process matches"

    • very detailed console output

    • simpler terms (iframe instead of embedded document)

    • import/export rules

    • Exclude Pattern (you can probably achieve the same in RC with a bit of regex-fu)

    • custom description

  • cons

    • very detailed console output (performance impact, log "pollution", no pref to disable it?)

    • simpler terms ("other" instead of CSP report (which I guess is part of "other"), etc)

    • no direct access to only parts of the url (ie RC's "parameters")

    • doesn't support multiple domains/paths in one rule

    • no "TLD's" support (fe. see default google rule in RC)

    • no "block" request

    • no indicator in the UI for redirects

Redirector's author said the addon is in "maintenance mode" and he won't add new features.
RC's author still seems to be very actively maintaining the addon and is open to add new features.

RC does some "magic" in the background which is great (and seems to work) but not very transparent.
fe. it strips additional parameters (&xyz=foo&bar=etc) with "Filter" rules and does urldecoding if necessary. I'd personally prefer a bit more control over this, especially the encoding/decoding stuff.
Maybe we can lobby him/her to add that (with a default "auto" option for the current behavior)

edit: Redirector allows/requires to order the rules but idk how useful that is. It probably makes it more complicated if you have a lot of rules

  1. https://github.com/ghacksuserjs/ghacks-user.js/issues/149#issuecomment-310175387

This appears to be breaking some internal Amazon links (such as "Warehouse Deals" for example),
presumably it's the ref= part.

  1. https://github.com/ghacksuserjs/ghacks-user.js/issues/149#issuecomment-310431067

With regard to this, how exactly are these "2" filter rules applied ?
Separately ?, must they be in a specific order ?

This stuff gives me a headache at the best of times - intriguing stuff though !

OT: @Thorin-Oakenpants Good to know, looks like I'm not John Cena!

FYI atm there's a bug that a redirected Document still gets added to the history. However the request doesn't touch the network. (that was fixed in 54 and also backported to ESR here)
re: https://www.reddit.com/r/firefox/comments/6j6w0v/how_to_automatically_clean_up_urls_by_removing/djc7w8v/

The *.jpg?*, *.png?* ... image filters break images on www.kickstarter.com.

Try whitelisting https://ksr-ugc.imgix.net/

Will keep an eye on this addon, since it has a lot of potential, but will hold off until the case-insensitivity issue is sorted out.

@GitCurious REF is tricky and WILL make breakage.
There are no orders in RC.
The first rule takes the last URL in the URL (damn if this makes sense :))
And the second skips this action when you hit some login pages.

There are also some other issues. I have placed an Issue on RC github.

Sure you can. It's just JSON.
This is the same:
[{"pattern":{"scheme":"*","host":["www.imdb.com"],"path":["title/*","name/*","character/*"]},"types":["main_frame"],"action":"filter","active":true,"paramsFilter":{"values":["ref_"],"pattern":"ref_"},"skipRedirectionFilter":true}]

Damn cool... how do you do that?

Ah, ok, I see now... thx :)

@crssi Google has &biw= and others, see RemoveGoogleTracking.

Thank you @Atavic. Will take a look into in the next days

For redirect skipping RC is no match for Skip Redirect (WE), especially now that the author has implemented same-domain detection... see https://github.com/sblask/webextension-skip-redirect/issues/30

I will not try to mimic the same functionality on RC anymore.

For basic tracking removal the Link Cleaner does a nice job, but RC can be used as a supplement for more advanced rules or for additional tracking options removal that are not covered in LC.

@ghacksuserjs from RemoveGoogleTracking:

'biw', // offsetWidth
'bih', // offsetHeight

Apparently related to screen fingerprint.

@Atavic
To be sure... should I add to https://github.com/ghacksuserjs/ghacks-user.js/issues/149#issuecomment-310173763 the following:
biw, bih, ei, sa, ved, source, prmd, bvm, bav, psi, stick, dq, ech, gs_gbg, gs_rn, cp, scroll, vet, yv, ijn, iact, forward, ndsp, csi, tbnid, pbx, dpr, pf, gs_rn, gs_mss, pq, cp, oq, sclient, gs_l, aqs, psi

Observation: From what I see in the source code of RemoveGoogleTracking, it's more complex than just removing those parameters.

I have redesigned some rules, rebuilt some from scratch, and have filed a few issues with @tumpio... especially this one.
When more is known about the issue I will close this topic, open a new topic (referencing this one) and post from scratch.

@Thorin-Oakenpants
If you agree I would like to close this topic, since @tumpio has not been active for almost 2 months and there is a better approach to skipping redirects and URL cleaning.
I can open a new "post" in the next few days with the complete setup I am using for skip redirect and URL cleaning?

My love to all of you
