[Done] More flexible anchors

Wladimir Palant

Post by Wladimir Palant »

A suggestion that came from the Mozilla Russia forum - make anchors more flexible so that one can omit the protocol and subdomains. It boils down to filters like "||example.com/foo.gif" that would match "http://example.com/foo.gif", "https://bar.example.com/foo.gif" but not "http://oneexample.com/foo.gif" or "http://redirect.com/?http://example.com/foo.gif". Sounds like a good generalization of the anchors, would help with the Malware Domains list for example.

Internally, a filter like "||example.com/" would be translated into /^[\w\-]+:\/\/(?:[^\/]+\.)?example\.com\//
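
To make the translation concrete, here is a minimal sketch in Python rather than the extension's own code. The function name `flexible_anchor_to_regex` is made up for illustration, wildcards such as "*" are not handled, and `re.ASCII` is only there so that `\w` behaves like it does in a JavaScript regular expression.

```python
import re

def flexible_anchor_to_regex(filter_text):
    """Turn a filter starting with "||" into a regex source string (sketch only)."""
    if not filter_text.startswith("||"):
        raise ValueError("expected a filter starting with '||'")
    # Everything after "||" is taken literally here; "*" wildcards are ignored.
    literal = re.escape(filter_text[2:])
    # Any protocol, then an optional run of subdomains ending in a dot.
    return r"^[\w\-]+://(?:[^/]+\.)?" + literal

pattern = re.compile(flexible_anchor_to_regex("||example.com/foo.gif"), re.ASCII)

for url in ("http://example.com/foo.gif",
            "https://bar.example.com/foo.gif",
            "http://oneexample.com/foo.gif",
            "http://redirect.com/?http://example.com/foo.gif"):
    print(url, "matches" if pattern.search(url) else "does not match")
```

The first two addresses should be reported as matching and the last two as not matching, which is the behavior described above.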

Thoughts, opinions?
Last edited by Wladimir Palant on Thu May 07, 2009 12:23 pm, edited 4 times in total.
Fox
Posts: 300
Joined: Sat Jun 10, 2006 3:05 pm
Location: Finland

Post by Fox »

I would like that.

Would a filter like:
@@||example.com/

then be a site whitelisting rule or an item whitelisting rule?
Wladimir Palant

Post by Wladimir Palant »

@Fox: Site whitelisting rules should specify $document flag explicitly. It is done automatically for filters starting with http:// but that's mostly for backwards compatibility - I would rather not make it more complicated.
MonztA
ABP Developer
Posts: 3957
Joined: Mon Aug 14, 2006 12:18 am
Location: Germany

Post by MonztA »

Fox wrote: I would like that.
I second that. :)
Wladimir Palant

Post by Wladimir Palant »

Just got a mail asking for improvement of the "Disable on foo.com" menu item - it shouldn't require disabling on each subdomain. So maybe add a third option there: "Disable on *.foo.com" that will add the filter "@@||foo.com/$document", what do you think? It should offer disabling on the effective first-level domain meaning *.foo.com if the user is on bar.foo.com and *.foo.co.uk if he is on bar.foo.co.uk.
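
A rough sketch of how the filter for such a menu item could be derived, in Python for illustration only. `KNOWN_SUFFIXES` is a hypothetical, tiny stand-in for a real public suffix list, and `disable_filter_for_host` is a made-up name; how the extension would actually look up the effective domain is not covered here.

```python
# Hypothetical stand-in for a real public suffix list (assumption).
KNOWN_SUFFIXES = {"com", "co.uk"}

def disable_filter_for_host(host):
    """Build a "Disable on *.domain" filter for the given host (sketch only)."""
    labels = host.lower().split(".")
    # Walk from the longest candidate suffix to the shortest, so that
    # "co.uk" is found before a plain "uk" entry would be.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in KNOWN_SUFFIXES:
            # Keep one label in front of the public suffix.
            base = ".".join(labels[max(i - 1, 0):])
            return "@@||" + base + "/$document"
    # Unknown suffix: fall back to the last two labels.
    return "@@||" + ".".join(labels[-2:]) + "/$document"

print(disable_filter_for_host("bar.foo.com"))
print(disable_filter_for_host("bar.foo.co.uk"))
```

With the two suffixes assumed above, this yields "@@||foo.com/$document" for bar.foo.com and "@@||foo.co.uk/$document" for bar.foo.co.uk.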

I dislike adding more options there but this new option really cannot replace any of the existing options.
Ervin

Post by Ervin »

How about |protocol|domain|path? This would even allow subdomain detection. So

@@|https||

would unblock any https://* URLs,

||example.com|

would block any domain and subdomain of example.com, and

|||*banner*

would block any URL that contains "banner", but not in the domain or protocol.

One thing remains: what if a nonstandard port is used, as in http://www.example.com:8080/index.html?
Ervin

Post by Ervin »

Ervin wrote: How about |protocol|domain|path?
Or "|protocol|domain|port|path". That would solve the port problem for an extra pipe.
Wladimir Palant

Post by Wladimir Palant »

Ervin, I think you are overcomplicating things now. I don't see a real use case for your suggestion.
Ervin

Post by Ervin »

Wladimir Palant wrote: Ervin, I think you are overcomplicating things now. I don't see a real use case for your suggestion.
Agreed, but I do think this syntax would be much cleaner. If you think in anchors, each pipe character would mean a specific boundary in the URL. Also this is a superset of the original suggestion. But this was just my thought.
Stupid Head
Posts: 214
Joined: Sat Aug 26, 2006 8:11 pm
Location: USA

Post by Stupid Head »

Fox wrote: I would like that.
Me too.
Ares2
Posts: 1275
Joined: Fri Feb 15, 2008 12:47 pm

Post by Ares2 »

Is this on the ABP 1.1 to-do list? :)
Wladimir Palant

Post by Wladimir Palant »

Yes, it is.
Fox
Posts: 300
Joined: Sat Jun 10, 2006 3:05 pm
Location: Finland

Post by Fox »

Would it be a good idea to allow a trailing || too?
A trailing || would stand for ":" or "/".
Then a filter like:
.example.com||
would block these:
.example.com:8080
.example.com:81
.example.com/

But not:
.example.com.au
Wladimir Palant

Post by Wladimir Palant »

Done: http://hg.mozdev.org/adblockplus/rev/cf31fbc930ab

@Fox: Not sure whether this should be done, will think about it.
Wladimir Palant

Post by Wladimir Palant »

I was too eager to mark this as "done". There is still work left:

* "Disable on foo.com" should create a filter with flexible anchor
* Filter composer should be able to use flexible anchors (should that be the default for all suggestions?)
* Filter export from preferences should set ABP version to 1.1 if flexible anchors are found

Concerning flexible anchors at the end of the filter, I thought that the following definition would make sense:

"foo||" means that "foo" should either be at the end of the address or be followed by a separator character. Separator characters are all characters except letters (need to recognize international letters somehow), digits, underscore, period, "-" and "%". So a trailing || will be translated into something like:

([^\w\.\-%]|$)

So "||example.com||" will match "http://example.com/foo" and "http://example.com:1234/foo" but not "http://example.company.com/" and not "http://example.com.com/". Similarly "||example.com/foo||" will match "http://example.com/foo" and "http://example.com/foo/bar" but not "http://example.com/foobar".
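
The examples above can be checked with a small sketch, again in Python for illustration. `to_regex` is a made-up helper, not the actual implementation; it handles only a leading and a trailing "||" and no wildcards. With `re.ASCII`, `\w` covers ASCII letters only, so international letters would still count as separators here, which is exactly the open point mentioned above.

```python
import re

# Either a separator character or the end of the address.
SEPARATOR_OR_END = r"(?:[^\w.\-%]|$)"

def to_regex(filter_text):
    """Compile a filter with optional leading/trailing "||" (sketch only)."""
    source = ""
    if filter_text.startswith("||"):
        # Leading flexible anchor: any protocol, optional subdomains.
        source += r"^[\w\-]+://(?:[^/]+\.)?"
        filter_text = filter_text[2:]
    trailing = filter_text.endswith("||")
    if trailing:
        filter_text = filter_text[:-2]
    source += re.escape(filter_text)
    if trailing:
        source += SEPARATOR_OR_END
    return re.compile(source, re.ASCII)

domain_filter = to_regex("||example.com||")
path_filter = to_regex("||example.com/foo||")

for pattern, url in ((domain_filter, "http://example.com:1234/foo"),
                     (domain_filter, "http://example.company.com/"),
                     (path_filter, "http://example.com/foo/bar"),
                     (path_filter, "http://example.com/foobar")):
    print(url, "matches" if pattern.search(url) else "does not match")
```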

Opinions?