[Done] More flexible anchors
Posted: Fri Jan 02, 2009 11:19 pm
by Wladimir Palant
A suggestion that came from the Mozilla Russia forum - make anchors more flexible so that one can omit the protocol and subdomains. It boils down to filters like "||example.com/foo.gif" that would match "http://example.com/foo.gif", "https://bar.example.com/foo.gif" but not "http://oneexample.com/foo.gif" or "http://redirect.com/?http://example.com/foo.gif". Sounds like a good generalization of the anchors, would help with the Malware Domains list for example.
Internally, a filter like "||example.com/" would be translated into /^[\w\-]+:\/\/(?:[^\/]+\.)?example\.com\//
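To make the translation concrete, here is a minimal Python sketch (only an illustration, not the actual Adblock Plus code). The helper name `flexible_anchor_to_regex` is hypothetical, and the sketch assumes the rest of the filter is plain text with no wildcards or options.

```python
import re

# Prefix a leading "||" expands to: any scheme, "://", then optional subdomains.
FLEXIBLE_ANCHOR = r"^[\w\-]+:\/\/(?:[^\/]+\.)?"


def flexible_anchor_to_regex(filter_text):
    """Translate a filter starting with "||" into a compiled regex.

    Hypothetical helper: everything after "||" is treated as literal text.
    """
    if not filter_text.startswith("||"):
        raise ValueError("expected a filter starting with '||'")
    body = filter_text[2:]
    return re.compile(FLEXIBLE_ANCHOR + re.escape(body))


pattern = flexible_anchor_to_regex("||example.com/foo.gif")

urls = [
    "http://example.com/foo.gif",
    "https://bar.example.com/foo.gif",
    "http://oneexample.com/foo.gif",
    "http://redirect.com/?http://example.com/foo.gif",
]
for url in urls:
    # Prints each URL with True if the filter matches, False otherwise.
    print(url, bool(pattern.search(url)))
```

Under these assumptions the first two URLs match and the last two do not: the optional subdomain group must end in a dot, and it cannot cross a slash.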
Thoughts, opinions?
Posted: Mon Jan 05, 2009 7:14 pm
by Fox
I would like that.
Would a filter like:
@@||example.com/
then be a site whitelisting rule or an item whitelisting rule?
Posted: Mon Jan 05, 2009 7:33 pm
by Wladimir Palant
@Fox: Site whitelisting rules should specify the $document flag explicitly. It is done automatically for filters starting with http:// but that's mostly for backwards compatibility - I would rather not make it more complicated.
Posted: Mon Jan 05, 2009 7:39 pm
by MonztA
Fox wrote:I would like that.
I second that.
Posted: Fri Jan 09, 2009 9:07 pm
by Wladimir Palant
Just got a mail asking for improvement of the "Disable on foo.com" menu item - it shouldn't require disabling on each subdomain. So maybe add a third option there: "Disable on *.foo.com" that will add the filter "@@||foo.com/$document", what do you think? It should offer disabling on the effective first-level domain meaning *.foo.com if the user is on bar.foo.com and *.foo.co.uk if he is on bar.foo.co.uk.
I dislike adding more options there but this new option really cannot replace any of the existing options.
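As a rough sketch of how such a filter could be derived from the current host, consider the Python below. It is only an illustration: the helper names `base_domain` and `disable_filter_for` are hypothetical, and `MULTI_LABEL_SUFFIXES` is a deliberately tiny stand-in for a real public suffix list, which an actual implementation would need. IP addresses and single-label hosts are not handled.

```python
# Hypothetical, deliberately tiny stand-in for a real public suffix list.
MULTI_LABEL_SUFFIXES = {"co.uk"}


def base_domain(host):
    """Return the effective first-level domain of a host name (simplified)."""
    labels = host.lower().rstrip(".").split(".")
    # Keep three labels if the last two form a known multi-label suffix,
    # otherwise keep two.
    keep = 3 if ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES else 2
    return ".".join(labels[-keep:])


def disable_filter_for(host):
    """Build the "Disable on *.domain" exception filter for a host."""
    return "@@||" + base_domain(host) + "/$document"


for host in ["bar.foo.com", "bar.foo.co.uk"]:
    print(host, "->", disable_filter_for(host))
```

With the two hosts from the example, this yields "@@||foo.com/$document" and "@@||foo.co.uk/$document".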
Posted: Wed Jan 21, 2009 11:50 am
by Ervin
How about |protocol|domain|path? This would even allow subdomain detection. So
Would unblock any https://* URLs,
would block any domain and subdomain of example.com, and
would block any URL that contains "banner" but not in the domain or protocol.
One thing remains: what if a nonstandard port is used, as in
http://www.example.com:8080/index.html?
Posted: Wed Jan 21, 2009 12:02 pm
by Ervin
Ervin wrote:How about |protocol|domain|path?
Or "|protocol|domain|port|path". That would solve the port problem for an extra pipe.
Posted: Wed Jan 21, 2009 1:17 pm
by Wladimir Palant
Ervin, I think you are overcomplicating things now. I don't see a real use case for your suggestion.
Posted: Wed Jan 21, 2009 2:06 pm
by Ervin
Wladimir Palant wrote:Ervin, I think you are overcomplicating things now. I don't see a real use case for your suggestion.
Agreed, but I do think this syntax would be much cleaner. If you think in anchors, each pipe character would mean a specific boundary in the URL. Also this is a superset of the original suggestion. But this was just my thought.
Posted: Sat Jan 24, 2009 8:45 pm
by Stupid Head
Fox wrote:I would like that.
Me too.
Posted: Fri May 01, 2009 2:39 am
by Ares2
Is this on the ABP 1.1 to-do list?
Posted: Fri May 01, 2009 1:29 pm
by Wladimir Palant
Yes, it is.
Posted: Fri May 01, 2009 1:38 pm
by Fox
Would it be a good idea to have a trailing || too?
A trailing || would mean
: or /
Then a filter like:
.example.com||
would block these:
.example.com:8080
.example.com:81
.example.com/
But not:
.example.com.au
Posted: Tue May 05, 2009 8:14 am
by Wladimir Palant
Done:
http://hg.mozdev.org/adblockplus/rev/cf31fbc930ab
@Fox: Not sure whether this should be done, will think about it.
Posted: Wed May 06, 2009 9:05 am
by Wladimir Palant
I was too eager to mark this as "done". There is still work left:
* "Disable on foo.com" should create a filter with a flexible anchor
* Filter composer should be able to use flexible anchors (should that be the default for all suggestions?)
* Filter export from preferences should set ABP version to 1.1 if flexible anchors are found
Concerning flexible anchors at the end of the filter, I thought that the following definition would make sense:
foo|| means that "foo" should either be at the end of the address or it should be followed by a separator character. Separator characters are all characters but letters (need to recognize international letters somehow), digits, underscore, period, -, %. So || will be translated into something like:
([^\w\.\-%]|$)
So "||example.com||" will match "http://example.com/foo" and "http://example.com:1234/foo" but not "http://example.company.com/" and not "http://example.com.com/". Similarly "||example.com/foo||" will match "http://example.com/foo" and "http://example.com/foo/bar" but not "http://example.com/foobar".
Opinions?
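The trailing-anchor definition above can be checked with a small Python sketch. This is only an illustration under stated assumptions: the helper name `anchors_to_regex` is hypothetical, the filter body is treated as literal text, and the leading "||" uses the translation given earlier in the thread. In Python, \w on text strings is Unicode-aware by default, so international letters are not counted as separators here; other regex engines may behave differently.

```python
import re

# Leading "||": any scheme, "://", then optional subdomains.
LEADING = r"^[\w\-]+:\/\/(?:[^\/]+\.)?"
# Trailing "||": a separator character or the end of the address.
TRAILING = r"(?:[^\w\.\-%]|$)"


def anchors_to_regex(filter_text):
    """Translate a filter with optional leading/trailing "||" into a regex.

    Hypothetical helper: everything between the anchors is literal text.
    """
    prefix = suffix = ""
    if filter_text.startswith("||"):
        prefix, filter_text = LEADING, filter_text[2:]
    if filter_text.endswith("||"):
        suffix, filter_text = TRAILING, filter_text[:-2]
    return re.compile(prefix + re.escape(filter_text) + suffix)


domain = anchors_to_regex("||example.com||")
path = anchors_to_regex("||example.com/foo||")

checks = [
    (domain, "http://example.com/foo"),
    (domain, "http://example.com:1234/foo"),
    (domain, "http://example.company.com/"),
    (domain, "http://example.com.com/"),
    (path, "http://example.com/foo"),
    (path, "http://example.com/foo/bar"),
    (path, "http://example.com/foobar"),
]
for regex, url in checks:
    # Prints each URL with True if the filter matches, False otherwise.
    print(url, bool(regex.search(url)))
```

The same helper also covers the earlier ".example.com||" suggestion: without a leading anchor the pattern can match anywhere in the address, so it matches hosts followed by ":" or "/" but not ".example.com.au".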