[Done] More flexible anchors
A suggestion that came from the Mozilla Russia forum - make anchors more flexible so that one can omit the protocol and subdomains. It boils down to filters like "||example.com/foo.gif" that would match "http://example.com/foo.gif", "https://bar.example.com/foo.gif" but not "http://oneexample.com/foo.gif" or "http://redirect.com/?http://example.com/foo.gif". Sounds like a good generalization of the anchors, would help with the Malware Domains list for example.
Internally, a filter like "||example.com/" would be translated into /^[\w\-]+:\/\/(?:[^\/]+\.)?example\.com\//
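The translation described above can be sketched as follows. This is only an illustration, not the actual Adblock Plus code: the helper names `escapeRegExp` and `flexibleAnchorToRegExp` are made up for this example, and only the regular-expression prefix is taken from the post.

```javascript
// Escape regex metacharacters so the rest of the filter is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\\/]/g, "\\$&");
}

// Hypothetical helper: translate "||rest" into a RegExp, using the prefix
// quoted in the post (any protocol, then optional subdomains).
function flexibleAnchorToRegExp(filter) {
  if (!filter.startsWith("||")) {
    throw new Error("not a flexible-anchor filter: " + filter);
  }
  const prefix = "^[\\w\\-]+:\\/\\/(?:[^\\/]+\\.)?";
  return new RegExp(prefix + escapeRegExp(filter.slice(2)));
}

const re = flexibleAnchorToRegExp("||example.com/foo.gif");
console.log(re.test("http://example.com/foo.gif"));                      // true
console.log(re.test("https://bar.example.com/foo.gif"));                 // true
console.log(re.test("http://oneexample.com/foo.gif"));                   // false
console.log(re.test("http://redirect.com/?http://example.com/foo.gif")); // false
```

The four URLs are the ones from the first paragraph, so the sketch reproduces the intended matches and non-matches.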
Thoughts, opinions?
Last edited by Wladimir Palant on Thu May 07, 2009 12:23 pm, edited 4 times in total.
Just got a mail asking for improvement of the "Disable on foo.com" menu item - it shouldn't require disabling on each subdomain. So maybe add a third option there: "Disable on *.foo.com" that will add the filter "@@||foo.com/$document", what do you think? It should offer disabling on the effective first-level domain meaning *.foo.com if the user is on bar.foo.com and *.foo.co.uk if he is on bar.foo.co.uk.
I dislike adding more options there but this new option really cannot replace any of the existing options.
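A rough sketch of how the filter for such a menu item could be derived from the current host name. Everything here is an assumption for illustration: `SAMPLE_SUFFIXES` is a deliberately tiny stand-in for a complete public suffix list, and `baseDomain` and `disableOnDomainFilter` are hypothetical names.

```javascript
// Hypothetical, deliberately tiny suffix list; a real implementation would
// need a complete public suffix list instead.
const SAMPLE_SUFFIXES = new Set(["com", "org", "co.uk"]);

// Return the effective first-level domain: the longest known suffix
// plus one more label in front of it.
function baseDomain(hostname) {
  const host = hostname.toLowerCase();
  const labels = host.split(".");
  for (let i = 0; i < labels.length; i++) {
    // Candidates go from longest to shortest, so the first hit is the longest suffix.
    if (SAMPLE_SUFFIXES.has(labels.slice(i).join("."))) {
      return i === 0 ? host : labels.slice(i - 1).join(".");
    }
  }
  return host; // unknown suffix: fall back to the full host name
}

// Build the exception filter proposed in the post.
function disableOnDomainFilter(hostname) {
  return "@@||" + baseDomain(hostname) + "/$document";
}

console.log(disableOnDomainFilter("bar.foo.com"));   // @@||foo.com/$document
console.log(disableOnDomainFilter("bar.foo.co.uk")); // @@||foo.co.uk/$document
```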
How about |protocol|domain|path? This would even allow subdomain detection. So
Code:
@@|https||
would unblock any https://* URLs,
Code:
||example.com|
would block any domain and subdomain of example.com, and
Code:
|||*banner*
would block any URL that contains "banner" but not in the domain or protocol.
One thing remains: what if a nonstandard port is used, like http://www.example.com:8080/index.html?
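On the port question, a quick check against the regular expression quoted in the first post for "||example.com/" suggests that the trailing slash is what gets in the way: a URL with an explicit port has ":" right after the host name. This only tests that one expression and says nothing about how the final implementation behaves.

```javascript
// Regular expression quoted in the first post for the filter "||example.com/".
const startAnchored = /^[\w\-]+:\/\/(?:[^\/]+\.)?example\.com\//;

console.log(startAnchored.test("http://www.example.com/index.html"));      // true
console.log(startAnchored.test("http://www.example.com:8080/index.html")); // false
```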
Wladimir Palant wrote: Ervin, I think you are overcomplicating things now. I don't see a real use case for your suggestion.
Agreed, but I do think this syntax would be much cleaner. If you think in anchors, each pipe character would mean a specific boundary in the URL. Also this is a superset of the original suggestion. But this was just my thought.
Done: http://hg.mozdev.org/adblockplus/rev/cf31fbc930ab
@Fox: Not sure whether this should be done, will think about it.
I was too eager to mark this as "done". There is still work left:
* "Disable on foo.com" should create a filter with flexible anchor
* Filter composer should be able to use flexible anchors (should that be the default for all suggestions?)
* Filter export from preferences should set ABP version to 1.1 if flexible anchors are found
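For the export item, the check could look roughly like this. It is a sketch under assumptions: `usesFlexibleAnchor` and `needsVersion11` are hypothetical names, options are assumed to follow the last "$" (as in "@@||foo.com/$document"), and only the leading "||" form is detected.

```javascript
// Hypothetical check: does a single filter line start its pattern with "||"?
function usesFlexibleAnchor(filterText) {
  let text = filterText.trim();
  if (text === "" || text.startsWith("!")) return false; // blank line or comment
  if (text.startsWith("@@")) text = text.slice(2);        // exception rule prefix
  const dollar = text.lastIndexOf("$");                   // assumed start of options
  if (dollar >= 0) text = text.slice(0, dollar);
  return text.startsWith("||");
}

// The export would require version 1.1 as soon as one filter uses the new syntax.
function needsVersion11(filters) {
  return filters.some((filter) => usesFlexibleAnchor(filter));
}

console.log(needsVersion11(["|http://example.com/", "/banner/*"]));      // false
console.log(needsVersion11(["/banner/*", "@@||foo.com/$document"]));     // true
```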
Concerning flexible anchors at the end of the filter, I thought that the following definition would make sense:
foo|| means that "foo" should either be at the end of the address or it should be followed by a separator character. Separator characters are all characters but letters (need to recognize international letters somehow), digits, underscore, period, -, %. So || will be translated into something like:
([^\w\.\-%]|$)
So "||example.com||" will match "http://example.com/foo" and "http://example.com:1234/foo" but not "http://example.company.com/" and not "http://example.com.com/". Similarly "||example.com/foo||" will match "http://example.com/foo" and "http://example.com/foo/bar" but not "http://example.com/foobar".
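To check the examples above, here is a minimal sketch that combines the start-anchor translation from the first post with the proposed separator expression. `escapeRegExp` and `compileFilter` are hypothetical helpers written for this illustration, not the extension's code.

```javascript
// Escape regex metacharacters so the filter body is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\\/]/g, "\\$&");
}

// Hypothetical helper handling "||" at the start and/or the end of a filter.
function compileFilter(filter) {
  let body = filter;
  let prefix = "";
  let suffix = "";
  if (body.startsWith("||")) {
    prefix = "^[\\w\\-]+:\\/\\/(?:[^\\/]+\\.)?"; // from the first post
    body = body.slice(2);
  }
  if (body.endsWith("||")) {
    suffix = "([^\\w\\.\\-%]|$)"; // separator character or end of address
    body = body.slice(0, -2);
  }
  return new RegExp(prefix + escapeRegExp(body) + suffix);
}

const domainOnly = compileFilter("||example.com||");
console.log(domainOnly.test("http://example.com/foo"));      // true
console.log(domainOnly.test("http://example.com:1234/foo")); // true
console.log(domainOnly.test("http://example.company.com/")); // false
console.log(domainOnly.test("http://example.com.com/"));     // false

const withPath = compileFilter("||example.com/foo||");
console.log(withPath.test("http://example.com/foo"));        // true
console.log(withPath.test("http://example.com/foo/bar"));    // true
console.log(withPath.test("http://example.com/foobar"));     // false
```

Note that `\w` in JavaScript regular expressions covers only ASCII letters, digits and underscore, so with this expression international letters would count as separators; that part of the question stays open.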
Opinions?