[Done] Extending filter syntax

Various discussions related to Adblock Plus development
chewey
Posts: 501
Joined: Wed Jun 14, 2006 10:34 pm
Location: somewhere in Europe

Post by chewey »

On the one hand it would help filterset authors. But on the other hand it makes filterlist creation/filter possibilities much more complex.
This is a good thing. It makes ABP more powerful and allows for shorter rules to be more precise.
Already today creating filters is not easy, because many people do not use wildcards.
Creating a good filter list beyond a simple wildcard-less URL list never was easy in the first place.
Seeing similarities in different URLs and using them to make use of wildcards can be tricky and needs some experience.
But if this were introduced, it would become complete overkill; even skilled users would be confused.
Nobody forces you to use new syntax capabilities. This is an extension to existing syntax, so everything will stay valid.

Make a new column, where such options can be saved. Generally it would be clear or set to "always". But the user can edit this additional field and set a limitation (e.g. "image" or "image,object").
Like a permission mask - that's actually not too bad an idea.

But I'm not sure if a field of checkboxes allowing for negation etc. for every available selector would be much easier. I also see a problem in reflecting a filter line's mode in its row without making the row too large. Additionally, there is still a need for a plain-text representation for exported lists, so why not file this syntax in the same "advanced and experienced users only" corner as ABP's ## capabilities? I'm sure they can be confusing as hell too...
Wladimir Palant

Post by Wladimir Palant »

@chewey: Forgetting boolean algebra for a moment - isn't this the way you would expect it to work?

@Guest: That's exactly what I would like to prevent - new users aren't supposed to use this feature. You have to read at least a little documentation to understand what limiting a filter to third-party content or certain content types means.

Introducing a feature solely as a syntax extension means you don't need to use it; you don't even need to know about it. But if you do advanced things (like making filter lists) and you realize that you need something like this - you can always find and use it.

That doesn't mean we can't do something wizard-like here, but it really doesn't need a prominent place.
chewey

Post by chewey »

Wladimir Palant wrote:Forgetting boolean algebra for a moment - isn't this the way you would expect it to work?
OK, I gave you the Benefit Of The Author and thought about it again, copying your examples and your explanation close to each other for reference. After quite some thinking, my result is:
Maybe - or, (hum) in fact, (humhum): Yes :D

I am a victim of too much abstract thinking, because I assumed any $foo could - at least sometimes - also be of the $bar type. Now I tried to build an example to illustrate the problem I had with this - and failed miserably. It is of course impossible for (e.g.) a script to be an image or an object at the same time, so I agree with your handling of lists and negation if the type selectors are always exclusive.
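The "permission mask" reading of exclusive type selectors can be sketched roughly as follows. This is only an illustration, not Adblock Plus code: the `ContentType` names and the `parse_type_options` helper are hypothetical, and the rule for mixing plain and negated types (plain types start from an empty mask, a purely negated list starts from "everything") is an assumption made for the sketch.

```python
from enum import Flag, auto


class ContentType(Flag):
    # Hypothetical set of mutually exclusive content types.
    SCRIPT = auto()
    IMAGE = auto()
    OBJECT = auto()
    DOCUMENT = auto()


ALL_TYPES = (
    ContentType.SCRIPT
    | ContentType.IMAGE
    | ContentType.OBJECT
    | ContentType.DOCUMENT
)


def parse_type_options(options: str) -> ContentType:
    """Turn an option list such as "image,object" or "~script" into a mask."""
    names = [name.strip() for name in options.split(",") if name.strip()]
    positive = [name for name in names if not name.startswith("~")]
    negative = [name[1:] for name in names if name.startswith("~")]

    # Assumption: with at least one plain type we start from nothing,
    # otherwise (only negations) we start from every type.
    mask = ContentType(0) if positive else ALL_TYPES
    for name in positive:
        mask |= ContentType[name.upper()]
    for name in negative:
        mask &= ~ContentType[name.upper()]
    return mask


def applies_to(options: str, item_type: ContentType) -> bool:
    """An item has exactly one type, so a single bit test is enough."""
    return bool(parse_type_options(options) & item_type)
```

Because every item has exactly one type, `applies_to("image,object", ContentType.SCRIPT)` is false while `applies_to("~script", ContentType.IMAGE)` is true, which matches the intuition that a list means "any of these" and a negation means "anything but this".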

However, although I understand the role of third-party as an additional modifier and not as a selector by itself, where ...$third-party is in fact short for ...$*,third-party, I don't like this too much:
I'd really like to see something like banner$third-party, object interpreted as "matching every third-party-element containing banner, but non-third-party-elements only if they are of type 'object'" - this seems to be impossible using the current syntax.
Maybe with parentheses as in banner$(object), (*,third-party)?

Do we have any special characters for use as third-party-modifier left? :-)
rick752
Posts: 2709
Joined: Fri Jun 09, 2006 7:59 pm
Location: New York USA

Post by rick752 »

OK, Wladimir

After our little "go around" on the other forum yesterday ... I'm convinced :)

Even if these syntax features are only used at times, they offer more flexibility to filter-making. Any extra flexibility is a GOOD thing in my book and I'm sure that we can use these to many advantages after playing around with them on different ad structures vs site structures. I can definitely see some advantages after reading some of the other posts here.

ps: I already have one place in mind for a test (youhoo know who that is) :lol:
IceDogg
Posts: 909
Joined: Fri Jun 09, 2006 11:22 pm

Post by IceDogg »

Does the new dev build you just made have this feature in it too? Or not yet? Sorry if I'm asking before you've had a chance to post, I just wanted to know before I use the dev build. Thanks for your time.
Wladimir Palant

Post by Wladimir Palant »

No, I will announce the feature when I implement it ;)
Wladimir Palant

Post by Wladimir Palant »

The type options come out with the next development build; they seem to work fine. However, the third-party option turned out to be problematic. If we support it, then we can forget about caching results - right now Adblock Plus remembers its decision for items it has seen, and if we get the same item again we won't re-run the checks. With the third-party option the result becomes dependent not only on the item itself but also on its context - and this makes caching very difficult. Until I find a way to do this without having an impact on performance, this option won't be implemented.
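The caching problem can be illustrated with a small sketch. Everything here is hypothetical and heavily simplified: `is_third_party` just compares host names, `decide` hard-codes a single rule equivalent to `banner$third-party`, and the cache layouts are invented for the example rather than taken from Adblock Plus.

```python
from urllib.parse import urlsplit


def is_third_party(item_url: str, page_url: str) -> bool:
    # Naive assumption: a different host name means third party.
    return urlsplit(item_url).hostname != urlsplit(page_url).hostname


def decide(item_url: str, page_url: str) -> bool:
    # Hypothetical single rule: banner$third-party
    return "banner" in item_url and is_third_party(item_url, page_url)


# Cache keyed on the item alone, as described in the post.
url_cache: dict[str, bool] = {}


def cached_by_url(item_url: str, page_url: str) -> bool:
    if item_url not in url_cache:
        url_cache[item_url] = decide(item_url, page_url)
    return url_cache[item_url]


# Cache keyed on the item plus the part of the context that matters.
context_cache: dict[tuple[str, bool], bool] = {}


def cached_by_url_and_context(item_url: str, page_url: str) -> bool:
    key = (item_url, is_third_party(item_url, page_url))
    if key not in context_cache:
        context_cache[key] = decide(item_url, page_url)
    return context_cache[key]
```

If `http://ads.example/banner.gif` is first seen on `http://news.example/`, `cached_by_url` stores "block" and then returns the same stale answer when the item is requested from `http://ads.example/` itself. Keying on the context avoids that in this toy case, but the context has to be computed on every lookup, which is the performance concern raised in the post.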

But I decided to implement another option. "AdBanner$match-case" will block ht-tp://server.com/AdBanner.gif but not ht-tp://server.com/adbanner.gif. This won't be useful too often but I thought it should be done for completeness' sake.
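A minimal sketch of what the match-case option changes, assuming filters are turned into regular expressions that are case-insensitive by default. The `filter_to_regex` and `matches` helpers are hypothetical, and splitting on the first `$` is a simplification that ignores patterns containing a literal `$`.

```python
import re


def filter_to_regex(filter_text: str) -> re.Pattern:
    # Split "pattern$option1,option2" into its two halves.
    pattern, _, options = filter_text.partition("$")
    option_set = set(options.split(",")) if options else set()

    # "*" is the only wildcard handled here; everything else is literal.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))

    # Assumption: matching ignores case unless match-case is given.
    flags = 0 if "match-case" in option_set else re.IGNORECASE
    return re.compile(body, flags)


def matches(filter_text: str, url: str) -> bool:
    return filter_to_regex(filter_text).search(url) is not None
```

With this sketch, `matches("AdBanner$match-case", "http://server.com/AdBanner.gif")` is true and the same filter against `http://server.com/adbanner.gif` is false, while plain `AdBanner` matches both.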
IceDogg

Post by IceDogg »

Thanks for the update WP on these. You're doing great! I wish all extension developers worked on their extensions actively, like you do.
Fox
Posts: 300
Joined: Sat Jun 10, 2006 3:05 pm
Location: Finland

Post by Fox »

I tried to do some site whitelisting and it fails; it still whitelists everything.
@@|http://www.mozillazine.org/*$~script
@@|http://www.mozillazine.org/*$~image
@@|http://www.mozillazine.org/*$~third-party

So is it a bug or a limitation?
Wladimir Palant

Post by Wladimir Palant »

@@|http://www.mozillazine.org/*$~document
Fox

Post by Fox »

Thanks, now I tried to use these with no luck:
*banner*$third-party
*$image,third-party

Those also block all first-party stuff.
Wladimir Palant

Post by Wladimir Palant »

As I said above - the third-party option isn't being implemented, at least for now.
Fox

Post by Fox »

Sorry:(

Now I think I found a bug.
This Disney Cars -movie ads picture is my test page.

Let's pretend that it's an advert we want to block.
Then */ads/*$image
blocks it, but if the filter list has:
*/ads/*$image
*/ads/*$object
Then it's not blocked.

I know it works if the rule is like:
*/ads/*$image,object
but I think rules are better if I can have two or more and then disable or delete the one that causes too many false positives.
And I was going to use these.

Edit: Now it works, I don't know what was going on :(
Fox
Posts: 300

Post by Fox »

Fox wrote:This Disney Cars -movie ads picture is my test page.
If the filters are:
*/cars/*$image
*/cars/*$script

Then the script is not blocked there; if the image filter is disabled or deleted, then it's blocked.

ps: stupid filters, I just wanted to match all elements there.
Wladimir Palant

Post by Wladimir Palant »

@Fox: Found the problem, thanks a lot. Identical filters are ignored - and identical is still defined in terms of regexps, meaning that only one of your two rules gets applied. I will fix this.

Edit: fixed, wait for the next development build.
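The kind of bug described above can be reproduced with a small sketch. This is a guess at the mechanism based only on the description in the post, not the actual Adblock Plus code: the helper names are hypothetical, and the "fixed" variant simply assumes that the options should be part of a filter's identity.

```python
import re


def split_filter(text: str) -> tuple[str, str]:
    # "pattern$options" -> ("pattern", "options"); options may be empty.
    pattern, _, options = text.partition("$")
    return pattern, options


def to_regex_source(pattern: str) -> str:
    # "*" is the only wildcard handled here; everything else is literal.
    return ".*".join(re.escape(part) for part in pattern.split("*"))


def dedupe_by_regex(filters: list[str]) -> list[str]:
    # Identity is the regexp alone, so options are ignored.
    seen: dict[str, str] = {}
    for text in filters:
        pattern, _ = split_filter(text)
        seen.setdefault(to_regex_source(pattern), text)
    return list(seen.values())


def dedupe_by_regex_and_options(filters: list[str]) -> list[str]:
    # Assumed fix: identity is the regexp plus the set of options.
    seen: dict[tuple[str, frozenset], str] = {}
    for text in filters:
        pattern, options = split_filter(text)
        option_set = frozenset(options.split(",")) if options else frozenset()
        seen.setdefault((to_regex_source(pattern), option_set), text)
    return list(seen.values())
```

For the two rules `*/cars/*$image` and `*/cars/*$script`, `dedupe_by_regex` keeps only the first one, so scripts are never checked, while `dedupe_by_regex_and_options` keeps both.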