Adblock Plus and (a little) more

Internationalized domains in filters are now expected to be encoded as Punycode · 2018-05-18 17:00 by Sebastian Noack

Historically, Adblock Plus converted internationalized domains given in URLs reported by the browser from Punycode (e.g. xn--i-7iq.ws) to Unicode (e.g. i❤.ws), so that filters could be written using the Unicode representation (rather than Punycode).
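To illustrate the two representations, here is a minimal Python sketch that converts a domain label by label using Python's built-in `punycode` codec. The function names are made up for this example and this is not Adblock Plus's own code. It also skips the normalization steps a full IDNA implementation would perform, so treat it as an approximation.

```python
def label_to_ascii(label: str) -> str:
    """Encode one domain label; pure-ASCII labels pass through unchanged."""
    if label.isascii():
        return label
    # "xn--" is the prefix that marks a Punycode-encoded label.
    return "xn--" + label.encode("punycode").decode("ascii")


def domain_to_punycode(domain: str) -> str:
    """Convert a Unicode domain to its Punycode form (simplified)."""
    return ".".join(label_to_ascii(part) for part in domain.lower().split("."))


def domain_to_unicode(domain: str) -> str:
    """Convert a Punycode domain back to Unicode (simplified)."""
    parts = []
    for part in domain.lower().split("."):
        if part.startswith("xn--"):
            part = part[4:].encode("ascii").decode("punycode")
        parts.append(part)
    return ".".join(parts)


print(domain_to_punycode("i\u2764.ws"))    # xn--i-7iq.ws
print(domain_to_unicode("xn--i-7iq.ws"))   # i❤.ws
```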

The assumption was that filter list authors would feel more comfortable spelling out internationalized domains in their native alphabet than dealing with an obscure representation like Punycode. In practice, the opposite was the case: when inspecting the source code, the DOM or HTTP requests, internationalized domains appear encoded as Punycode. Things got particularly confusing with Unicode characters that can be constructed in different ways, resulting in different Punycode while looking the same when rendered as Unicode. Furthermore, since other ad blockers didn’t implement these semantics, some filter lists currently specify filters/domains redundantly, in both Punycode and Unicode encoding.
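One way to read the point about characters that look the same is through Unicode composition: "é" can be a single code point or an "e" followed by a combining accent. The sketch below assumes that reading, with a made-up label and a hypothetical helper name. It encodes both spellings with Python's raw `punycode` codec, which does no normalization, and shows that they only agree after NFC normalization.

```python
import unicodedata


def to_punycode_label(label: str) -> str:
    """Encode a single label with the raw punycode codec (no normalization)."""
    return "xn--" + label.encode("punycode").decode("ascii")


precomposed = "caf\u00e9"   # "é" as one code point (U+00E9)
decomposed = "cafe\u0301"   # "e" followed by a combining acute accent (U+0301)

# Both strings render the same, but they are different code point sequences,
# so their Punycode encodings differ as well.
print(precomposed == decomposed)
print(to_punycode_label(precomposed) == to_punycode_label(decomposed))

# After NFC normalization the two spellings collapse into the same string.
normalized = unicodedata.normalize("NFC", decomposed)
print(to_punycode_label(normalized) == to_punycode_label(precomposed))
```

A filter written with one spelling would therefore not match a domain reported with the other unless both sides were normalized first, a problem that does not arise when filters use Punycode directly.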

Therefore, we decided to require domains in filters to be encoded as Punycode, starting with Adblock Plus 3.2 (and development builds as of 3.1.0.2050). As a side effect, this also makes Adblock Plus a little more efficient (issue 6647).

There is a script that filter list authors can use to convert existing filter lists.
