Posted: Thu Oct 30, 2008 8:51 am
by Wladimir Palant
No, the space before "Checksum" doesn't matter. The regular expression used to recognize the checksum is:

Code:
/!\s*checksum[\s\-:]+([\w\+\/]+)/
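
As a rough illustration, the pattern can be tried out in Python. This is only a sketch: the `re.IGNORECASE` flag is an assumption (the pattern as quoted is lowercase, while the examples in this thread write "Checksum"), and `extract_checksum` is a hypothetical helper name.

```python
import re

# Pattern as quoted above; IGNORECASE is an assumption, not part of the quote.
CHECKSUM_RE = re.compile(r"!\s*checksum[\s\-:]+([\w\+\/]+)", re.IGNORECASE)

def extract_checksum(line):
    """Return the checksum value found in a comment line, or None."""
    match = CHECKSUM_RE.search(line)
    return match.group(1) if match else None

print(extract_checksum("! Checksum: abc+DEF/123"))  # with a space after "!"
print(extract_checksum("!Checksum:abc"))            # without any spaces
print(extract_checksum("||example.com^"))           # ordinary filter, no checksum
```

Both comment variants should yield a value, while the ordinary filter line yields `None`.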

Posted: Thu Oct 30, 2008 1:37 pm
by Stupid Head
Not a big fan of Perl, so I thought I'd post these alternatives as a reference. DATA is a string holding the entire adblock.txt (edit: encoded as UTF-8, with line breaks normalized to Unix style). For some reason, there are two trailing equal signs in the checksums from PHP and Python, but not from Perl.

PHP:
Code:
echo "! Checksum: ".rtrim(base64_encode(md5($DATA, true)), "=")."\n";


Python (2.5 and later):
Code:
import base64, hashlib
print "! Checksum: " + base64.b64encode(hashlib.md5(DATA).digest()).rstrip("=")


If there are no problems, I'm going to add the checksum to my list soon.

Posted: Thu Oct 30, 2008 2:29 pm
by Wladimir Palant
Yes, Digest::MD5 has its own base64 variation, without the "=" signs at the end. That's documented.

The checksum generators look correct - but they should also normalize line breaks (strip CR aka "\r" and replace multiple LF aka "\n" symbols in a row by one). That's to prevent irrelevant changes to the file (converting between different line ending styles, inserting empty lines) from changing the checksum.

And, of course, the generators should always be applied to UTF-8 encoded data. Maybe I should extend the reference script with a check for valid UTF-8.
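
Putting the normalization and the UTF-8 requirement together, a generator could look roughly like the following Python 3 sketch. The function names and the sample filter text are made up for illustration, and the sketch does not deal with a checksum line that is already present in the file.

```python
import base64
import hashlib
import re

def normalize(text):
    """Strip CR and collapse runs of LF into a single LF."""
    text = text.replace("\r", "")
    return re.sub(r"\n+", "\n", text)

def checksum(text):
    """MD5 of the normalized UTF-8 bytes, base64-encoded without '=' padding."""
    digest = hashlib.md5(normalize(text).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")

# Made-up sample list in two line-ending styles, one with an extra empty line.
unix = "[Adblock Plus 1.0]\n||example.com^\n"
windows = "[Adblock Plus 1.0]\r\n\r\n||example.com^\r\n"

print("! Checksum: " + checksum(unix))
print(checksum(unix) == checksum(windows))  # True
```

Because both variants normalize to the same text, converting line endings or inserting empty lines leaves the checksum unchanged.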

Posted: Thu Oct 30, 2008 3:56 pm
by Stupid Head
So Adblock Plus assumes UTF-8... That explains why Korean text displays correctly in ABP even though the EasyList server serves it as Latin-1.

Posted: Thu Oct 30, 2008 4:23 pm
by Wladimir Palant
I was actually wondering about that as well. That is not something I did; apparently XMLHttpRequest uses UTF-8 by default. You should be able to set a different character set explicitly, but that doesn't change the fact that the checksum has to be calculated for the UTF-8 representation of the text.
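
To see why the encoding matters, consider that the same text produces different bytes, and therefore a different checksum, under a different encoding. In this small Python 3 sketch the sample line and the choice of EUC-KR as the contrasting encoding are purely illustrative assumptions.

```python
import base64
import hashlib

def md5_b64(data):
    """Base64 MD5 of raw bytes, without '=' padding."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii").rstrip("=")

text = "! 광고 필터\n"  # made-up comment line containing Korean text

utf8_sum = md5_b64(text.encode("utf-8"))
other_sum = md5_b64(text.encode("euc-kr"))  # a different encoding, for contrast

print(utf8_sum)
print(other_sum)
print(utf8_sum != other_sum)  # True: the byte sequences differ
```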

Posted: Thu Oct 30, 2008 9:19 pm
by lovelywcm
Although unlikely, the checksum itself can be changed or simply removed.

Is it possible to sign a list with GPG and then verify it after download (by including the maintainer's public key along with adblockplus.xpi)?

Posted: Thu Oct 30, 2008 9:32 pm
by Wladimir Palant
This is not about malicious manipulation - if you worry about that, you should use HTTPS (yes, StartSSL offers HTTPS certificates for free). The point is simply to make sure that antivirus software, firewalls and bad proxy servers don't interfere with the download. That's also the reason why the checksum is the very first "filter" - if the download is cut off, the checksum will still be there.
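
To illustrate the truncation point, here is a minimal Python 3 sketch of a download check. It is not Adblock Plus code: it assumes the checksum line is dropped before hashing and that matching is case-insensitive, and `compute`, `is_intact` and the sample filters are made up for the example.

```python
import base64
import hashlib
import re

# Assumptions: the checksum line is removed before hashing; IGNORECASE is assumed too.
CHECKSUM_LINE = re.compile(r"^!\s*checksum[\s\-:]+([\w\+\/]+).*\n",
                           re.IGNORECASE | re.MULTILINE)

def compute(text):
    """Base64 MD5 (no '=' padding) of the line-break-normalized UTF-8 text."""
    text = re.sub(r"\n+", "\n", text.replace("\r", ""))
    digest = hashlib.md5(text.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")

def is_intact(download):
    """True/False if a checksum line is present, None if there is none."""
    match = CHECKSUM_LINE.search(download)
    if match is None:
        return None
    body = CHECKSUM_LINE.sub("", download, count=1)
    return compute(body) == match.group(1)

# Made-up list; the checksum goes right after the header line.
body = "[Adblock Plus 1.0]\n||ads.example^\n||tracker.example^\n"
header, rest = body.split("\n", 1)
full = header + "\n! Checksum: " + compute(body) + "\n" + rest
truncated = full[:-10]  # simulate a download that was cut off near the end

print(is_intact(full))       # True
print(is_intact(truncated))  # False
```

Since the checksum sits near the top, the truncated copy still carries it, and the mismatch gives the cut-off away.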

Posted: Thu Oct 30, 2008 11:24 pm
by lovelywcm
Well, Google Code, where ChinaList is hosted, doesn't allow anonymous users to download a raw file via HTTPS.

Thank God, ChinaList hasn't been attacked so far, so I don't need to move it to another place.