Adblock Plus and (a little) more

Adding Weave support to an extension · 2009-06-01 14:00 by Wladimir Palant

Disclaimer: The following article is primarily meant for developers. Also, the solutions and explanations presented are my personal findings and not confirmed by the Weave team.

Weave is a pretty cool extension: it lets you sync your Firefox/Fennec data (history, bookmarks etc.) across multiple computers via a server, and it does so while the browser is running, with no need to restart it. For example, you might have Fennec running on a Pocket PC and Firefox on your desktop, both logged into the same Weave account. You add a bookmark in Firefox. On the next sync (which you can trigger manually) this change will be uploaded to the server. The next time Fennec syncs, this bookmark will simply appear in your bookmarks list while the browser is running. Weave works with single bookmarks rather than taking the list as a whole, meaning that this update will not affect other bookmark changes you have made in Fennec in the meantime (those will simply be uploaded to the server so that Firefox can get them on its next sync). On top of this comes encryption: all data is encrypted locally before being sent to the server, so while the server can tell how many bookmarks you have, it won’t be able to read the actual data. Still, if you don’t trust Mozilla’s server you can run your own without much trouble.

So much for the general outline; it is a really great feature. My question as an extension developer: how can I use it? People often ask how they can share Adblock Plus preferences across multiple computers (or sometimes even different profiles on the same computer). It would be nice to point them to Weave: they install it and everything works automatically. So, two weeks ago I started looking into what is necessary to make your extension work with Weave. There is some documentation, but it fails badly at giving an overview, and it seems to be targeted primarily at Weave developers rather than third parties wishing to integrate with Weave. So I will try to summarize some of the things I learned.

General setup

To integrate with Weave you need to write a new data provider for it. This isn’t an entirely trivial task due to the way Weave operates. Weave works with single data records; that way it can efficiently compare remote data with local data and upload or update only the necessary parts. You need a Store component that manages records: it provides Weave with records created from local data and updates local data from remote records when necessary. You also need a Tracker component that remembers all locally changed records. Finally, there is the actual data provider component, called SyncEngine, that puts everything together.
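
To make the division of labor more concrete, here is a toy model of the three roles. It is only a sketch and does not use Weave at all: the classes ToyStore, ToyTracker and ToyEngine and all method names except createRecord() are made up for illustration, and the real base classes will look different.

```javascript
// Toy model only: these classes are NOT Weave's, and every method name
// here except createRecord() is made up for illustration.

// Store: translates between local data and records.
function ToyStore(localData) {
  this.data = localData; // maps record ID -> value
}
ToyStore.prototype.createRecord = function(id) {
  // Local record: cleartext reflects the current local data.
  return { id: id, cleartext: this.data[id] };
};
ToyStore.prototype.applyRemote = function(record) {
  // Remote record: update local data from its cleartext.
  this.data[record.id] = record.cleartext;
};

// Tracker: remembers which records changed locally since the last sync.
function ToyTracker() {
  this.changedIDs = {};
}
ToyTracker.prototype.markChanged = function(id) {
  this.changedIDs[id] = true;
};

// Engine: puts Store and Tracker together.
function ToyEngine(store, tracker) {
  this.store = store;
  this.tracker = tracker;
}
ToyEngine.prototype.collectOutgoing = function() {
  // Only records flagged by the tracker are uploaded.
  var records = [];
  for (var id in this.tracker.changedIDs)
    records.push(this.store.createRecord(id));
  this.tracker.changedIDs = {};
  return records;
};
ToyEngine.prototype.applyIncoming = function(records) {
  for (var i = 0; i < records.length; i++)
    this.store.applyRemote(records[i]);
};
```

The point of the split is that a sync only touches the records the tracker has flagged, while incoming records are applied one by one without disturbing the rest of the local data.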

Putting all your data into one record is still a possibility of course. However, you will then waste traffic by uploading all data on every sync. Also, each remote update will overwrite all local changes, which is not exactly desirable behavior. So it is best to choose a data granularity that is as small as possible. My implementation of this approach currently stands at roughly 350 lines of code.

Forward compatibility

It is important to mention that Weave is still far away from a release version. It is currently an experimental extension and you should treat it as such. This is not just about bugs, it is simply about the fact that pretty much everything is still subject to change. For example, the API changed between Weave 0.3.0 and Weave 0.3.2 in an incompatible way, meaning that you have to use tricks to support both.

Side note, if somebody wants the details: Weave 0.3.0 marked deleted records by means of a payload property which you would set to null. Weave 0.3.2 replaced it by a boolean deleted property. The other difference: Engines.register() takes a class constructor in Weave 0.3.2, in Weave 0.3.0 you would give it a class instance.
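
A minimal sketch of what such tricks might look like, assuming you already know which Weave version is installed. The helper names (usesOldAPI, markDeleted, registerEngine) and the weaveVersion parameter are my own invention; how you obtain the version string is left open here.

```javascript
// Hypothetical helpers; the version string must be supplied by the caller.
function usesOldAPI(weaveVersion) {
  return weaveVersion === "0.3.0";
}

// Mark a record as deleted in the way the given Weave version expects.
function markDeleted(record, weaveVersion) {
  if (usesOldAPI(weaveVersion))
    record.payload = null;   // Weave 0.3.0: payload set to null
  else
    record.deleted = true;   // Weave 0.3.2: boolean deleted property
  return record;
}

// Register an engine class in the way the given Weave version expects.
// "Engines" is passed in so that the sketch stays self-contained.
function registerEngine(Engines, EngineClass, weaveVersion) {
  if (usesOldAPI(weaveVersion))
    Engines.register(new EngineClass());  // Weave 0.3.0: class instance
  else
    Engines.register(EngineClass);        // Weave 0.3.2: class constructor
}
```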

There is no reason to expect the API to be stable from now on, or even to expect that API changes can be detected programmatically. After some consideration, I decided to move Weave support into a separate extension (Weave Sync for Adblock Plus) that can have explicit dependencies on Adblock Plus and Weave 0.3.0/0.3.2. It will explicitly not work with any other Weave versions without an update, better that than the risk of breaking everything. I hope that this can be made less strict later but right now there doesn’t seem to be a better way.

The other aspect here is the storage format. Weave takes care of storage format changes by re-uploading all data to the server on Weave updates. So if the previous Weave version used a different format, on update the server data will be automatically replaced with data in the new format. Unfortunately, that doesn’t work for extensions, which might update independently of Weave and will occasionally change their storage format as well. For now, the best solution I have found is to change the engine URL every time this happens (override the SyncEngine.engineURL getter); this makes sure that the data is stored in a different directory on the server. I asked for a generic solution in Weave, let’s hope future Weave versions will make that hack unnecessary.
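
The idea can be sketched roughly as follows. Note that FakeSyncEngine is only a stand-in so that the example runs on its own; the URL it returns, the name property and the STORAGE_FORMAT_VERSION constant are assumptions, not what Weave actually does.

```javascript
// Stand-in base class; Weave's real engineURL getter will differ.
function FakeSyncEngine() {}
FakeSyncEngine.prototype = {
  name: "base",
  get engineURL() {
    return "https://example.invalid/user/" + this.name + "/";
  }
};

// Bump this whenever the extension changes its storage format.
var STORAGE_FORMAT_VERSION = 2;

function MyEngine() {}
MyEngine.prototype = Object.create(FakeSyncEngine.prototype);
MyEngine.prototype.name = "myextension";
Object.defineProperty(MyEngine.prototype, "engineURL", {
  get: function() {
    // Call the inherited getter, then append the format version so that
    // each format ends up in its own directory on the server.
    var inherited = Object.getOwnPropertyDescriptor(
      FakeSyncEngine.prototype, "engineURL").get;
    return inherited.call(this) + "v" + STORAGE_FORMAT_VERSION + "/";
  }
});
```

Data in the old format stays on the server under the old URL, so clients that haven’t updated the extension yet are not confused by records they cannot read.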

Records

A fact that is not obvious from the documentation: there are two different kinds of records, namely remote records and local records. Remote records are created for updates that are received from the server. For that, Weave will simply create a new record instance and set its cleartext property. Remote records are what you will get in calls meant to update local data (Store.remove() and similar). Local records are created by means of Store.createRecord() at Weave’s request; the Store component is then responsible for initializing the cleartext property to reflect the local data. All the additional properties are only helpers to read or write the cleartext property. In both cases the records are temporary objects and it is not necessary to update their data after creation. This is true even if records are cached as suggested in the documentation, since Weave will clear the cache when necessary (in my tests I didn’t see a single cache hit because of that; this will hopefully improve toward the final Weave release).

Each record should have an ID that identifies it. Something that the documentation doesn’t clearly state: this ID can be any string, there is no need to use the GUID format. However, it is important to remember that, unlike the data, the ID will not be encrypted. So if the canonical ID of your data is the data itself (e.g. a URL), you had better not use it as is. Instead, you can for example take a SHA1 hash of the data, which will still keep your IDs unique (ok, hash collisions happen but they are extremely unlikely) but prevent reconstructing the data from the ID.
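
A minimal sketch of deriving such an ID. To keep the example runnable on its own it uses the crypto module of Node.js; inside an extension you would use the browser’s hashing facilities (nsICryptoHash) instead. The function name recordIdFor is made up.

```javascript
var crypto = require("crypto");

// Hypothetical helper: derive a record ID from the record's data so that
// the ID itself doesn't contain the data in readable form.
function recordIdFor(data) {
  return crypto.createHash("sha1").update(data, "utf8").digest("hex");
}
```

The same input always produces the same 40-character hexadecimal string, so two clients holding identical data will independently arrive at the same ID.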

Store

When implementing the Store component, I noticed some oddities. The first was the Store.changeItemId() method, whose purpose was entirely unclear. After studying the source code I realized that Weave doesn’t trust the same records to always have the same ID, so occasionally it will compare the data of two records. If it finds that a remote and a local record have the same data, it will call Store.changeItemId() to change the ID of the local record. But in my case two identical records always have the same ID as well, so I replaced the SyncEngine._recordLike() method with one that always returns false. This saves some unnecessary computations (and guarantees that Store.changeItemId() is never called).
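
In outline, the override looks like the sketch below. FakeSyncEngine again stands in for Weave’s base class so that the example is self-contained; its default comparison and the two-argument signature of _recordLike() are assumptions on my part.

```javascript
// Stand-in for Weave's base class; the real _recordLike() may take
// different arguments and compare records differently.
function FakeSyncEngine() {}
FakeSyncEngine.prototype._recordLike = function(a, b) {
  // Assumed default: records with equal data count as the same item.
  return JSON.stringify(a.cleartext) === JSON.stringify(b.cleartext);
};

function MyEngine() {}
MyEngine.prototype = Object.create(FakeSyncEngine.prototype);
MyEngine.prototype.constructor = MyEngine;

// IDs are derived from the data, so identical records already share an
// ID and there is never a reason to match records by content.
MyEngine.prototype._recordLike = function(a, b) {
  return false;
};
```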

The other oddity is the Store.wipe() method. This is basically a kill switch: the store has to remove all data from the client when this method is called. Eventually I found out that this method is called when one client decides to re-upload the server data: all other clients get the command to wipe their local data and download new data from the server. So it all makes sense. Still, it is somewhat concerning that a malicious or malfunctioning server might simply remove all data you have locally, but I guess it could do that anyway using the usual syncing mechanisms.

Hooking up

For the code to work you need to import several Weave modules, at the very least the modules containing the base classes to be extended:

Components.utils.import("resource://weave/engines.js");
Components.utils.import("resource://weave/stores.js");
Components.utils.import("resource://weave/trackers.js");
Components.utils.import("resource://weave/base_records/crypto.js");

After all the relevant classes have been defined you need to register your SyncEngine class:

Engines.register(new MyEngineClass());

Note that this is the registration call for Weave 0.3.0, in Weave 0.3.2 you pass the class constructor: Engines.register(MyEngineClass).

Finally, you will need to make sure that your sync engine is enabled (by default it isn’t). This is controlled by the boolean preference weave.engine.<sync engine name>, the extension should define the default value for that preference as true. Unfortunately, in the current Weave versions your engine won’t appear in the list of data sources meaning that the user won’t be able to disable it. This will be fixed in Weave 0.4, so please don’t go hack the Weave preferences page as the documentation seems to suggest.
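
As a sketch, the default value would go into a preferences file under defaults/preferences/ in your extension. The file name and the engine name myextension are placeholders for whatever your sync engine is called.

```javascript
// defaults/preferences/myextension-weave.js (hypothetical file name)
// "myextension" stands in for your sync engine's name.
pref("weave.engine.myextension", true);
```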

Comment [1]

  1. eupator · 2009-06-03 02:26 · #

    > However, it is important to remember that unlike the data the ID
    > will not be encrypted. So if the canonical ID of your data is the
    > data itself (e.g. a URL) you better don’t use it like this. Instead,
    > you can take a SHA1 hash of the data for example which will still
    > keep your IDs unique (ok, hash collisions happen but they are
    > extremely unlikely) but prevent reconstructing the data from the ID.

    This is not enough because of dictionary attacks. Given a filtering rule, you can check which users use it. To make things worse, it is not so hard to construct a list of potential rules including almost all (in the weighted sense) rules in use.

    You need to hash every rule together with a secret key, known or available only to the installations you are syncing with. (Probably even better would be if Weave did it for all IDs, as I guess ABP may not be the only extension with this problem.)

    Reply from Wladimir Palant:

    True, this is an issue. So it will need a salt, maybe the SHA1 hash of the user’s passphrase.

    I agree, it would be nice if Weave encrypted IDs. Not sure whether they want to go there however.
