2009-11-17

The Skinny on Raindrop's Mailing List Extensions

Raindrop is an exploration of messaging innovation that strives to intelligently assist people in managing their flood of incoming messages. And mailing lists are a common source of messages you need to manage. So, with assistance from the Raindrop hackers, I wrote extensions that make it easier to deal with messages from mailing lists.

Their goal is to soothe two particular pain points when dealing with mailing lists: grouping their messages together by list and unsubscribing from them once you're no longer interested in their subject matter.

This post explains how the extensions do this; touches on some aspects of Raindrop's message processing and data storage models; and speculates about possible future directions for the extensions.

Raindrop Extensibility

Raindrop is being built with the explicit goal of being broadly and deeply extensible, and it includes a number of APIs for adding and modifying functionality. The mailing list enhancements comprise two related extensions, one in the backend and one in the user interface.

The backend extension plugs into Raindrop's incoming message processor, intercepting incoming email messages and extracting info about the mailing lists to which they belong. It also handles much of the work of unsubscribing from a list.

The frontend extension plugs into Raindrop's Inflow application, modifying its interface to show you the most recent mailing list messages at a glance, group mailing list conversations together by list, and provide a button you can press to easily unsubscribe from a mailing list.

Message Processing and Data Storage

Before getting into how the extensions work, it's useful to know a bit about how Raindrop processes and stores messages.

Raindrop stores information using CouchDB, a document-centric database whose principal unit of information storage and retrieval is the document (the equivalent of a record in SQL databases). Documents are just JSON blobs that can contain arbitrary name -> value pairs (unlike SQL records, which can only contain values for predeclared columns).

To distinguish between different kinds of documents, Raindrop assigns each a schema (similar to a table in SQL parlance) that describes (and may one day constrain) its properties. The rd.msg.email schema is the primary schema representing an email message, rd.mailing-list is the schema representing a mailing list, and rd.msg.email.mailing-list is a simple schema that associates messages with their lists.

(In an SQL database, rd.msg.email and rd.mailing-list would be tables whose rows represent email messages and mailing lists, while rd.msg.email.mailing-list would be a table whose rows map one to the other.)

Note that there's a many-to-one relationship between messages and lists: each message belongs to a single list, while a list contains many messages. So rd.msg.email.mailing-list isn't strictly necessary. Its list_id property (which identifies the list to which the message belongs) could simply be a property of rd.msg.email docs (or, in SQL terms, a foreign key in the rd.msg.email table).
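
To make the relationship concrete, here is a minimal sketch of what the three kinds of documents might look like, written as Python dicts. Apart from the schema names and the list_id property, every field name here (schema, key, id, name) is an assumption made for illustration and may not match what Raindrop actually stores.

```python
import json

# Hypothetical document shapes. Only the schema names and list_id come from
# the post; "schema", "key", "id", and "name" are illustrative field names.
email_doc = {
    "schema": "rd.msg.email",
    "key": "msg-1",
    "headers": {"list-id": ["Example List <example.lists.example.com>"]},
}
list_doc = {
    "schema": "rd.mailing-list",
    "id": "example.lists.example.com",
    "name": "Example List",
}
link_doc = {
    "schema": "rd.msg.email.mailing-list",
    "key": "msg-1",  # refers back to the message document
    "list_id": "example.lists.example.com",  # refers to the list document
}


def list_for_message(message, links, lists):
    """Follow the link document from a message to its mailing list, if any."""
    for link in links:
        if link["key"] == message["key"]:
            return next((l for l in lists if l["id"] == link["list_id"]), None)
    return None


if __name__ == "__main__":
    # Once serialized, each document is plain JSON.
    print(json.dumps(link_doc, indent=2))
    print(list_for_message(email_doc, [link_doc], [list_doc])["name"])
```

The link document carries nothing but the two references, which is why it could in principle be folded into the message document.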

But putting it into its own document has several advantages. First, it improves robustness, as it reduces the possibility of conflicts between extensions and core code writing to the same documents.

It also improves write performance, as it's faster to add a document than to modify an existing one (although index generation and read performance can be an issue).

Finally, it improves extensibility, because it makes it possible to write an extension that extends the backend mailing list extension.

That's because Raindrop's incoming message processing model allows extensions to observe the creation of any kind of document, including those created by other extensions.

So just as the mailing list extension observes the creation of rd.msg.email documents, another extension can observe the creation of rd.msg.email.mailing-list documents and process them further in some useful way. If the mailing list extension simply modified the original document instead of creating its own, that would require some additional and more complicated API.
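
The chaining idea can be sketched with a toy dispatcher. This is not Raindrop's processing pipeline: the Pipeline class, its observe and emit methods, and the counting extension are all hypothetical names invented for this illustration.

```python
from collections import defaultdict


class Pipeline:
    """Toy stand-in for an incoming message processor (hypothetical API)."""

    def __init__(self):
        self.observers = defaultdict(list)  # schema name -> handlers
        self.docs = []                      # every document created so far

    def observe(self, schema, handler):
        self.observers[schema].append(handler)

    def emit(self, schema, fields, source=None):
        # Create the document, then notify whoever observes its schema.
        doc = {"schema": schema, "source": source, **fields}
        self.docs.append(doc)
        for handler in list(self.observers[schema]):
            handler(self, doc)
        return doc


def mailing_list_extension(pipeline, doc):
    """Observes rd.msg.email docs and emits a separate link document."""
    values = doc.get("headers", {}).get("list-id")
    if values:
        pipeline.emit("rd.msg.email.mailing-list",
                      {"list_id": values[0]}, source=doc)


def make_counter(counts):
    """Builds a second extension that observes the first one's output."""
    def count_messages_per_list(pipeline, doc):
        counts[doc["list_id"]] = counts.get(doc["list_id"], 0) + 1
    return count_messages_per_list


if __name__ == "__main__":
    pipeline, counts = Pipeline(), {}
    pipeline.observe("rd.msg.email", mailing_list_extension)
    pipeline.observe("rd.msg.email.mailing-list", make_counter(counts))
    pipeline.emit("rd.msg.email", {"headers": {"list-id": ["a.example.com"]}})
    print(counts)
```

Because the first extension emits its own document rather than editing the message in place, the second one needs no special hook: it simply registers for another schema.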

The Backend Extension

The primary function of the backend extension is to examine every incoming message and dress the ones from mailing lists with some additional structured information that the frontend can use to organize them.

Each backend extension is accompanied by a JSON manifest that tells Raindrop what kinds of incoming documents the extension wants to intercept. The mailing list extension's manifest registers it as an observer of incoming rd.msg.email documents, which get created when Raindrop retrieves an email message:
    "schemas" : {
        "rd.ext.workqueue" : {
            "source_schemas" : ["rd.msg.email"],
            ...

The extension itself is a Python script with a handler function that gets passed the rd.msg.email document and looks to see if it contains a List-ID header (or, in certain cases, another identifier) identifying the mailing list from which the message comes:
    def handler(message):
        ...
        if 'list-id' in message['headers']:
            # Extract the ID and name of the mailing list from the list-id header.
            # Some mailing lists give only the ID, but others (Google Groups,
            # Mailman) provide both using the format 'NAME <id>', so we extract them
            # separately if we detect that format.
            list_id = message['headers']['list-id'][0]
            ...
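
The elided extraction step might look something like the following. This is a sketch of one way to split the 'NAME <id>' format, not the extension's actual code; the parse_list_id name and the regular expression are assumptions.

```python
import re

# Matches an optional display name followed by an ID in angle brackets,
# for example: Example List <example.lists.example.com>
_NAME_AND_ID = re.compile(r'^\s*(?P<name>.*?)\s*<(?P<id>[^<>]+)>\s*$')


def parse_list_id(value):
    """Return (list_id, name) from a List-ID header value.

    name is None when the header carries only an ID.
    """
    match = _NAME_AND_ID.match(value)
    if match:
        # Drop surrounding quotes from the name; an empty name becomes None.
        name = match.group("name").strip('"') or None
        return match.group("id"), name
    # No angle brackets: treat the whole value as the ID.
    return value.strip(), None


if __name__ == "__main__":
    print(parse_list_id("Example List <example.lists.example.com>"))
    print(parse_list_id("example.lists.example.com"))
```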

If it doesn't find a list identifier, it simply returns, and Raindrop continues processing the message:
    if not list_id:
        logger.debug("NO LIST ID; ignoring message %s", message_id)
        return

Otherwise, it calls Raindrop's emit_schema function to create an rd.msg.email.mailing-list document linking the message document to an rd.mailing-list document representing the mailing list:
    emit_schema('rd.msg.email.mailing-list', { 'list_id': list_id })

In this function call, rd.msg.email.mailing-list is the schema of the document to create, while { 'list_id': list_id } is the document itself, written as a Python dict that gets serialized to JSON.

A document created inside a backend extension like this automatically gets a reference to the document the extension is processing (i.e. the rd.msg.email document), so the only thing it has to explicitly include is a reference to the list document, in the form of a list_id property whose value is the list identifier.

The extension also checks if there's an rd.mailing-list document in the database for the mailing list itself, and if not, it creates one, populating it with information from the message's List-* headers, like how to unsubscribe from the list. Otherwise, it updates the existing mailing list document if the message's List-* headers contain updates.
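
The create-or-update logic can be sketched as follows, with a plain dict standing in for the database. The upsert_list_doc function, the property names, and the choice of which List-* headers to track are assumptions for illustration; the real extension talks to CouchDB through Raindrop's own APIs.

```python
# Hypothetical mapping from List-* headers to list document properties.
TRACKED_HEADERS = {
    "list-unsubscribe": "unsubscribe",
    "list-post": "post",
    "list-help": "help",
}


def upsert_list_doc(db, list_id, headers):
    """Create the list document if missing, else apply any header changes.

    db is a dict keyed by list ID; headers maps lowercase header names to
    lists of values. Returns "created", "updated", or "unchanged".
    """
    fresh = {prop: headers[name][0]
             for name, prop in TRACKED_HEADERS.items() if name in headers}
    doc = db.get(list_id)
    if doc is None:
        db[list_id] = {"id": list_id, "status": "subscribed", **fresh}
        return "created"
    changed = {k: v for k, v in fresh.items() if doc.get(k) != v}
    if changed:
        doc.update(changed)
        return "updated"
    return "unchanged"


if __name__ == "__main__":
    db = {}
    headers = {"list-unsubscribe": ["<mailto:wasbigtalk-admin@example.com>"]}
    print(upsert_list_doc(db, "wasbigtalk.example.com", headers))
    print(upsert_list_doc(db, "wasbigtalk.example.com", headers))
```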

The Frontend Extension

The frontend extension uses the information extracted by the backend to help users manage mailing lists in the Inflow application.

It adds a widget to the Home view that shows you the last few messages from your lists at the bottom of the page, so you can keep an eye on those messages without having to give them your full attention:

It adds a list of your mailing lists to the Organizer widget:

And when you click on the name of a list, it shows you its conversations in the conversation pane:

In traditional mail clients, users who want to break out their list messages into separate buckets like this typically have to create a folder for each list to contain its messages and then a filter for each list to move incoming list messages into the appropriate folders. The extension does this for you automatically!

Finally, while viewing list conversations, if the extension knows how to unsubscribe you from the list, it displays an Unsubscribe button:

Pressing the button (and then confirming your decision) unsubscribes you from the list. You don't have to do anything else, like remembering your username/password for some web page, sending an email, or confirming your request with the list admin. The extensions handle all those details for you so you don't have to know about them!

List Unsubscription

In case you do want to know the details, however, it goes like this...

First, the frontend extension sends a message to the list's admin address requesting unsubscription, with a certain command (like "unsubscribe") in the subject or body of the message (lists often specify exactly what command to send in the mailto: link they include in the List-Unsubscribe header):
From: Jan Reilly <jan@example.com>
To: wasbigtalk-admin@example.com
Subject: unsubscribe
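
As a rough sketch of how the command could be read out of a List-Unsubscribe header, the following picks the first mailto: link and pulls out its address and subject. The unsubscribe_request function is hypothetical, and falling back to the subject "unsubscribe" when the link specifies none is an assumption, not something every list server accepts.

```python
import re
from urllib.parse import parse_qs, unquote, urlsplit


def unsubscribe_request(header_value):
    """Return the address, subject, and body from the first mailto: link.

    List-Unsubscribe values hold one or more URIs in angle brackets,
    separated by commas. Returns None if there is no mailto: link.
    """
    for uri in re.findall(r'<([^<>]+)>', header_value):
        parts = urlsplit(uri.strip())
        if parts.scheme.lower() != "mailto":
            continue
        # Header field names in mailto: links are case-insensitive.
        params = {k.lower(): v for k, v in parse_qs(parts.query).items()}
        return {
            "to": unquote(parts.path),
            # Assumed default when the link names no subject.
            "subject": params.get("subject", ["unsubscribe"])[0],
            "body": params.get("body", [""])[0],
        }
    return None


if __name__ == "__main__":
    header = ("<http://example.com/unsubscribe>, "
              "<mailto:wasbigtalk-admin@example.com?subject=unsubscribe>")
    print(unsubscribe_request(header))
```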

Then the server responds with a message requesting confirmation of the request, often putting a unique token into the Subject or Reply-To header to track the request:
From: wasbigtalk-admin@example.com
To: jan@example.com
Subject: please confirm unsubscribe from wasbigtalk (4bc3b7e439fd)

Hello jan@example.com,

We have received a request to unsubscribe you from wasbigtalk.
Please confirm this request to unsubscribe by replying to this email.
...

Then the backend extension responds with a message confirming the request that includes the unique token:
From: jan@example.com
To: wasbigtalk-admin@example.com
Subject: Re: please confirm unsubscribe from wasbigtalk (4bc3b7e439fd)
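
The confirmation step could be sketched like this, using Python's standard email package. The token pattern (a hexadecimal string in parentheses at the end of the subject) is an assumption based only on the sample message above; real servers vary, which is why the extensions need per-server handling. The function names are hypothetical.

```python
import re
from email.message import EmailMessage

# Assumed token format: hex digits in parentheses at the end of the subject.
TOKEN = re.compile(r'\(([0-9a-f]+)\)\s*$')


def extract_token(subject):
    """Return the confirmation token from a subject line, or None."""
    match = TOKEN.search(subject)
    return match.group(1) if match else None


def build_confirmation(request):
    """Build a reply that echoes the subject (and thus the token).

    Returns None if the request carries no recognizable token.
    """
    subject = str(request["Subject"])
    if extract_token(subject) is None:
        return None
    reply = EmailMessage()
    reply["From"] = str(request["To"])
    # Prefer Reply-To when the server sets it, else reply to the sender.
    reply["To"] = str(request.get("Reply-To", request["From"]))
    reply["Subject"] = "Re: " + subject
    return reply


if __name__ == "__main__":
    request = EmailMessage()
    request["From"] = "wasbigtalk-admin@example.com"
    request["To"] = "jan@example.com"
    request["Subject"] = "please confirm unsubscribe from wasbigtalk (4bc3b7e439fd)"
    print(build_confirmation(request))
```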

Finally, the server responds with a message confirming that the subscriber has, indeed, been unsubscribed:
From: wasbigtalk-admin@example.com
To: jan@example.com
Subject: you have been unsubscribed from wasbigtalk

Hello jan@example.com,

Your unsubscription from wasbigtalk was successful.
...

At this point, the backend extension marks the list unsubscribed in the database, and the frontend extension marks it unsubscribed in the user interface.
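
One way to picture the whole exchange is as a small state machine over the list document's status. The status and event names below are invented for this sketch and are not necessarily the values the extensions store.

```python
# Hypothetical statuses and events for the four-message exchange above.
TRANSITIONS = {
    ("subscribed", "request_sent"): "unsubscribe-pending",
    ("unsubscribe-pending", "confirmation_requested"): "unsubscribe-confirmed",
    ("unsubscribe-confirmed", "unsubscribed_notice"): "unsubscribed",
}


def advance(status, event):
    """Return the next status, ignoring events that don't apply."""
    return TRANSITIONS.get((status, event), status)


if __name__ == "__main__":
    status = "subscribed"
    for event in ("request_sent", "confirmation_requested",
                  "unsubscribed_notice"):
        status = advance(status, event)
        print(event, "->", status)
```

Ignoring out-of-order events keeps a stray or duplicated admin message from marking a list unsubscribed prematurely.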

This process matches the way much mailing list server software works, although there are daemons in the details, so the extensions have to be programmed to support each server individually.

Currently, they know how to handle Google Groups and Mailman lists. Majordomo2 (used by the Bugzilla and OpenBSD projects, among others) is not supported, because it doesn't send List-* headers (although supposedly it can be configured to do so). The W3C's list server is not yet supported, although it does send List-* headers, and support should be fairly easy to add.

Note that some of the processing the extension does is (locale-dependent) "screen"-scraping, as Google Groups and Mailman don't consistently identify the list ID and message type in some of their correspondence. In the long run, hopefully server software will improve in that regard. Perhaps someone can spearhead an effort to make it so?

The Future

The extensions' current features fit in well with Raindrop's goal of helping people better handle their flood of incoming messages. But there is surely much more they could do to help in this regard.

Besides general improvements to reliability and robustness--like support for additional list servers and handling of localized admin messages--they could let you resubscribe to a mailing list from which you've unsubscribed. And perhaps they could automatically fetch the messages you missed while you were away. Or even retrieve the entire archive of a list to which you're subscribed, so you can browse the archive in Raindrop!

What bugs you about mailing lists? And how might Raindrop's mailing list extensions make them easier (and even funner) to use?

7 comments:

Christopher said...

Is the extension available?

Cheers,

christopher

Myk said...

Christopher: yes, it's available with Raindrop itself! It's an "extension" in the sense that it plugs into Raindrop exclusively using Raindrop's extension APIs, although Raindrop comes with it by default because its functionality seems useful enough to enough people to warrant it being part of the default bundle.

So you can get it the same way you get Raindrop itself, for which see Installing Raindrop.

andrew james said...

I am happy to find raindrop.

As I hurried to read this log of your work with mail lists, a service that I use often, I am interested to how raindrop compares to thunderbird.

In thunderbird, if a mail list is at a reserver like gmane, I can browse the archive. To do that, create an account type 'newsgroup'. But, I think, that type of account is disorderly for some uses.

Examples of disorderlyness,

limit of one mail address per account but I want to create one account for gmane. Gmane is thousands of groups, sometime I may want to link a mail account per group, or not.

There is no easy 'subscribe' to group and also to subscribe with a specific mail address. For some groups, 'subscription' is necessary to send messages.

Raindrop may have the 'subscribe' but one conclusive request of the log was to link messages from archives. Servers like 'gmane' or 'mailman' are likely the easiest source of archives to implement.

Archives are useful for example, to find answers to questions in old conversations rather than duplicate a question.

The problem is to program each server as an extension. Another problem is to prevent or dispose duplicate mail to the address.

For example a hypothetical in ideal case solution is to use 'mailman' to subscribe for permit to send messages and switch 'no mail delivery'. Then, all messages for that group are served from gmane.

One limit is that not all groups have archives at servers like gmane. Then, the result is a 'sorry no archives for group' message or something polite.


andrew

Christopher said...

Ah, that makes sense. Thanks for the quick response. :)

trisha, graphics designer said...

Looks interesting, I might give it a try, I keep trying to find a really good email client..thanks for sharing.

gfhuertac said...

Hi:
I created a new extension based on the one that transforms the body in quoted, but I fail to install it.
Is it enough just to copy the .py and .json at the couch_docs/wq folder?
If I run the run_raindrop with the add-schemas option there is an error about Twisted.
Any hints?

Myk said...

gfhuertac: I think copying the extension files to couch_docs/wq and then rerunning run-raindrop.py is sufficient, although I'm not positive.

Perhaps you can post about your problem to the raindrop discussion forum or in the chat room? All the smart Raindrop hackers are in those places and much better able to help.