2006-02-07

microsummaries feature proposal and prototype

After receiving some positive feedback on my "son of live bookmarks" idea, I wrote up a more comprehensive description and proposal for a Firefox feature that supports the display and updating of microsummaries of web page content (tip o' the hat to Mike Shaver for the term "microsummaries").

And then, since it's easier to understand a proposal with some working sample code, I hacked up a prototype in the form of an extension you can install into a Places-enabled build (milestone 2 or newer).

The prototype comes with built-in support for three kinds of microsummaries: eBay auction items, Yahoo! Finance stock quotes, and Merriam-Webster's Word of the Day. Just throw a link to one of those pages onto your bookmarks toolbar, and the prototype will start updating it regularly with pertinent information (the price/time left in the auction, stock symbol/price, and word of the day, respectively).

Thanks to Brian Slesinsky, who graciously trilicensed the XPath Checker code, the prototype also supports user-defined microsummaries. Just context-click on some text in a page, then select "Watch [the text]" from the context menu. The prototype will add a bookmark to your bookmarks toolbar whose title is the text you clicked on, and then update that title regularly.
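To give a feel for what "watching" a piece of text involves, here is a minimal sketch of deriving a positional XPath for a clicked element so that it can be found again on a later reload. This is not the XPath Checker code and not what the prototype does; the function name `xpath_for` and the sample markup are made up for illustration, and it uses Python's standard library, which understands only a subset of XPath.

```python
import xml.etree.ElementTree as ET

def xpath_for(root, target):
    """Build a positional path from root down to target (hypothetical helper)."""
    # ElementTree elements have no parent pointer, so map each child to its parent.
    parents = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node is not root:
        parent = parents[node]
        # XPath positions are 1-based and count only siblings with the same tag.
        same_tag = [c for c in parent if c.tag == node.tag]
        steps.append(f"{node.tag}[{same_tag.index(node) + 1}]")
        node = parent
    return "./" + "/".join(reversed(steps)) if steps else "."

# Made-up page; the <b> element stands in for the text the user clicked on.
page = ET.fromstring(
    "<html><body><p>Intro</p><p>Lines changed: <b>42</b></p></body></html>"
)
clicked = page.find("body")[1][0]
path = xpath_for(page, clicked)
watched_text = page.find(path).text
```

A purely positional path like this stops matching as soon as the page layout changes, which is one reason a real implementation would probably want something more robust.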

Here's a screenshot that demonstrates microsummaries for the three built-in types plus a user-defined microsummary (the number of lines changed in a Bonsai query for "Places checkins in the last day"):



Note that this is an early prototype. It has limited functionality and lots of bugs, and it relies on the rapidly evolving Places code (so it is susceptible to bustage). Don't rely on it or expect any eventual Firefox feature to work like it. It's just a proof of concept.

Also note that it'll take up to 15 seconds for a bookmark you add to the toolbar to start showing a microsummary. After that, the extension will update the microsummary every 30 minutes.
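The timing described above can be pictured with a small sketch. Only the two numbers come from this post; the idea of a periodic check that runs every 15 seconds, and the names `CHECK_INTERVAL_SECONDS`, `UPDATE_INTERVAL_SECONDS`, and `is_due`, are assumptions for illustration and not the extension's actual code.

```python
from typing import Optional

CHECK_INTERVAL_SECONDS = 15        # assumed: how often a timer looks for work
UPDATE_INTERVAL_SECONDS = 30 * 60  # from the post: refresh every 30 minutes

def is_due(last_updated: Optional[float], now: float) -> bool:
    """Return True if a bookmark should be refreshed on this check."""
    if last_updated is None:
        # A newly added bookmark has no microsummary yet, so it is picked up
        # on the next check, at most CHECK_INTERVAL_SECONDS after it was added.
        return True
    return now - last_updated >= UPDATE_INTERVAL_SECONDS
```

With a check like this, a bookmark refreshed at time 0 would be skipped at every check until 1800 seconds have passed.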

Finally, note that Places mucks with your profile, migrating history and bookmarks to new databases. Make sure you know what you're doing (or are using a fresh profile) when you try out a Places-enabled build.

If I haven't scared you away yet, then give it a try, and let me know what you think.

Microsummarizer 0.1

7 comments:

Grauw said...

Hey,

I hope you will also use page metainformation, and maybe see some potential to combine this with RDF.


~Grauw

Grauw said...

By the way, I love the Places stuff :). Although it’s still a bit slow, and it would be great if the address bar behaved like a Places history search as well.

To elaborate on my previous reply, e.g. the website of the company that I work at (http://www.backbase.com/) contains a meta keywords (and description) element. The keywords contain amongst others ‘SPI’, and it would be cool if it would then show up when I search for ‘SPI’ as well.

In the case of your extension, I could add a link of type ‘alternate’ to my website, pointing to a description of the page contents in RDF (e.g. auction item information). That would be a more structured and general approach, and easier to parse than a specific extraction rule for each page, which might break when the page is redesigned.

You could also do stuff like hovering over a bookmark showing a tooltip with a larger amount of information, e.g. author and last 5 posts for a blog, number of bids for eBay, or a list of available beers and their breweries (see the beer ontology).

That would be totally awesome integration, and also stimulate people to add useful and interesting RDF representations of information.

You could then add default support and some understanding for a couple of common ontologies (and extend that list with new releases, or allow some method for this to be updated), making this the default generic method for websites to add metainformation to their sites.

The only purpose of the site-specific ‘scrapers’ (as I believe they are called) would then be somewhat of an interim solution until the web is at 2.0 (or was it 3.0?), and adding RDF information is common, as of course in the end it’s not really maintainable to create such a scraper for every site out there.

Of course, this should all then really be built into Firefox by default, and not come as an extension, so that people can really start using it :).

Ooh, I can’t wait! ^_^


~Grauw

Myk said...

"I hope you will also use page metainformation, and maybe see some potential to combine this with RDF."

Indeed, I'd like it to be possible for pages to specify their own microsummaries, either with meta tags (for simple, embeddable summaries) or with link rel tags pointing to other resources that represent those summaries (for more complicated summaries, or to reduce the load cost of reloading the original page).

Those other resources could in some cases be feeds.
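As a rough sketch of what discovering such page-supplied microsummaries could look like: the `microsummary` value for the meta `name` and link `rel` attributes is purely hypothetical (nothing is specified yet), as are the class name and the sample markup. It uses only Python's standard-library HTML parser.

```python
from html.parser import HTMLParser

class MicrosummaryFinder(HTMLParser):
    """Collect an embedded summary and links to external summary resources."""

    def __init__(self):
        super().__init__()
        self.inline_summary = None   # from a meta tag (simple, embeddable case)
        self.summary_urls = []       # from link rel tags (separate resources)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # "microsummary" is an assumed name, not an agreed-upon convention.
        if tag == "meta" and attributes.get("name") == "microsummary":
            self.inline_summary = attributes.get("content")
        elif tag == "link" and attributes.get("rel") == "microsummary":
            href = attributes.get("href")
            if href:
                self.summary_urls.append(href)

finder = MicrosummaryFinder()
finder.feed(
    "<html><head>"
    '<meta name="microsummary" content="ACME 41.20 +0.35">'
    '<link rel="microsummary" href="/quote/ACME/summary.txt">'
    "</head><body></body></html>"
)
```

A client could prefer the linked resource when one is present, since fetching a small summary is cheaper than reloading the whole page.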

Ted Mielczarek said...

For the "built-in" pages it recognizes, does it just have a built in XPath expression, or is it using an API provided by the site? I know eBay has an API, but their terms of use might be incompatible with a Firefox extension. It certainly seems like the place you'd want to use it, though. Anyway, looks neat! I downloaded a Places build so I can try it out later. :)

Ben Goodger said...

Myk, this is an awesome idea! I look forward to seeing future developments.

Myk said...

Ted Mielczarek said:

"For the 'built-in' pages it recognizes, does it just have a built in XPath expression, or is it using an API provided by the site?"

It just has a built-in XPath expression, or, more accurately, it has an XPath template consisting of one or more curly-bracket-delimited XPath expressions interspersed with other text.

This way the built-in support can include more than one snippet of page content, and it can separate those snippets with literal text. For the eBay support (which I call a microsummary "definition" for lack of a better term), the template looks something like this:

{expr selecting first five characters of name}... {expr selecting current bid} - {expr selecting time left}
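To make the template idea concrete, here is a minimal sketch of expanding such a template against a parsed page. It is an illustration only, not the prototype's code: the function name `expand_template`, the sample markup, and the expressions are all made up, and because Python's standard library supports only a subset of XPath, the "first five characters" truncation is left out.

```python
import re
import xml.etree.ElementTree as ET

def expand_template(template, root):
    """Replace each {xpath} in the template with the text of the first match."""
    def substitute(match):
        node = root.find(match.group(1))
        if node is None:
            return ""  # an expression that matches nothing contributes no text
        return "".join(node.itertext()).strip()
    # Everything outside the curly brackets is copied through as literal text.
    return re.sub(r"\{([^{}]+)\}", substitute, template)

# Made-up auction page and expressions, standing in for the real definition.
page = ET.fromstring(
    "<html><body>"
    "<h1 id='name'>Vintage camera</h1>"
    "<span id='bid'>$12.50</span>"
    "<span id='left'>2h 5m</span>"
    "</body></html>"
)
template = "{.//h1[@id='name']}: {.//span[@id='bid']} - {.//span[@id='left']}"
summary = expand_template(template, page)
# summary is "Vintage camera: $12.50 - 2h 5m"
```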

"I know eBay has an API, but their terms of use might be incompatible with a Firefox extension. It certainly seems like the place you'd want to use it, though."

It makes sense for Firefox to take advantage of better APIs for accessing summary info, but I also want to make sure users retain the power to define their own summaries, so support for XPath expressions is important, even if ultimately there will be standard ways for pages to specify their own microsummaries.

Isofarro said...

Very interesting and useful idea. It feels like a micro-scale RSS feed, but with customised data.

Since it seems to work along the lines of RSS, polling the end website, I strongly urge you to follow what's been a convention in RSS circles of not polling more often than once per hour. (I see you mention polling every 30 minutes.)

Look forward to seeing this as a Firefox extension (or even a XULRunner-based app :-)