Monthly Archives: May 2009

Tarpipe REST connector in 5 minutes

Tarpipe implemented a REST connector a short while ago. This is something that I and others have been wanting for a while now, so it’s great news. The announcement was quite short and didn’t have much detail. I like to see things visually, and I’m guessing others do too, so I decided to write a little handler to receive a sample request from the REST connector and dump it out for inspection.

As Bruno showed in the announcement, this is what the REST connector looks like:

Tarpipe REST connector

It takes whatever values it receives in the title, description and link input fields on the left-hand side of the connector, constructs a piece of JSON from them, and sends that JSON as a data=<JSON> name/value pair, in application/x-www-form-urlencoded format, in the message body of an HTTP POST request to the resource specified in the serviceUrl field.

So if we pass “DJ’s Weblog” into the title field, “Reserving the right to be wrong” into the description field, “http://www.pipetree.com/qmacro/blog/” into the link field, and “http://example.org/bucket/” into the serviceUrl field, the following HTTP request is made on the http://example.org/bucket/ resource:

POST /bucket/ HTTP/1.1
Content-Length: 218
Content-Type: application/x-www-form-urlencoded
Host: example.org
Accept: */*

data=%7B%22items%22%3A%5B%7B%22title%22%3A%22DJ%27s+Weblog%22%2C%22description
     %22%3A%22Reserving+the+right+to+be+wrong%5Cn%22%2C%22link%22%3A%22http%3A
     %5C%2F%5C%2Fwww.pipetree.com%5C%2Fqmacro%5C%2Fblog%5C%2F%22%7D%5D%7D

(whitespace added by me for readability).

When decoded and pretty-printed, that message body looks like this:

data={
    "items":[
       {
           "title":"DJ's+Weblog",
           "description":"Reserving+the+right+to+be+wrong",
           "link":"http://www.pipetree.com/qmacro/blog/"
       }
    ]
}

This is what your app gets to process.
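
For the curious, here’s a rough sketch of the sort of dump handler I mean. It assumes a plain CGI environment with the CGI, JSON and Data::Dumper modules to hand; the details are illustrative rather than anything official from tarpipe.

#!/usr/bin/perl
# Rough sketch of a handler that dumps what the REST connector sends.
# Assumes a plain CGI environment; nothing here is tarpipe-specific.
use strict;
use warnings;
use CGI;
use JSON;
use Data::Dumper;

my $q = CGI->new;

# The connector POSTs data=<JSON> as application/x-www-form-urlencoded,
# so CGI.pm has already form-decoded the 'data' parameter for us
my $data    = $q->param('data');
my $payload = from_json($data);

# Echo the decoded structure back so it can be inspected
print $q->header('text/plain');
print Dumper($payload);

# Each item carries the title, description and link input field values
for my $item (@{ $payload->{items} }) {
    print map { "$_: $item->{$_}\n" } qw(title description link);
}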

Bruno said that the format was chosen to be compatible with the Yahoo! Pipes Web Service Module, and it sure is — look at this example from the Web Service Module documentation:

data={
    "items":[
       {
           "title": "First Title",
           "link": "http://example.com/first",
           "description": "First Description"
       },
       {
           "title": "Last Title",
           "link": "http://example.com/last",
           "description": "Last Description"
       }
    ]
}

And what about those three output fields on the right hand side of the REST connector? Well, if your app returns a response with JSON in the body — this time not as a name/value pair, but as pure JSON — like this:

{
  "items":[
     {
         "title": "The response!",
         "description": "Long text description of the response",
         "link": "http://example.org/banana/"
     }
  ]
}

then the workflow can continue and you can connect those values in the corresponding title, description and link output fields as input to further connectors.
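
By way of illustration, here’s a sketch of what that response side might look like in the same sort of CGI handler; the item values are just the made-up ones from the example above.

#!/usr/bin/perl
# Sketch of a reply that lets the tarpipe workflow continue: the body is
# pure JSON this time, not a data=<JSON> name/value pair.
use strict;
use warnings;
use CGI;
use JSON;

my $q = CGI->new;

# Illustrative values only; your app would build these from its own results
my $response = {
    items => [
        {
            title       => "The response!",
            description => "Long text description of the response",
            link        => "http://example.org/banana/",
        },
    ],
};

print $q->header('application/json');
print to_json($response);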

Happy tarpiping!

Twitter’s success

Yes yes, I know I’m late to the game, and everyone and his dog has given their angle on why Twitter is so successful, but I’d like to weigh in with a few thoughts too. They’re thoughts that came together while I was chatting to Ian Forrester (@cubicgarden) at a GeekUp event in Manchester last week.

Messaging Systems

Back in the day, I talked about, wrote about and indeed built interconnected messaging systems based around the idea of a message bus with human, system and bot participation. The fundamental idea was one or more channels, rooms or groupings of messages; messages that could originate from any participant and likewise be filtered, consumed and acted upon by any other. I wrote a couple of articles positing that bots might be the command line of the future.

Using my favourite messaging protocol, I built such a messaging system for an enterprise client. This system was based around a series of rooms, and had a number of small-but-perfectly-formed agents that threw information onto the message bus, information such as messages resulting from monitoring systems across the network (“disk space threshold reached”, “System X is not responding”, “File received from external source”, etc) and messages from SAP systems (“Sales Order nnn received”, “Transport xxx released”, “Purchase Order yyy above value z created”, etc). It also had a complement of agents that listened to that RSS/Atom-sourced stream of enterprise consciousness and acted upon messages they were designed to filter — sending an SMS message here, emailing there, re-messaging onto a different bus or system elsewhere.

So what does this have to do with Twitter? Well, Twitter is a messaging system too. And Twitter’s ‘timeline’ concept is similar to the above message groupings. People, systems and bots can and do (I hesitate to say ‘publish’ and ‘subscribe to’ here) create, share and consume messages very easily.

Killer Feature

But the killer feature is that Twitter espouses the guiding design principle:

Everything has a URL

and everything is available via the lingua franca of today’s interconnected systems — HTTP. Timelines (message groupings) have URLs. Message producers and consumers have URLs. Crucially, individual messages have URLs (this is why I could refer to a particular tweet at the start of this post). All the moving parts of this microblogging mechanism are first class citizens on the web. Twitter exposes message data as feeds, too.

Even Twitter’s API, while not entirely RESTful, is certainly facing in the right direction, exposing information and functionality via simple URLs and readily consumable formats (XML, JSON). The simplest thing that could possibly work usually does, enabling the “small pieces, loosely joined” approach that lets you pipeline the web, like this:

dj@giant:~$ GET http://twitter.com/users/show/qmacro.json |
              perl -MJSON -e "print from_json(<>)->{'location'},qq/\n/"
Manchester, England
dj@giant:~$

None of this opaque, heavy and expensive SOA stuff here, thank you very much.

Other Microblogging Systems and Decentralisation

And does this feature set apply only to Twitter? Of course not. Other microblogging systems, notably laconi.ca (best known through its public instance, identi.ca), follow these guiding design principles too.

What’s fascinating about laconi.ca is that just as a company that wants to keep message traffic within the enterprise can run their own mail server (SMTP) and instant messaging & presence server (Jabber/XMPP), so also can laconi.ca be used within a company for instant and flexible enterprise social messaging, especially when combined with enterprise RSS. But that’s a story for another post :-)

Analysing CV searches with Delicious

I put my CV online recently, and having the machine that serves this website (an iMac running Ubuntu Linux) sitting in the study, I can almost ‘feel’ the HTTP requests entering the house, going down the wire, and being served, like lumps travelling down a pipe in a Tom & Jerry cartoon.

So I was thinking about doing something more useful with Apache’s access log than what I already get from the excellent Webalizer. Inspired (as ever) by Jon Udell’s “ongoing fascination with Delicious as a user-programmable database”, I decided to pipe the access log into a Perl script and pull out all the Google search referrer URLs that led to /qmacro/CV.html. For every referrer URL found, I grabbed the query string that was used and split it into words, removing noise. I also made a note of the top-level domain of the Google hostname – a very rough indication of where queries were coming from.
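
The referrer-mining step is nothing fancy; roughly along these lines, where the log format assumptions, the regexes and the noise-word list are rough-and-ready illustrations rather than the exact script I run.

#!/usr/bin/perl
# Rough sketch: pull Google referrers that led to /qmacro/CV.html out of a
# combined-format Apache access log, split the query into keywords and note
# the Google top-level domain. Noise words and regexes are illustrative.
use strict;
use warnings;
use URI;

my %noise = map { $_ => 1 } qw(the and for with a of in);

while (my $line = <>) {
    # combined log format: ... "GET /path HTTP/1.x" status bytes "referer" "agent"
    next unless $line =~ m{"GET /qmacro/CV\.html[^"]*"};
    next unless $line =~ m{"(http://(?:www\.)?google\.[^"]+)"};

    my $referrer = URI->new($1);
    my %params   = $referrer->query_form;
    next unless defined $params{q};

    # crude TLD grab: google.co.uk just yields 'uk'
    my ($tld) = $referrer->host =~ /\.(\w+)$/;

    my @keywords = grep { length && !$noise{ lc $_ } } split /\s+/, $params{q};

    print join(" ", $referrer->as_string, $tld, @keywords), "\n";
}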

But rather than create a database, or even an application, to analyse the results, I just posted the information as bookmarks to Delicious (after a simple incantation of perl -MCPAN -e ‘install Net::Delicious’ – just what I needed, thanks!).
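
The posting step then boils down to a loop around Net::Delicious’s add_post. Something like this sketch, with placeholder credentials, and assuming as input the keyword lines emitted by the referrer-mining sketch above.

#!/usr/bin/perl
# Sketch: turn each referring search URL into a Delicious bookmark, tagged
# with the query keywords plus the common 'cvsearchkeywords' grouping tag.
# Credentials and the input line format are placeholders.
use strict;
use warnings;
use Net::Delicious;

my $delicious = Net::Delicious->new({
    user => "qmacro",     # placeholder credentials
    pswd => "secret",
});

# expects lines of the form: <referrer-url> <tld> <keyword> <keyword> ...
while (my $line = <>) {
    chomp $line;
    my ($url, $tld, @keywords) = split /\s+/, $line;

    $delicious->add_post({
        url         => $url,
        description => "CV search via google.$tld",
        tags        => join(" ", "cvsearchkeywords", $tld, @keywords),
    });
}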

Delicious *is* a database, and by its very nature and purpose has a flavour that lends itself very well to loosely coupled data processing and manipulation. It’s about URLs and tags. It’s about adding data, replacing data, removing data. Basic building blocks and functions. Every item in the database has, and is keyed by, a URL, and as such, every item is recognised and treated as a first class citizen on the web. Even the metadata (tag information) is treated the same.

So what did I end up with? Well, for a start, I have a useful collection of referring CV search URLs, the collection being made via a common grouping tag ‘cvsearchkeywords’ that I assigned to each Delicious post in addition to the tags derived from the query string.

CV search keywords on Delicious

I also have a useful analysis of the search keywords, in the list of “Related Tags” – tags related to the common grouping tag. I can see right now, for example, that beyond the obvious ones such as “cv”, popular keywords are abap, architect and developer. What’s more, that analysis is interactive. Delicious’s UI design, and moreover its excellent URL design, mean that I can drill down and across to find out, for example, which keywords were commonly used together.
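
For instance (the username and tag combinations here are just illustrative), intersecting the grouping tag with one or more keyword tags is simply a matter of composing the URL:

http://delicious.com/qmacro/cvsearchkeywords+abap
http://delicious.com/qmacro/cvsearchkeywords+architect+developer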

That collection, and that analysis, will grow automatically as soon as I add the script to the logrotate mechanism on the server. That is, of course, assuming people remain interested in my CV!

And my favourite referrer search string so far? “How to write a CV of a DJ” :-)