Project OSRM is audiolizing the server query rate on our demo site. It’s kind of like a Geiger counter for monitoring usage, letting people feel what is happening in an abstracted way. An often (implicitly) applied technique for this is Well-Known Patterns: things you recognize without thinking. Like green in traffic meaning go. Or an X that marks the spot on a pirate map. Or even the famous pictograms Otl Aicher designed for the 1972 Summer Olympics in Munich. The list goes on and on, but these things have one thing in common: the combination of simplicity and immediate recognizability. In our case, I went with a Geiger counter. Here is what last night sounded like (mp3, 93kb):
Listening to it, each click is a distinct query that was answered. Each click is a piece of communication that is sent and connects us to someone else. You immediately get an idea of how much traffic is going on. No charts, no numbers, only the power of sound and simplicity.
When it comes to data visualization, all sorts of patterns spring to mind. Arrows going up or down, pesky pie charts, columns, colors, dots, dashes, lines. The list is endless. And more often than not, these building blocks are plugged together without thought, either telling the story wrong or at least making it harder to comprehend. Or not telling a story at all, because nothing is abstracted from the raw data. And the more complex visualizations become, the less immediately recognizable they are.
In terms of OSRM, I wanted to make it clear to a non-technical person how much is happening right now on our server, i.e. how many queries we serve. The focus, again, is on right now, and it has to be simple. We don’t need fancy charts because we don’t need history. We want to be able to plug into what’s happening and get out as soon as we are happy. The naive way would be to pick a period of time, let’s say 10 seconds, and display something that represents this interval, e.g. a column that changes its height and perhaps also its color depending on the number. This is naive for two reasons. First, it is boring and has no aesthetic appeal: it is a rectangle changing its size. If it had beauty, one would enjoy watching it, and that’s hardly imaginable. Second, it’s not what we set out to do. It gives us information about an arbitrary interval, in our case 10 seconds. We could make the interval smaller to capture the current moment, but then we don’t see much. Or we could make it larger to aggregate more, but then … you probably guessed that already.
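To make the naive approach concrete, here is a minimal sketch that counts queries per fixed window. It assumes a hypothetical log format in which the first field of every line is a Unix timestamp in seconds; the real OSRM log layout may well differ, and `count_per_window` is a name made up for this illustration.

```shell
# Naive interval counting: bucket log lines into fixed windows.
# ASSUMPTION: field 1 of each line is a Unix timestamp in seconds
# (hypothetical format, not taken from the actual OSRM query log).
count_per_window() {
    # $1 = window length in seconds (defaults to 10)
    awk -v w="${1:-10}" '
        { n[int($1 / w) * w]++ }              # start of the window this line falls into
        END { for (t in n) print t, n[t] }    # window start, number of queries
    ' | sort -n
}

# Usage: four fake queries, 10-second windows.
printf '100 q\n103 q\n109 q\n112 q\n' | count_per_window 10
```

Changing the window length changes the picture completely, which is exactly the arbitrariness described above.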
Recall that we need to capture the moment. Moments are funny things. They do not have fixed beginnings or endings. A moment is a piece of now taken from a stream, and conceptually the stream of nows runs forever. Also, we don’t know when an actual query is going to occur. We assume our users to be independent of each other. At most, we can say something about the probability that a query occurs (by looking at query logs). Assume we had a log covering several minutes and we looked at the number of queries in each minute. It is fair to assume that the numbers are different. The more minutes we analyze, the better our idea of the average becomes. The distribution of these per-minute counts is described by what is called a Poisson distribution.
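For reference, this is the standard form of the distribution. The symbols are introduced here for illustration and do not come from our logs: N is the number of queries in one minute and λ is the average number of queries per minute, assuming queries arrive independently at a constant average rate.

```latex
P(N = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \dots
\qquad\text{with}\qquad
\mathbb{E}[N] = \operatorname{Var}[N] = \lambda .
```

A single number, λ, describes the whole distribution, and it is the quantity a longer log lets us estimate better.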
The Poisson distribution shows up in many real-life situations, for example in Geiger counters, which measure the rate of radioactive decay. We all recall the clicking sound. Here’s a simplified explanation of how it works. Each time an ionizing particle hits the detector, it produces a brief pulse of current, which in turn drives a speaker. As it is a simple spike in current, it generates a click. The more events the counter detects, the more clicks it generates. Imagine a strong source of radiation and its burst of clicks. This is our well-known pattern. We are not building a visualization with it. We are building an audio-lization!
What we need to generate is a click each time we serve a query. This is easy to do given that we have access to a server log. In our case, the log lives on a Linux server and an OS X machine plays the click. Long story short, here is a quick shell script:
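#!/bin/bash
USER=user
SERVER=my.machine.com
PATH_TO_LOG=/path/to/query.log
AWK_MODE=                 # set to -Winteractive to disable buffering on Linux
export PLAY_TOOL=afplay   # use "aplay -q" on Linux
export PATH_TO_CLICK_WAV=/path/to/geiger_click.wav

# Follow the remote log and play one click per new line, in the background.
# The exported variables are expanded by the shell that awk's system() starts.
ssh "$USER@$SERVER" tail -f "$PATH_TO_LOG" | awk $AWK_MODE '{system("$PLAY_TOOL $PATH_TO_CLICK_WAV &")}'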
And of course, here is the click file (wav, 10k). Like every piece of OSRM, it is BSD licensed.
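If you want to try the idea before pointing it at a real server, here is a dry-run sketch with no ssh and no audio. It swaps the player for `echo click` and feeds a few made-up log lines through the same awk stage; `dry_run_clicks` and the sample lines are hypothetical names used only for this illustration.

```shell
# Dry run: one "click" per log line, printed instead of played.
# ASSUMPTION: "echo click" stands in for afplay/aplay; there is no wav file.
export PLAY_TOOL="echo click"
export PATH_TO_CLICK_WAV=""

dry_run_clicks() {
    # Same awk stage as the script, minus the trailing "&" so that
    # the output order is deterministic.
    awk '{ system("$PLAY_TOOL $PATH_TO_CLICK_WAV") }'
}

# Usage: three fake queries produce three clicks.
printf 'query 1\nquery 2\nquery 3\n' | dry_run_clicks
```

Once this prints one line per query, switching `PLAY_TOOL` back to a real player and the input back to `tail -f` over ssh gives the audible version.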