Jun 4, 2010
I’m enjoying the latest flowerings of open data, and the recent quality posts from Ingrid Koehler and Steph Gray on what it all might mean, as well as the quality action from Rewired State and others to actually demonstrate it in practice. (ooh, I just spotted that a reel of my photos is running on the Rewired State home page – thanks guys)
We’re getting a better understanding of what data actually is now that we’re seeing more of the things that were previously tucked away.
I’ll add my own observations: it helps me, at least, when thinking about complicated things to break them down a bit. My suggestion is to think in terms of four broad types:
1. Historical data
What’s happened in the past: how organisations and people have performed – what’s been said in meetings – what’s been spent – where the pollution has been – how children performed in tests…
2. Planning data
What’s projected to happen, or will shape what will happen: this and next year’s budget – legislation in progress – consultations – proposed housing developments – manifestos…
3. Infrastructural data
The building blocks of useful services. Boring stuff, doesn’t change that often, but when it does, it needs to be swiftly and accurately updated: postcodes – boundaries – base maps – contact directories – opening hours – organisation structures – “find my nearest…”
4. Operational data
The real-time stuff; what’s happening NOW: where’s my train/bus? – crime in progress – emergency information – school closures – traffic reports – happening in your area today…
These are not unrelated: what’s happened in the past will often guide what’s planned for the future. Today’s operational information becomes tomorrow’s history. And so on. There’s plenty of overlap. They’re intended as concepts, not hard definitions. The types can also be combined in every way conceivable: that’s part of the point of releasing the data in the first place.
I’m deliberately drawing no great distinction here between ‘information’ and ‘data’: the latter is a structured, interpretable incarnation of the former. That’s another set of issues in itself. I’ve also skipped over questions of interpretation and spin – this is a blog post, not a chapter of my book ;) And I’ve omitted “personal data” as a type – this is woven through all areas and carries with it its own baggage. I’m thinking more about the basics of function and purpose. Which lead on to usefulness. Which, as I’ve said before, is the test that all this is taking us in the right direction.
“Useful to whom” does of course vary by type: 1 and 2 are great for those holding public service to account (press, public, whoever). 2 is for those who will make change happen. 3 will be of benefit to ordinary people in day-to-day life (and I’m careful here not to imply that these ordinary people ever have to see ‘data’ or an ‘e-service’ themselves: their local paper, toddler group, or community centre noticeboard are all valid intermediaries here). 4 will do things for the e-enabled – the mobile generation, the data natives, as well as for places that can serve an offline public (screens in train stations, visuals at bus-stops).
As a practical suggestion, I would love to see some of the current initiatives to build repositories and access to data recognising these distinctions exist. A little more signposting about the type of data that’s being released may help to highlight which types are being overlooked. For as we know, opening up the narrative helps to drive the change itself.
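For the more code-minded, here is a rough sketch of what that signposting might look like in practice: each dataset in a repository carries a label for one of the four types, and a quick count shows which types are thin on the ground. The catalogue entries, the `DataType` labels and the function names are all made up for illustration; no real repository is being described.

```python
from collections import Counter
from enum import Enum


class DataType(Enum):
    """The four broad types suggested in this post."""
    HISTORICAL = "historical"
    PLANNING = "planning"
    INFRASTRUCTURAL = "infrastructural"
    OPERATIONAL = "operational"


# Hypothetical catalogue entries: (dataset title, type label).
CATALOGUE = [
    ("Council spending records", DataType.HISTORICAL),
    ("School test results", DataType.HISTORICAL),
    ("Ward boundaries", DataType.INFRASTRUCTURAL),
    ("Live bus departures", DataType.OPERATIONAL),
]


def coverage(catalogue):
    """Count datasets per type, including types with no datasets at all."""
    counts = Counter(data_type for _, data_type in catalogue)
    return {data_type: counts.get(data_type, 0) for data_type in DataType}


def overlooked(catalogue):
    """Return the types that have no datasets in the catalogue."""
    return [t for t, n in coverage(catalogue).items() if n == 0]


if __name__ == "__main__":
    for data_type, count in coverage(CATALOGUE).items():
        print(f"{data_type.value:16} {count}")
    print("Overlooked:", [t.value for t in overlooked(CATALOGUE)])
```

Given the overlap between types, a real scheme would probably need to allow more than one label per dataset; a single label just keeps the sketch short.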
And how are we doing against these four types?
Pretty good on historical (it’s quite easy to dump old files online); weak on the future planning stuff (trickier, because if there’s no means of action accompanying the data, will publishing do anything other than frustrate?); getting there on infrastructural (though licensing, linking and standards offer the greatest challenges); struggling on operational (contractual, accuracy, standards).
That’s a one-line summary. What do you think? Where should we be putting more effort?