I’ll be at the launch of data.gov.uk this afternoon, and like many who’ll read this, have had a close interest – rather than direct involvement – in its genesis over the last year or so.
What is it? Simply put, it’s a first step towards delivering the government’s commitment to publish public sector data openly, and freeing it up to be reused. (This second point is at least as important as the first.) In theory, this openness will lead to many benefits, perhaps the two most significant of which are:
a. the development of useful applications; and
b. wider scrutiny of the public sector
It’s our data, after all – so the argument goes – why shouldn’t we be able to get at it in raw form and do useful things with it? What greater commitment to transparency in public affairs could there be?
So how have they done? Very well indeed. It’s been built with developers, not at them. Some excellent work has been done to set up and foster a discussion community; the frequency and depth of the postings there over the last six months are a testament to this. There’s been lots of reuse of existing standards rather than the setting up of a cottage industry to develop new ones. It’s been done at what seems to be very low cost (I haven’t seen figures, but am aware that it was done without the benefit of expensive agencies or consultancies). It’s happened quickly, by the standards of government technology projects, and on time. Most of all, it’s just wonderful that it’s there at all – government has come an enormous distance in a very short time.
Are there any cautionary points to watch out for? A few. It’s not that clear to a visitor to the site just who its primary audience is intended to be. It is primarily for experienced application developers and those familiar with the language of data ‘in the raw’. But it will no doubt also attract the lay public who may wonder how they are supposed to use it. Jargon is used freely (though there are some good explanatory resources available if you look for them). Perhaps because it assumes a ‘geeky’ audience, it hasn’t been done to death in terms of usability – so a search for ‘crime’ returns a first result about firearms crimes in Scotland, rather than a dataset which may be more generally useful. Searching for something specific can be quite complex.
I have a suggestion about the actual content – the datasets – on the site. Not in terms of its quantity, or detail – these are marvellous. Nor over the choice of standards; I recognise that you can never please the entirety of such a diverse public information development community. No, it’s a more structural point. The term “public data” covers a multitude of diverse things. Is it about historic performance (e.g. how many fires of which type were put out last year)? Is it data about infrastructure (where all the schools are, geographically, for example)? Is it about real-time events (e.g. where has that bus got to, right now)? It’s probably the latter two that are the sexiest in terms of Really Useful Applications. Providing the datasets with some sort of categorisation like this might help to stimulate developer interest in the areas with greatest utility, but also shine a clear light on shortfalls in things like real-time data, if these turned out to be harder to open up (and they are quite likely to be!).
What are the overall challenges that face data.gov.uk and the free data concept on the road ahead?
James Crabtree on the Today programme this morning chose to focus on the issue of Ordnance Survey map licensing as a potential stumbling block (and has written more on the topic here). I disagree that this is the most significant strategic issue that the free data movement faces. It’s an important tactical one, of course, but one that can be overcome with instruments (such as legislation and organisational design) that we know to exist.
No, my assessment of the challenges is:
1. Will anything useful actually be produced? Do we actually have that much evidence to justify declaring this as a victory already? And here I’m measuring usefulness by the tough measure: is it used? Not in theory. Not by developers. Real use by real people and real businesses for real purposes.
2. What business models will appear when useful things are produced? Nobody works for free in the long-term, and we may find we pay for the new utility in ways we didn’t expect.
3. Where is the user need (as in the day-to-day problems the public would like to be solved) being gathered and fed into the development process?
4. And sustaining all this for the long haul? It’s great to put up snapshots of hundreds of datasets, but is it clear that they’ll all be updated regularly? If I am to develop (and market) something using this data, I want to be fairly sure that it will still exist in a year’s time.
Tim Berners-Lee spoke on the same programme this morning of the example of cycle accident data, which when published early last year led to a mash-up being spontaneously and quickly generated to help cyclists plan safe journeys. Disclosure: I happened to be the person who pressed the button to publish this data on the blog where it was surfaced (though am due no credit for its release – that belongs to unsung heroes like Richard Stirling in the Cabinet Office who did the heavy lifting with the delivery departments of government to get hold of useful stuff like this). A huge well done to him, and to the rest of the team (notably John Sheridan and Jeni Tennison, who took on the massive challenge of applying semantic web standards), and Paul Jenkins who helped to knit it all together. All under the direction of Andrew Stott, who has clearly made this a big personal priority. (Apologies to others in the team that I’ve no doubt missed.)
To recap – the only test of real success is: use. Not usefulness. Not theoretical use. Real use. Getting beyond the novelty application, the demonstrator, and the hobby lies at the heart of really tapping the potential of data.gov.uk.
I’ll raise a glass of Fentimans to the data.gov.uk crew today. But I’ll raise the whole bottle every time I see someone in their day-to-day life using what’s been generated to change their lives for the better.
Good point Paul. As you say, the next real test will be whether/when person-in-the-street realises the full impact of the opportunity we have.
A little more hand-holding between Departments wouldn’t go amiss either, for those of us not from a developer background.
A great day today nonetheless.
This is really helpful. I agree with it all but especially on 2 things. 1) Structuring data types would really help, also to include ‘cultural & educational content’ (libraries, archives, knowledge, learning resources etc) and 2) Totally endorse your question: “Where is the user need being gathered and fed into the development process?” And that could be structured in terms of different users e.g. consumers, taxpayers, researchers etc.
Make no mistake, this is a huge step. But I think you’re right: this will only be a success when it’s actually used to create something useful.
There are a couple of potential hazards that I think are worth highlighting:
1. It’s all Crown Copyright. In their FAQ they make it clear that they’re happy to allow you to use their data for commercial purposes. They’ve also made it clear that their license is compatible with the Creative Commons Attribution 3.0 family, which is unexpected and very welcome. However, while the data is provided under a license rather than in the public domain (as the US does), the government retain the right to revoke that license.
Given the Ordnance Survey issues you mention above, plus the impending very likely changing of the governmental guard, this seems like scary stuff to me.
2. This is a government service and should remain impartial.
Twitter is a commercial company rather than a piece of Internet infrastructure. British competitors to the service may easily arise, and although they’d have a hell of a job ahead of them, they shouldn’t have to fight against marketing by their own government. Technologies like RSS, PubSubHubbub and OpenMicroblogging are open and unaffiliated, and I think the government should stick to those rather than providing free advertising for a particular vendor.
… But my previous comment shouldn’t diminish the effort and intent that’s obvious throughout the whole site. To reiterate, data.gov.uk is a huge step forward, and I’m just one of the many technical people who are going to dive in and try to make something that will make people’s lives better.
[…] these are raw datasets. As Paul Clarke points out, the site only pays lip service to openness until someone comes along and turns these sets into […]
Agree with the categorisation issue – being geeky doesn’t mean it should be harder to find things 🙂 I am finding it hard to see if there’s anything I’m interested in.
And you’re quite right about use – but that will presumably take time for people to find things that might be of use to them in there, just for starters.
Ben Werdmuller: As far as I am aware, if you are granted something under CC-BY, you then have that work to use under that licence. If the licensor revokes the licence on that document in the future, that doesn’t affect what you can do with the work. So you can continue to distribute the work under CC-BY, as per the terms of the licence.
Hi, I’ll say up front that I spent 27 years in government in a range of IT and Information & Records Management roles – including managing the intranet developments of a large department. Currently in the private sector.
I am pretty confused about this site and its purpose. I had a quick look and every dataset I checked out took me to an existing government web site. Is Data.gov.uk just a shopfront? If so, shouldn’t we all just go to the home department and miss out this site? Also, isn’t this just like the Information Asset Register initiative?
I can see real problems of people picking up on this information and inferring whatever they want from it and being as subjective as Government departments.
Also, on cost. If agencies cannot charge for their data, then they either have to recoup from central government (and we all pay) or they stop collecting it.
I am not at all sure that this has really been thought through by the government.
[…] launch has triggered a fair amount of buzz, and a flurry of blog posts elsewhere which do a much better job at explaining the ins and outs of the site than I have time […]
[…] Watson last year: Open Source, Open Standards and Re–Use: Government Action Plan. There is a brilliant long post by Paul Clarke on his blog, which provides some good context and outlines the next set of […]
[…] people, and let’s face it we’re all local people. If open data is to turn into some really useful applications and that are really used, addressing the real problems of real people then local data is all where […]
[…] At the same time, I do still stand by other comments I made on Twitter yesterday and previously (and by the comments of others such as Paul Clarke). […]
[…] Welcoming data.gov.uk – honestlyreal Good stab at making it all a little more human (tags: data.gov.uk datagovuk data opendata government gov20 strategy uk) […]