Tag Archive for Open Data

Are universities transparent enough?

FOI Man talks to Times Higher Education about universities and openness.

Times Higher Education magazine this week features an article about…higher education, and how open and transparent it is. I was interviewed for this feature a few weeks ago – wonder at my high rhetoric – “[FOI is seen] as a pain in the backside”. Seriously, it’s a comprehensive survey of all aspects of transparency in the UK university sector, including everything from FOI to open data to MOOCs (that’s massive open online courses for those of you not in the know).

Opening up Open Data

FOI Man reports on a visit to meet Open Data experts at Southampton University.

Chris Gutteridge is a techie – in the best possible sense of the word. When I first arrive, he’s eager to show me “the cool stuff”. And it really is cool.

Despite my pseudonym, I don’t have superpowers, yet Chris is able to fly me over and into the university campus, bringing us to the ground with a bump outside his building. Of course, I’m talking about a visual representation of the campus achieved through the neat trick of linking building data collected from the university estates department to Google Earth. Chris is experimenting with making the maps 3D by collecting (and in some cases creating) data on the height of buildings on campus.

And the cool stuff like this is how he sells the open data initiative to colleagues across the university. Central and local government have already made great strides in the open data arena, but Southampton are pioneers – and indeed, award winners – in the higher education sector. After all, it is the home of the Government’s open data tsars (can you have two?) Professors Nigel Shadbolt and Tim Berners-Lee (also famous these days for sitting behind a desk entertainingly at the Olympic opening ceremony – no mean feat). But even with this top-level support, it can be tricky to get busy departments to cooperate, a problem many FOI Officers will sympathise with.

Chris starts listing the objections colleagues raised when asked to provide datasets to make available for re-use through their Open Data Service. He looks puzzled when I start laughing, but some of you will recognise “what about data protection?”, “what if terrorists exploit it?”, and “isn’t it commercially sensitive?”. And the question that often lies behind these objections in reality, “what if someone realises our data is unreliable?” (or “shit” as Chris rather more prosaically puts it). Chris has blogged a whole list of these concerns and they all sound rather familiar.

But by demonstrating that once the data has been collected centrally it can be made useful to the departments that originally provided it, Chris and his team have started winning them over. One of their biggest supporters is the Catering department, who already maintained spreadsheets with details such as the items available in the cafes, bars and restaurants across campus and their unit cost. With a little adjustment, the spreadsheets were made into reusable data. Now the Catering department no longer have to manually update their web pages, as the availability and cost of food and drinks in individual outlets are pumped live from regularly refreshed open data straight to the pages. They’re now doing similarly useful things with events calendar data.
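For the more technically minded, here is a minimal sketch of the spreadsheet-to-web-page idea described above. It is not Southampton's actual code: the column names, the sample rows and the function names are all made up for illustration, and a real service would read a regularly refreshed file rather than a string.

```python
import csv
import html
import io
import json

# Hypothetical spreadsheet export. Column names and rows are assumptions,
# not the real Southampton catering data.
SPREADSHEET_CSV = """outlet,item,price_gbp
Cafe A,Filter coffee,1.20
Cafe A,Cheese sandwich,2.50
Bar B,Orange juice,1.00
"""


def load_items(csv_text):
    """Parse the CSV export into a list of dicts with numeric prices."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        {
            "outlet": row["outlet"],
            "item": row["item"],
            "price_gbp": float(row["price_gbp"]),
        }
        for row in rows
    ]


def to_json(items):
    """Serialise the items as JSON so that other people can reuse them."""
    return json.dumps(items, indent=2)


def menu_html(items, outlet):
    """Build a simple HTML list of what one outlet sells, from the same data."""
    lines = [
        f"<li>{html.escape(entry['item'])}: £{entry['price_gbp']:.2f}</li>"
        for entry in items
        if entry["outlet"] == outlet
    ]
    return "<ul>\n" + "\n".join(lines) + "\n</ul>"


if __name__ == "__main__":
    items = load_items(SPREADSHEET_CSV)
    print(to_json(items))
    print(menu_html(items, "Cafe A"))
```

The point of the sketch is that the published data and the web page come from the same source, so updating the spreadsheet updates both.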

Chris thinks this repurposing of the open data is key to making open data a success. His mantra, repeated to me several times, is that the main return on investment in open data for the university is the availability of “a huge pile of data that can be used internally without fear”. And if outside users (or indeed their own students) find the data useful to create new apps, then all the better.

Chris gives me some tips for anyone wanting to establish an open data repository. Given that recent FOI amendments mean that this will soon be a requirement, that’s pretty much all of us in the public sector.

Avoid controversy. You want to get buy-in from colleagues, so don’t startle the horses. Pick simple but useful datasets that nobody will challenge. At Southampton they started out with building data. It’s data that isn’t sensitive – people can see most of it by walking around the campus – but as we saw at the start of this piece it can be put to very effective use by developers.

Think carefully about what datasets you think should exist. Then speak to the relevant departments and see if they do have those datasets in a useful format. Chris suggests that often suppliers can be very helpful, and can give advice on how datasets can be extracted from their systems. They may even be prepared to look at how their systems are specified so that datasets can be more easily exported in future.

Encourage colleagues to give you their data even if it isn’t complete. It is better to have some data than no data at all in most circumstances.

Be ready to challenge concerns. Some colleagues will be concerned about giving data away for free, rather than making money from it. But as Chris points out, “if a Chinese takeaway gave away its food, it would soon go out of business. But if a Chinese takeaway didn’t give away its menus, it would go bust even faster.”

Look at what you’re already making available, and how. Chris demonstrates that my university is already making available data in a reusable format because we have an online repository called EPrints. He tells me that if you have an RSS feed on your website, you are “already more or less doing open data”.

Make information available in a reusable format. Open data enthusiasts grimace at the mention of portable document format (pdf). At the very least, try to make data available in a spreadsheet format.

Adopt an open licence. Open data is about more than publishing information in a reusable format. It is also about licensing. It’s really only open data if you state on your website/open data repository that data can be reused without charge. The best way to do this is to adopt the Open Government Licence.

Keep datasets up-to-date. One of the things that public bodies will be expected to do once the FOI amendments on datasets come into force is keep published datasets up-to-date. I ask Chris how Southampton maintain their datasets. It depends, of course, on where the data comes from.

  • Hearts will sink to be told that a lot are maintained manually by his team – not many of us will have resource to spare. So to make this work we need to ensure that it’s as easy as possible to get things corrected. Chris suggests encouraging feedback from users of the data so that they can flag up when data needs refreshing. At Southampton they also use student volunteers to maintain the datasets, which might be something to consider for universities in particular.
  • Some datasets automatically update. Depending on your set up, some will be fed live data from systems maintained by the supplying department. Some suppliers who provide systems across the public sector are already thinking about how to build open data publication functionality into their databases.

Chris is keen to encourage the growth of an information ecology, with others across higher education publishing more and more open data. He encourages universities to consider creating ‘profile’ documents on their websites, describing where their key datasets can be found and in which formats. This will help with auto-discovery by the new open data hub for higher education, which will eventually provide potential users of datasets with a single portal to locate useful data.

So the higher education sector isn’t moving into this new era of reusable open data entirely unprepared. And if you’re thinking of taking your first steps into this brave new world, hopefully you, like me, feel slightly less daunted thanks to Chris’s enthusiasm.

I’d like to thank Chris, Ash and Patrick for letting me disturb them for a whole afternoon last Friday. Any errors here reflect my ignorance and not their skill or willingness to share!

Draft Datasets Code of Practice

FOI Man highlights a new draft Code of Practice under section 45 of the Freedom of Information Act.

It’s all go with FOI at the moment. No sooner have we had to wade through the ICO’s Anonymisation Code of Practice than another comes along from the Ministry of Justice – this time a draft Code setting out best practice for meeting the new requirements under FOI relating to datasets.

The draft Code is a supplement to the existing section 45 Code of Practice, setting out best practice for public authorities in complying with FOI. It is required by the amendments made to FOI by the Protection of Freedoms Act (which are not yet in force).

It provides clarification on interpreting the definition of dataset in the amendments, as well as setting out the three licences (developed by The National Archives) that public authorities will be expected to use when licensing re-use of datasets (ie open, non-commercial and charged). What isn’t yet clear is what fees public authorities will be able to charge for re-use. The amendments allow for the Secretary of State for Justice to lay down regulations to allow this, but there is no news yet on if, or when, such regulations will be forthcoming.

It should be stressed that the Code is a draft, and the Government is inviting comments on it via the gov.uk website. So if you’re interested in the open data agenda, or simply want to ensure the Code is clear enough, do go and make your views known.


FOI Man reports on the ICO’s new Code of Practice on anonymisation.

FOI Officers tend to be caught between a rock and a hard place on a pretty much continual basis. If it isn’t navigating between the Scylla of senior management and the Charybdis of requester ire, then it’s trying to balance the often competing demands of the Freedom of Information and Data Protection Acts (DP).

So new guidance from the Information Commissioner on the important subject of anonymisation is very welcome. Though at over 100 pages, it may be more than some FOI and DP Officers can find the time to read between fielding requests and CMP notices. But, ever at your service, I attempt to extract the key points for you here.

The Code notes that the DPA does not require anonymisation to be completely risk free – the role of the Code is to help organisations mitigate the risks involved with anonymisation. Similarly, it points out that – in line with R (on the application of the Department of Health) v Information Commissioner [2011] EWHC 1430 (Admin) – anonymised information ceases to be personal data. So if your data is truly anonymised, section 40 of FOI won’t apply to it, and the sort of large datasets that that nice Mr Maude likes Government departments to publish can be unleashed without concern.

But that’s the trick. We’ve got to be very careful that what we put out there is truly anonymised. The Code summarises the problems with that neatly – firstly, there are a number of ways that an individual could be identified, so just taking a name out may not be enough. And secondly, we have no way of knowing what information you folks out there might already have access to.
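To illustrate why taking a name out may not be enough, here is a rough sketch of one simple check: counting which combinations of the remaining fields appear only once in a dataset. This is not a method taken from the ICO's Code, just a common-sense uniqueness count, and the records, field names and function name are all made up for illustration.

```python
from collections import Counter

# Made-up records with names already removed. The fields are assumptions
# chosen for illustration only.
RECORDS = [
    {"postcode_area": "SO17", "birth_year": 1980, "job": "Lecturer"},
    {"postcode_area": "SO17", "birth_year": 1980, "job": "Lecturer"},
    {"postcode_area": "SO17", "birth_year": 1975, "job": "Registrar"},
    {"postcode_area": "SO16", "birth_year": 1990, "job": "Technician"},
]


def unique_combinations(records, fields):
    """Return the combinations of field values that occur exactly once."""
    counts = Counter(tuple(record[field] for field in fields) for record in records)
    return [combo for combo, seen in counts.items() if seen == 1]


if __name__ == "__main__":
    risky = unique_combinations(RECORDS, ["postcode_area", "birth_year", "job"])
    for combo in risky:
        print("Only one person matches:", combo)
```

A combination that matches only one record is not proof that the person can be identified, but it is the sort of row that someone with a little outside knowledge could pin down, so it deserves a closer look before release.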

There are well documented examples of how individuals have been identified from supposedly anonymised datasets once put together with information available on the internet or with personal knowledge. The ICO point out that organisations aren’t omniscient – they can’t know for sure what is, and what will be, available to people. So what do they say about how FOI and DP Officers should reach the judgment as to whether or not it is safe to disclose an anonymised dataset?

Effectively – and I hate to throw a buzz word at you – it’s a risk assessment. They cite a Tribunal concept of the “motivated intruder”. Basically this is someone who will do anything short of committing a crime to identify individuals where there is some motive, eg the information is newsworthy, of interest to the village gossip, or perhaps politically sensitive. We need to consider whether someone like that could identify people using libraries, archives, the internet and social media. In other words, we’re talking about those people who you see on TV sometimes tracking down people for an inheritance. Or the producers of Who Do You Think You Are. Could they identify individuals from the data?

Of course, this is better than nothing, but it still relies on FOI and DP Officers or their colleagues to have the time to work out whether someone could be identified from all of these sources. If they haven’t got that time, then there is a risk that the Code just leaves us where we started – with authorities reluctant to release information for fear of individuals being identified.

Thankfully the ICO do recognise the difficulty of this with large datasets – the desire for publication of which is pretty much what prompted this Code. They say:

“It will often be acceptable [with larger datasets] to make a more general assessment of the risk of prior knowledge leading to identification, for at least some of the individuals recorded in the information and then make a global decision about the information.”

But it still means that many FOI and DP Officers will be left feeling uncomfortable whenever considering disclosure of anonymised datasets. Have I checked enough sources? What if I’d tried that other search engine? Should I subscribe to that genealogy site to check what someone could find there? It’s difficult to see what else the ICO could have advised, but FOI Officers will take limited comfort from the Code on this point.

There is some useful practical advice in the Code such as the best ways to present personal and spatial data (eg in crime maps). The case studies that form the last half of the publication will be helpful as well.

Overall, the Code is a useful guide to the issue of anonymisation for FOI and DP Officers and anyone working with datasets containing personal data. But it won’t be the last word and it will be interesting to see what comes out of the new UK Anonymisation Network announced yesterday by the Information Commissioner.


Prime Minister comes out against FOI

FOI Man puts his head in his hands over Prime Minister David Cameron’s latest comments on FOI.

Earlier today the Minister for the Cabinet Office, Francis Maude, was once again calling for more transparency in a talk to delegates at the Information Commissioner’s Data Protection Officers Conference. He gave the usual speech about the value of transparency, how it is good for the economy, and how it got governments “out of their comfort zones”. Yada yada yada.

As usual, there was no mention of FOI. It always seems odd to me that with the Cabinet Office embracing transparency quite so warmly, there is little mention at any time of the piece of legislation that has arguably done most to facilitate Government openness.

Also today, you may have spotted that the Government’s favourite Think Tank, Policy Exchange, published a report on transparency and open data. But once again, very little mention of FOI.

And now we know why. Because right at the top of Government, the man in charge thinks we’ve got it all wrong (the relevant bit is about 5 minutes from the end). FOI isn’t about what we want to know about. “Real freedom of information,” says David Cameron, “is the money that goes in and the results that come out”. We’re looking “through the wrong end of the telescope” apparently, wanting all this information about the process of governing and making decisions. And it’s “furring up the arteries” of Government.

The Government’s transparency agenda is great. I’m certainly not going to complain about it, and I’d encourage FOI Officers everywhere to see if they can get involved with it. But how the Prime Minister described FOI is exactly why we should have the general right of access (as it’s called) under FOI. We no longer live in a society where people are satisfied with being told “here’s what we the people running the country are prepared to give you – now go away and amuse yourselves with your iphone gadgets and wotnot while we get on with the important work”. Transparency is not enough if it means being grateful for what we’re given. True transparency allows individuals to interrogate their Government and other public bodies.

Some people think, I’m sure, that I’m making too much of all this FOI stuff. But it’s important. Let me explain why.

FOI is a way for individual people to take part in politics. Every election in recent history has prompted a debate about how people can be more involved in democracy. How can we get more people interested? How can we get people voting? Yet right here we have a mechanism which is used by real people – individuals (who in fact make most of the FOI requests, whatever some would like to suggest about the media and business) – who are engaging directly with public bodies to find out what they want to know. And what happens? There’s a post-legislative scrutiny and public bodies and politicians queue up to say that this interest in public life is too expensive and inconvenient.

It’s not just about individual people being able to ask questions and get answers, though. It’s about providing a further check and balance on those in power. Put simply, many eyes are better than few.

And some public bodies really do benefit from that extra scrutiny. Take the Greater London Authority, where I used to work. The GLA, as many of you will know, is the home of the Mayor of London. There has only been a Mayor, and a GLA, for the last 12 years – the whole thing was a creation of the Labour Government. The Mayor is supposed to be held to account by the London Assembly, a group of 25 elected members. But in effect, the Assembly has always had limited power to rein in the Mayor, not least because of its party politics. In that vacuum of accountability, FOI played an unintended, but essential role in keeping the Mayor and his appointees in check. They didn’t like it (either of the administrations), but it worked. They knew they were being watched, and when they did stupid or controversial things, FOI meant that people could find out about it. And with more councils moving to directly elected Mayors, that’s a lesson that others should learn from.

That experience confirmed me in the view that FOI can be, and should be, a powerful tool in governance of the public sector. What I find sad about the comments from David Cameron today, and those of his predecessor Tony Blair, not to mention Gus O’Donnell and the many council leaders who have attacked FOI, is that if they got behind the legislation, insisted that the public sector had to accept it and adapt to embed it in its processes, then it really could work very well. Public bodies would be more efficient because the information would flow inside them more effectively. There would be fewer security breaches and leaks because public bodies would be able to focus their attention on the most sensitive data. People really would start to have more trust in government at all levels because public bodies treated them with respect by answering their questions without grumbling.

But these benefits will never be fully garnered. Too many politicians and public servants have a blind spot about FOI. David Cameron, elsewhere in his session with the Liaison Committee, talked about the importance of accountability of public bodies to individuals – schools to parents, hospitals to patients and so on. Yet he can’t see that FOI offers that, and that by attacking it, he is, in effect, contradicting himself. I’ve seen the same thing happen with perfectly reasonable colleagues. They believe in public service and being accountable. But they get an FOI request to deal with and they start frothing at the mouth and panicking about how to answer it. Even when the answer is perfectly straightforward and the actions taken that are the subject of the request are utterly reasonable.

So I fear that even if our Save FOI campaign works, and we avoid the Act being watered down, FOI will continue to be an add-on in most public servants’ eyes. We FOI Officers will struggle on in the face of begrudging compliance from colleagues. We’ll have to defend something that we shouldn’t have to defend because our so-called leaders refuse to accept the will of Parliament and make it clear that answering questions is an integral part of providing a public service.