Data partnership’s challenge is getting government to play ball

A research council project to connect academics to public data is pushing in the right direction, but will face problems that have undermined similar efforts, says Hetan Shah.

The Royal Statistical Society’s data manifesto makes the case for the power of data to improve policymaking, prosperity and democracy. Some kinds of data have traditionally been underexploited. One example is the data generated through our interactions with the state, known as administrative data. This is not information from surveys, but the records of government administration, for example from welfare payments or school applications.

There has been growing interest in using administrative data for research purposes. Most recently, on 27 September the Economic and Social Research Council (ESRC) announced that it would invest £44 million in a new Administrative Data Research Partnership, working closely with the Office for National Statistics (ONS). How should we assess this new initiative?

The starting point must be to look at it as an evolution of a previous project, the Administrative Data Research Network, which ran for five years from 2013. Both the ADRN and the new ADRP share the same broad goal: helping researchers access administrative data in secure settings. 

Why change the name? There’s a clue in the ADRN mid-term review published in November 2016. This says: “The Network has a major issue in that it does not have the UK-wide data that is required…to have its intended impact.” 

The network had diligent processes and high-quality structures, but it struggled to get the data out of government. This was especially true in England; the other UK nations arguably already had a stronger culture of data sharing, meaning that the ADRN was not such a step change for them. 

The new model takes some clear steps forward. The strengthened role of the ONS reflects the fact that the office has the relationships in government needed to obtain data.

Another major improvement is that the ADRP will curate datasets. The ADRN created bespoke datasets in response to researchers’ requests, and destroyed them afterwards. This created a lot of wasted effort, especially given how difficult it was to extract data from government. The ONS will now be tasked with finding datasets that may be of interest to researchers and bringing them together. 

The context has also improved. The Digital Economy Act 2017 created a clearer route for researchers to request data from government departments. The creation of UK Research and Innovation, headed up by the force of nature that is Mark Walport, also means there are now more powerful advocates within the system.

But challenges remain. The Digital Economy Act is only permissive—it clarifies that departments can share data with researchers, but does not change the culture or incentives. And, critically, it does not include health data, which is some of the information most useful and relevant to policy. Academics are also asking whether the ONS will prioritise researchers, or if the initiative will focus on data sharing within government.

The ADRN saw the ESRC being sucked into a lot of operational issues. Given the complexity of the project, one can see this happening again. But the council instead needs to take a strategic role and focus on some higher-level questions. How can this go beyond being a social science initiative to being properly interdisciplinary? What are the right questions to ask and datasets to bring together? What can be learned from work in the health field, such as Wellcome Trust-supported work on data governance? How can the ADRP help the linking of survey data to administrative data?

A recent report by the Office for Statistics Regulation, which oversees official statistics in the UK, is a reminder that there are also major issues of organisational capability. Public services are often delivered by people who lack expertise in data, and so the quality of administrative data can be patchy. So what could be done to develop ‘administrative data by design’—such as naming conventions or consistent use of values for variables? And how does the ADRP relate to the recent ESRC skills review? Can we teach academics techniques to allow for the uncertainty embedded in administrative data right the way through their analyses? 

These are some of the issues the new ESRC strategic hub director for this project should be thinking about, but one fears he or she may get sucked into managing partnerships and delivery. 

Ultimately, the ESRC is right to keep pushing on the door of administrative data, given the potential value to research. The ADRP learns lessons from the ADRN. But with the pressures across government as they are, extracting data from departments remains a tough challenge. The project will ultimately stand or fall on this, just like its predecessor.

Hetan Shah is executive director of the Royal Statistical Society. He tweets at @HetanShah.

A version of this article also appeared in Research Fortnight