Make data reuse a habit

EU strategy needs to do more than provide infrastructure, say Daniel Spichtinger and his colleagues

Data is not the new oil, as is sometimes claimed, because, unlike oil, it can be reused. In fact, this reusability is precisely what makes data a key resource of the 21st century. A more apt comparison may be with renewable energy. 

Facilitating the reuse of data is one rationale for the EU’s data strategy, a bundle of measures aimed at creating a single market for data and at ensuring the bloc’s global competitiveness and data sovereignty.

The strategy, launched in 2020, contains legislative measures to set up a cross-sectoral governance framework for data access and use: the Data Governance Act, the Digital Markets Act, the Open Data Directive and the Data Act, each with its own target groups.

But its most concrete ambition is to establish Common European Data Spaces in different thematic areas. These are intended to bring together relevant infrastructures and governance frameworks to facilitate data-pooling and sharing. 

The aim is to allow data from across the EU—from the public sector, businesses and other types of organisations, as well as individuals—to be made available and exchanged in a trustworthy and secure manner. 

Establishing such data spaces promises significant benefits, but it also raises many challenges. For researchers, one of the most serious is the narrow interpretation of the EU’s General Data Protection Regulation put forward by the European Data Protection Supervisor and the European Data Protection Board in their Joint Opinion on the proposed Data Act, released last May. This risks further weakening the research exception of the GDPR, making the collection of research data even more challenging than it already is.

Even if legal and technical hurdles are overcome, it is simplistic to assume that just making data available will automatically result in widespread reuse by researchers. Rather, we need to better understand the factors that influence whether and to what extent researchers make use of data that are shared. 

Listening to reusers

To better understand these factors, we interviewed 12 researchers who had reused data and 12 intermediaries such as publishers and repository managers who work to make data accessible.

In a recent paper, we show that some of these factors are specific to projects: researchers’ trust in a particular dataset’s quality, its suitability for purpose and whether it meets the ‘Fair’ principles of being findable, accessible, interoperable and reusable. Other factors are independent of individual projects, namely researcher attitudes, community norms, rewards and requirements. 

Our interviews also illuminated whether and how researchers develop a habit of reusing data and come to see this practice as part of their professional identity. Researchers described this process in various terms, such as the development of a reuse mindset, while intermediaries often spoke of developing researchers’ awareness of the possibilities and potential of data reuse.

For some, this awareness was linked to using a particular product or service, such as a specific repository. For others, it was a question of creating purpose-built settings where researchers can work out how and under what conditions they can benefit from data reuse, as well as grapple with its consequences. One example is the Lab for Open Innovation in Science at the Einstein Center for Neurosciences in Berlin. 

Whether researchers actually use the planned data spaces will be an important measure of the success of the EU data strategy as a whole. Thus, encouraging researchers to consider data reuse as a modus operandi of scientific research—in other words, to make reuse a habit—should be a key priority. Otherwise, data spaces risk becoming data graveyards. 

In the area of health, the EU-funded project Towards the European Health Data Space, which seeks to develop principles for the secondary use of health data, could be a vehicle for building such habits into the social infrastructure of the European Health Data Space.  

Paying attention to how researchers understand themselves and their work would help ensure this data space is fit for purpose, while at the same time providing a proof of concept for other areas in which data spaces will be set up. On a broader, cross-thematic level, the European Open Science Cloud could also play a role by building safe spaces for experimenting with data reuse into its activities, such as outreach actions.

We realise that bringing habit formation, and the settings that enable it, into the debate on data sharing adds another layer of complexity, but without it data spaces won’t fly.

Daniel Spichtinger is at the Ludwig Boltzmann Society in Vienna. Marcel LaFlamme is open research manager at the Public Library of Science. Marion Poetz is in the Department of Strategy and Innovation at Copenhagen Business School.

This article also appeared in Research Europe