Towards trusted data sharing: guidance and case studies
Implications for policy
Barriers to data sharing
Barriers to data sharing – and to the emergence of ‘data ecosystems’ – are widely recognised. They include:
- the difficulty of creating clear contractual agreements regarding ownership and use of the data and the allocation of value
- technical issues such as ensuring adequate data quality, and enabling data integration or linkage. These are discussed further in the section on practical challenges
- legal and regulatory issues including around what data can be shared [15]
- rights management, which is particularly important where data propagates and attribution must be maintained after datasets have been combined and passed on.
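The rights management point can be made concrete with a minimal sketch of how attribution might be carried forward when datasets are combined and passed on. It is illustrative only and is not drawn from the case studies: the `Dataset` class, its `attributions` field and the `combine` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """Hypothetical dataset wrapper that records who must be credited."""
    name: str
    records: tuple
    attributions: frozenset  # sources that must be credited downstream


def combine(name, *datasets):
    """Merge datasets, keeping the union of all attribution requirements."""
    records = tuple(r for d in datasets for r in d.records)
    attributions = frozenset().union(*(d.attributions for d in datasets))
    return Dataset(name, records, attributions)


a = Dataset("a", (1,), frozenset({"Org A"}))
b = Dataset("b", (2,), frozenset({"Org B"}))
c = Dataset("c", (3,), frozenset({"Org C"}))

# Attribution survives a second round of combination and onward sharing.
onward = combine("abc", combine("ab", a, b), c)
print(sorted(onward.attributions))
```

In this sketch the attribution set only ever grows, so a dataset that has been combined several times still names every original contributor. A real scheme would also need to carry licence terms and usage restrictions, which are omitted here.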
The case studies illustrate that some very important work has already been done within specific projects on vital areas such as reference data architectures, mechanisms to enforce constraints around how data is shared and used, and methods for preserving personal privacy. Other challenges include enabling viable business models and ensuring interoperability of data and data platforms. However, not every challenge has been fully solved.
Trusted data sharing
Mutual trust among both big and small players is important, and therefore mechanisms that enhance trust among all transacting parties are vital. These mechanisms should enable transparency and control. [16] They go beyond establishing intellectual property rights and securing data-sharing agreements and include: technologies and architectures that facilitate security or privacy; robust processes for checking quality; mechanisms for ensuring the various parties are compliant with terms and conditions; and assurance that the various parties have the skills and resources to deliver their parts of the agreement. A user-centric approach, with easy-to-understand tools and policies, also helps to facilitate participation and enable trust. Trust mechanisms need to be put in place as quickly as possible.
The development of robust and trustworthy frameworks for sharing data would make these practices more acceptable to companies and other types of organisations, and the public – a key consideration if personal data is being shared.
The concept of a ‘data trust’ has emerged as a possible mechanism for enabling trusted data sharing, although its precise definition and the extent of its applicability are matters for debate at the time of publication. A data trust could encompass some combination of legal terms, governance arrangements and technologies used to access data, or it could be purely a legal structure that sets out the relationships between the different parties. [17] Data trust pilots will help to refine the definition and test the practical implications of the concept.
Central hubs for data sharing
The case studies demonstrate that the use of a central hub can facilitate trusted data sharing. The central hub needs its own business model and governance framework to be sustainable and to underpin trust. It may help to catalyse the formation of an ecosystem of data owners, providers and consumers, alongside third parties who may take on myriad roles such as data brokering, data cleansing, data aggregation, certification and analytics.
The development of an ecosystem in which a range of players can participate in the governance is preferable to one that is governed by a single big player.
Data sharing agreements
The development of standardised terms and conditions will minimise the resources needed to develop data-sharing agreements between parties, as long as they reflect a standardised or harmonised understanding of data usage rights and responsibilities. Such a harmonised understanding is vital, so that ‘two or more parties in any sector can partner in data sharing agreements, shape the agreements according to their needs and enable multiple organisations to work together to solve a common problem’. [18]
Data agreements must address a broad set of issues including: data quality, timeliness and lifecycle; compliance with the governance rules; and enforcement of usage rights. [19] Defining the requirements for data quality, and ensuring these requirements are delivered, remains a central challenge.
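As an illustration of how requirements for data quality and timeliness might be defined and then checked on delivery, the following is a minimal sketch. It assumes that an agreement expresses quality as a maximum share of missing values per required field and timeliness as a maximum age. The `QualityTerms` class, its fields and the `check_delivery` function are hypothetical and do not come from any of the agreements discussed in this report.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class QualityTerms:
    """Hypothetical machine-readable extract of a data-sharing agreement."""
    required_fields: tuple
    max_missing_fraction: float  # e.g. 0.25 means at most 25% missing
    max_age: timedelta           # how old a delivery may be


def check_delivery(records, delivered_at, now, terms):
    """Return a list of breaches; an empty list means the delivery complies."""
    breaches = []
    if now - delivered_at > terms.max_age:
        breaches.append("data is older than the agreed maximum age")
    if not records:
        breaches.append("delivery contains no records")
        return breaches
    for field in terms.required_fields:
        missing = sum(1 for r in records if r.get(field) is None)
        if missing / len(records) > terms.max_missing_fraction:
            breaches.append(f"field '{field}' exceeds the agreed missing-value rate")
    return breaches


terms = QualityTerms(("id", "value"), 0.25, timedelta(days=7))
records = [
    {"id": 1, "value": 2},
    {"id": 2, "value": None},
    {"id": 3, "value": None},
    {"id": 4, "value": 5},
]
print(check_delivery(records, datetime(2024, 1, 8), datetime(2024, 1, 10), terms))
```

Expressing the terms as data rather than prose means the same check can be run by both provider and consumer. What counts as adequate quality remains a matter for the parties to negotiate.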
Data ecosystems
Data ecosystems where data is pooled between organisations are found to have a number of features: [20]
Features of data ecosystems
(a) clearly defined boundaries that enable the identification of a legitimate user;
(b) rules regarding how data resources can be used offline or outside the transaction;
(c) opportunities for contributors to participate in the development or improvement of the platform;
(d) effective monitoring by a group of core users or a third party accountable to the core users; and
(e) rules that define how the resources are to be used and the penalties for misuse.
In addition, an independent oversight body would ensure that individual organisations pooling their data have a measure of confidence in the running of the overall system. These attributes are likely to extend beyond data pooling models to other models of data sharing, such as those illustrated by the case studies.
Government’s role
Government can play several roles in helping facilitate the data-sharing ecosystem. It already plays a role in setting standards and making government data accessible and usable. Government can also encourage private sector organisations to collect and release data, leading by example and sharing best practice. It can improve access to the skills that are required for identifying opportunities for data sharing, and for developing and implementing data sharing models.
Data sharing requires business skills, and the soft skills needed to lead, be part of multidisciplinary teams and work in partnership. It also requires technical skills for developing appropriate architectures and applications of technologies, and for data engineering and linkage. Finally, government can create the parameters as a regulator, and fund platforms and infrastructure for data exchange, as well as pilots for new models of data sharing. [21]