Towards trusted data sharing: guidance and case studies
Data sharing checklist
7. Identify the architectures and technologies needed to enable data sharing
- What architecture is appropriate? What is the processing model: will data be processed where it is stored or moved elsewhere to be processed?
- What platforms and other technologies will be required to create, store, transfer or process the data?
- What are the non-functional requirements for technology design, such as data volume and growth rate, frequency of change, and security, availability and performance needs? What timeliness of access is required, and is real-time access needed?
- What technologies are available that allow data sharing in a controlled way? What technologies and methods enable effective data curation; for example, data tagging, standards, security or other methods?
- How do technologies help address commercial and regulatory constraints?
- How do the technologies help deliver the business model? How do the technologies affect the terms upon which the data is shared and what are the broader business implications?
Learning from the case studies:
Various technologies and architectures can help to enable trusted data sharing. Some are well established, while others are still being developed and tested. Several of the case studies are current or recent R&D projects developing new architectures and technologies, such as Databox, oneTRANSPORT, MK:Smart, CityVerve Manchester and Industrial Data Space.
A strong partnership can be decisive for the success of innovative technological solutions in development, as illustrated by the CityVerve Manchester and oneTRANSPORT projects. The relationship between technology provider and client needs to be more collaborative than a traditional supplier relationship, with both parties sharing the risk.
Using standards that enable interoperability and data portability enhances data value. For example, Hypercat was used in the CityVerve Manchester project to support an integrated approach to data hubs, while oneM2M was used in the oneTRANSPORT project. The standards landscape remains fragmented, with many overlapping and conflicting standards, so efforts to agree common standards need to be supported.
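To make the idea of a machine-readable catalogue concrete, the sketch below builds a minimal Hypercat-style catalogue in Python. The endpoint URL and descriptions are invented for illustration; the "rel"/"val" metadata convention follows the published Hypercat (PAS 212) format, but any real implementation should be checked against the specification itself.

```python
import json

# A minimal Hypercat-style catalogue: a JSON document listing data
# resources ("items"), each annotated with machine-readable metadata.
# The URL and descriptions below are illustrative, not real endpoints.
catalogue = {
    "catalogue-metadata": [
        {"rel": "urn:X-hypercat:rels:isContentType",
         "val": "application/vnd.hypercat.catalogue+json"},
        {"rel": "urn:X-hypercat:rels:hasDescription:en",
         "val": "Example city transport data hub"},
    ],
    "items": [
        {
            "href": "https://example.org/feeds/car-park-occupancy",
            "item-metadata": [
                {"rel": "urn:X-hypercat:rels:hasDescription:en",
                 "val": "Real-time car park occupancy"},
                {"rel": "urn:X-hypercat:rels:isContentType",
                 "val": "application/json"},
            ],
        },
    ],
}

print(json.dumps(catalogue, indent=2))
```

Because the catalogue is plain JSON with a uniform metadata structure, consumers can discover and filter resources across different hubs without bespoke integration work, which is the interoperability benefit the paragraph above describes.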
Organisations would benefit from identifying and implementing appropriate management standards for data hubs early on, such as standards for information security management systems, metadata and governance.
It is important not to underestimate the resources required to ensure the quality and integrity of data, including maintaining real-time data feeds with automatic real-time validation. Organisations should automate these processes where possible, as The Weather Company does, and adopt an approach that can deal with uncertainty in data quality. Data quality was one area of investigation in the MK:Smart project.
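As an illustration of automating such checks, the sketch below validates readings from a real-time feed against simple range and freshness rules before they are accepted. The field names and thresholds are hypothetical; a production pipeline would apply richer rules and flag uncertain records for review rather than simply rejecting them.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical validation rules for one sensor feed: a plausible value
# range and a maximum acceptable age for a reading.
VALID_RANGE = (-40.0, 60.0)        # e.g. air temperature in Celsius
MAX_AGE = timedelta(minutes=5)     # reject stale readings

def validate(reading: dict) -> list[str]:
    """Return a list of quality issues; an empty list means the reading passes."""
    issues = []
    value = reading.get("value")
    if not isinstance(value, (int, float)):
        issues.append("missing or non-numeric value")
    elif not VALID_RANGE[0] <= value <= VALID_RANGE[1]:
        issues.append(f"value {value} outside plausible range {VALID_RANGE}")
    try:
        ts = datetime.fromisoformat(reading["timestamp"])
        if datetime.now(timezone.utc) - ts > MAX_AGE:
            issues.append("reading is stale")
    except (KeyError, ValueError):
        issues.append("missing or malformed timestamp")
    return issues

# Example: one good and one suspect reading from a hypothetical feed.
for reading in [
    {"value": 18.5, "timestamp": datetime.now(timezone.utc).isoformat()},
    {"value": 999.0, "timestamp": "not-a-date"},
]:
    problems = validate(reading)
    print("accept" if not problems else f"flag: {problems}")
```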
Technologies can help to enforce usage rights and restrictions, enabling trust between organisations sharing data. Several of the projects, including Industrial Data Space and CityVerve Manchester, are developing such technologies.
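The projects' own mechanisms are considerably more sophisticated, but the sketch below illustrates the basic idea: attaching machine-readable usage conditions to a dataset and checking them at access time. The policy fields and purposes are invented for illustration and do not reflect the Industrial Data Space or CityVerve designs.

```python
from dataclasses import dataclass, field

# An illustrative machine-readable usage policy attached to a dataset.
# Field names and purposes are hypothetical, not drawn from any
# project's actual policy language.
@dataclass
class UsagePolicy:
    allowed_purposes: set[str] = field(default_factory=set)
    allow_redistribution: bool = False

def check_access(policy: UsagePolicy, purpose: str, redistribute: bool) -> bool:
    """Grant access only if the request satisfies the dataset's policy."""
    if purpose not in policy.allowed_purposes:
        return False
    if redistribute and not policy.allow_redistribution:
        return False
    return True

policy = UsagePolicy(allowed_purposes={"traffic-planning", "research"})
print(check_access(policy, "traffic-planning", redistribute=False))  # True
print(check_access(policy, "advertising", redistribute=False))       # False
```

Encoding the conditions in data rather than in contracts alone means every access request can be checked automatically and audited, which is what allows organisations to share data with partners they do not fully control.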
Rapid prototyping and agile software development, along with early pilots, are useful for demonstrating benefits early on; this is the approach that the Data and Analytics Facility for National Infrastructure (DAFNI) plans to take.