Towards trusted data sharing: guidance and case studies
Case study 8
The Weather Company:
providing data services for sectors where weather has an impact
The Weather Company (TWC), an IBM business, is a data-centric company delivering insights to support decisions. It services many different sectors where weather has an impact, including transport, oil and gas exploration, energy and utilities, automotive, finance, agriculture and retail. For certain sectors, data services are critical to their profitability. Other customers are developing revenue streams out of data. TWC has a variety of businesses for which data monetisation is fundamental. These include both business-to-business (B2B) and business-to-consumer (B2C) companies. Various types of weather forecast are generated and supplied to many different users and weather data can be blended with other types of data to create insights for customers. TWC also provides bespoke services to help customers generate insight from their own data.
Summary: the eight dimensions of data sharing
Using weather data combined with other sources of data to support decisions made by organisations. Diverse opportunities exist in many different sectors and applications.
The data and its use:
TWC collects weather data plus application-specific data, which has multiple uses.
The business model and value creation:
The company has various business models, including a B2C business where data is supplied free and revenue is from advertising, a B2B business where weather data and related insight are delivered to organisations and a bespoke business.
The model for data sharing and the partnership:
TWC has a common platform for all weather data and other types of data.
People with the right skills and expertise:
Data scientists, business development managers and lawyers need to work together closely. Typically, TWC provides platform and analytics expertise and meteorologists, while customers provide knowledge of their business.
Constraints on how data is shared and used:
Constraints include GDPR and commercial sensitivity.
The data architectures and technologies:
A shared ‘data lake’ with a catalogue for data product and analytics development with a common governance model.
Governance / oversight / enabling trust:
The company has a mature and flexible governance capability to deal with different business experiments and developing terms and conditions. Governance underpins all activities and is beneficial to the business, for example processes for checking quality and provenance.
TWC has several different businesses. One well-established business uses a global data platform to deliver weather information to many smart devices through various free apps - these include Weather, Weather Underground and Storm Radar, as well as systems built into cars. The platform absorbs very large amounts of data from personal weather stations, satellites and national weather infrastructures. It cleanses and blends this data through a combination of weather prediction algorithms to produce the different types of weather forecasts consumed by different sectors. The free mobile apps such as Weather, Storm Radar and Weather Underground are a B2C business that is based around advertising. Growth is achieved by bringing more people to the apps and understanding context about their usage such as location and time of day, since both increase the advertising revenue.
In addition, there is a B2B business delivering weather data and related insight to organisations. In this business, TWC does not just sell raw weather data. It sells the output of blending additional data and pushing it through analytic algorithms to directly support its customers’ business decisions. This business spans many industries, including transport, oil and gas exploration, energy and utilities, automotive, finance, farming and retail. For example, in aviation, airlines use TWC services for planning flights - weather data allows airlines to better estimate how much fuel to load.
In agriculture, farmers can better understand when crops are stressed. In energy and utilities, weather data is used in outage prediction and recovery operations, in order to reduce the cost of storms and equipment failure.
Finally, there is the more bespoke business, where a single company asks for help generating insights from its own data. TWC provides a managed cloud service to host the development and operation of its data-driven services, combining the company’s data with IBM internal data, open data and possibly third-party data. There are different models of revenue generation in this business. It can be a one-time contract to help a company understand the value of its data and where to create value from it; TWC can augment data that a business partner continuously shares; or TWC can provide a data service to a business based on a blend of the company’s data, IBM’s data and IBM’s analytics algorithms.
Both the B2B and bespoke businesses do their development work in a shared data lake, and their data services are delivered on the same platform as the B2C business. Private client data is nevertheless kept isolated, even in these shared environments.
In the B2C and B2B businesses, data monetisation is a fundamental part of the business, so there are dedicated managers developing data-oriented products. The bespoke business is more opportunistic and ideas come from many sources.
TWC works with both customers and business partners, and there is a mixture of data and value sharing between them. There are a variety of different business models, which depend on the customer, covering advertising, benchmarking, value exchanges and sale of insights. For sectors such as agriculture and aviation, the data services are critical for them to be profitable. Some customers are beginning to create additional revenue streams out of their data, so more value-based pricing of data services could emerge in the future.
For example, vehicle manufacturers are considering how they can monetise the data they gather from intelligent and autonomous cars to build additional revenue streams.
Data products go through a similar process to software products: an idea is tested, development work is done and the product is piloted. Feedback from pilots leads to further development or, in some cases, the project may be stopped. The product is rolled out as a small pilot, after which the consumer base is grown. Data scientists, business development managers and lawyers need to work together closely. Typically, TWC provides the platform and analytics expertise and meteorologists, while customers provide knowledge of their business.
Technical and data curation arrangements
The platform is proprietary and has been built using open source software augmented with IBM governance products and internal components. It can accommodate real-time data and continuous feeds, such as radar data, which are necessary for weather prediction. Data about the habits of consumers comes through mobile devices and is used to support the advertising business. For most B2B business, weather, location and human data are blended to deliver specific value to each industry. Raw records, organised datasets or other forms of information are all used. All new data is catalogued, and GDPR is driving the cataloguing of historic data.
Data sets are curated in as automated a way as is possible. When a data scientist identifies a new source of data, a ‘pipeline process’ follows whereby data is catalogued, its provenance is captured and quality checks are run on the data. First, data enters a ‘quarantine zone’ where it is manually checked and then verified against terms and conditions (T&Cs). If it passes these checks, the data scientist and others at TWC can access it if they have been granted permission. When new data is derived, or external data is to be used in a product offering, there is a more complex approval process involving lawyers, business development and data scientists, particularly before commercialisation.
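The pipeline described above can be illustrated with a minimal sketch. It is not TWC's implementation: the class, field and function names (`CatalogueEntry`, `Zone`, `run_quality_checks`, `release`, `can_access`), the example data source and the single completeness check are all assumptions made for illustration. The sketch only shows the general shape of the flow: catalogue the source with its provenance, hold it in quarantine, release it once quality, manual and T&C checks pass, and then gate access on granted permission.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Zone(Enum):
    QUARANTINE = "quarantine"
    AVAILABLE = "available"
    REJECTED = "rejected"


@dataclass
class CatalogueEntry:
    """Hypothetical catalogue record for one newly identified data source."""
    name: str
    provenance: str                 # where the data came from
    terms: set[str]                 # uses permitted by the source's T&Cs
    zone: Zone = Zone.QUARANTINE    # new data always starts in quarantine
    quality_ok: bool = False
    manually_checked: bool = False
    permitted_users: set[str] = field(default_factory=set)


def run_quality_checks(records: list[dict], required: set[str]) -> bool:
    """Illustrative check: at least one record, no missing required fields."""
    return bool(records) and all(
        r.get(k) is not None for r in records for k in required
    )


def release(entry: CatalogueEntry, intended_use: str) -> Zone:
    """Move data out of quarantine only if every check has passed."""
    passed = (
        entry.quality_ok
        and entry.manually_checked
        and intended_use in entry.terms   # verified against the T&Cs
    )
    entry.zone = Zone.AVAILABLE if passed else Zone.REJECTED
    return entry.zone


def can_access(entry: CatalogueEntry, user: str) -> bool:
    """Released data is visible only to users who were granted permission."""
    return entry.zone is Zone.AVAILABLE and user in entry.permitted_users


# Example run with made-up values.
entry = CatalogueEntry(
    name="station_feed",
    provenance="personal weather station network",
    terms={"internal_analytics"},
)
records = [
    {"station_id": "A1", "temp_c": 11.5},
    {"station_id": "B2", "temp_c": 12.0},
]
entry.quality_ok = run_quality_checks(records, {"station_id", "temp_c"})
entry.manually_checked = True
release(entry, "internal_analytics")
entry.permitted_users.add("data_scientist_1")
```

In this sketch a source whose intended use is not covered by its T&Cs is rejected even if its quality checks pass, which mirrors the ordering in the text. The more complex approval step before commercialisation is not modelled.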
Where data is sourced from private weather stations, much of the data needs to be validated for quality by cross-correlating it with other sources, because, for example, the presence of trees or buildings may affect measurements. Checks are also carried out on data from customers to ensure that it is free of personally identifiable information.
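As a rough illustration of cross-correlating station data with other sources, the sketch below flags stations whose reading differs from the median of independent reference readings by more than a tolerance. The function name, the station identifiers, the temperature values and the tolerance are all hypothetical, and a simple median comparison is an assumed stand-in for whatever validation TWC actually applies.

```python
from statistics import median


def flag_suspect_stations(
    readings: dict[str, float],
    reference: list[float],
    tolerance: float,
) -> list[str]:
    """Return ids of stations that deviate from the reference median
    by more than `tolerance` (same units as the readings)."""
    ref = median(reference)
    return sorted(
        station_id
        for station_id, value in readings.items()
        if abs(value - ref) > tolerance
    )


# Made-up temperature readings in degrees Celsius.
station_readings = {"pws_1": 12.1, "pws_2": 17.8, "pws_3": 11.7}
reference_readings = [12.0, 11.8, 12.3]   # e.g. nearby trusted sources
suspects = flag_suspect_stations(station_readings, reference_readings, 2.0)
```

A flagged station is not necessarily faulty; in this sketch the flag only marks a reading for further checking, for instance where nearby trees or buildings may be distorting the measurement.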
Standards are used for commonly shared data, such as location. However, much commercial data is proprietary and is unique to the system that created it.
The data catalogue and associated processes are essential to the business model. The data platform has been engineered to support large volumes of data, be resilient and able to scale up when big weather events lead to high demand on weather services. The shared data lake for data product and analytics development provides a common platform for governance.
Security is ensured by having dedicated experts carry out penetration testing and monitoring to detect data breaches. In addition, products cannot go into production unless they have the right security controls in place.
Legal and commercial arrangements
T&Cs on data sharing are a critical part of contract negotiation, both for acquiring data and insights and for selling them. The governance capability at TWC is mature and flexible enough to cope with different business experiments.
A two-way agreement is required, with obligations and permissions on both sides. The agreement covers the data lifecycle and quality. The quality of data must be appropriate for its use, and the agreement addresses the data that it is blended with and its timeliness. Insufficient data quality may cause a new venture to fail. Data checking is automated wherever beneficial. Sometimes the contract is an agreement to exchange data between parties and no money changes hands. Contracts have Service Level Agreements (both internal and external) and the company operates legally within those parameters.
Data governance underpins all activities and is beneficial to the business. Individuals work more effectively if they operate within the governance processes because they are designed to improve efficiency. Strong controls exist on new data acquisition, particularly if it forms part of a contract with a customer. More lightweight curation is necessary for data science experiments. However, as these move closer to commercialisation, checks and processes become more stringent. The data lifecycle is managed through the catalogue. GDPR will require significant amounts of new infrastructure.
Outcomes and lessons learned
- Often companies do not know the value of their data. Business leaders and decision-makers need to have a mature understanding of the value of data they hold; for example, there have been situations where a sales team is focused on selling hardware and physical services and gives away the data for free, even though the data (or any insight that can be derived from it) is the most valuable asset. It is important to treat data as both an asset and a liability, and an ethical approach and respecting T&Cs are fundamental aspects - even open data has T&Cs. It is challenging to establish ownership rights on derivative work and to understand the chain of custody. There needs to be lateral flow of data and the associated coordination, rather than a siloed, command-and-control organisational structure. For example, one part of a business may need to incur the cost of collecting data to benefit another part of the business; even inside a business, T&Cs may be required.
- The value of the data depends upon its use and not the cost of collecting or processing it. A business makes a choice about whether to sell raw data, augmented data, derived data or a solution; the higher up in the pyramid, the higher the value, effort and risk. Data has a lifetime that can be short (this is particularly true in the advertising business, for example), and this has to be factored into the pricing. It is challenging to ascertain what the value of data should be in advance, but a piloting stage can be used to validate an approach. TWC helps customers to explore the monetisation of their data.
- The barriers to participation are unique to each industry. For example, healthcare has its own unique regulations. Different industries have different views on the value of data and the conditions under which data should be collected and used. Several elements of the platform have helped to enhance trust, such as standardised T&Cs and quality measures. It is important to get a people-focused governance programme in place as soon as possible.
- “Always respect data – like the ocean, it can bite you … as well as serve you.”