Hello, Analysis-Ready Open Data
11th June 2021
By Michael Amadi and Open Data Blend Team
Open Data Issues
UK open data has a tonne of untapped potential. However, large data volumes and overwhelming data complexity can become major obstacles when trying to realise its value.
Most open data is small in volume, but it is often the very large datasets that contain the most impactful and actionable insights; they are good candidates for building business intelligence and advanced analytics solutions. The size of the data alone is not the issue; it is the challenge of efficiently transforming millions or billions of data points into an analysis-ready state. This is a task that typically requires a data engineering skill set and, even for a seasoned data engineer, creating, maintaining, monitoring, and enhancing the necessary data pipelines is a considerable amount of work.
Although open, the data and its context are often scattered across several semi-structured and unstructured data sources such as CSVs, Excel workbooks, Word documents, web pages, and even PDFs. It is quite common for the structure of these sources to change over time, sometimes without warning. Columns get dropped, moved, and renamed, and delimiters get swapped for something else. Add to this that the data needs to be carefully combined in a meaningful way, and you can see why a considerable amount of time is spent on data preparation rather than insight discovery.
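To make the structural drift described above concrete, here is a minimal Python sketch of the kind of check a data pipeline might run before loading a new file. It guesses the delimiter from the header row and reports columns that have gone missing, appeared unexpectedly, or moved. The column names (`test_id`, `make`, `test_date`) and the sample file content are hypothetical and are not taken from any real source file.

```python
import csv
import io

# Delimiters commonly seen in published CSV-style files (an assumption for this sketch).
CANDIDATE_DELIMITERS = [",", ";", "\t", "|"]


def guess_delimiter(header_line):
    """Pick the candidate delimiter that occurs most often in the header line."""
    return max(CANDIDATE_DELIMITERS, key=header_line.count)


def read_header(text):
    """Return the guessed delimiter and the list of column names."""
    first_line = text.splitlines()[0]
    delimiter = guess_delimiter(first_line)
    header = next(csv.reader(io.StringIO(text), delimiter=delimiter))
    return delimiter, header


def describe_drift(expected, actual):
    """Compare the expected column list with what the file actually contains."""
    expected_set, actual_set = set(expected), set(actual)
    shared_in_expected_order = [c for c in expected if c in actual_set]
    shared_in_actual_order = [c for c in actual if c in expected_set]
    return {
        "missing": [c for c in expected if c not in actual_set],
        "unexpected": [c for c in actual if c not in expected_set],
        "reordered": shared_in_expected_order != shared_in_actual_order,
    }


# Hypothetical schema and a hypothetical new release of the same file, in which
# the delimiter changed, a column was renamed, and two columns swapped places.
EXPECTED_COLUMNS = ["test_id", "make", "test_date"]
sample = "make;test_id;test_dt\nFORD;1;2020-01-01\n"

delimiter, header = read_header(sample)
report = describe_drift(EXPECTED_COLUMNS, header)
print(delimiter, report)
```

A check like this only flags the problem. Deciding whether `test_dt` is a renamed `test_date` still takes human judgement, which is part of why data preparation eats so much time.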
Use Our Open Data Services
We have set out to solve the large data challenges and data complexity issues specifically for high-value UK open data. Today, we are launching two open data services: Open Data Blend Datasets and Open Data Blend Analytics. Each service provides access to three refined, high-value, and high-volume open datasets: Anonymised MOT Tests and Results, Great Britain Road Safety, and NHS English Prescribing. Over time we will add more datasets, prioritising quality over quantity.
Introducing Open Data Blend Datasets
The Open Data Blend Datasets service is primarily aimed at data engineers, data analysts, and data scientists. Our data engineering team have been hard at work building out a sustainable service that curates large open data from the UK, transforms it into analysis-ready datasets, enriches it with derived values, and makes it available through a frictionless open data catalogue.
- Openly licensed data
- All datasets are dimensionally modelled (i.e. star schemas)
- Open Data Blend Dataset UI - A fast and lightweight data catalogue user interface
- Open Data Blend Dataset API - An open access data catalogue and bulk data API that is built on open standards
- Data file downloads in CSV (Gzipped), Apache ORC, and Apache Parquet formats
- Rich metadata that includes data sources, column data types, column descriptions, and useful links (e.g. to documentation from the original source)
- 99.5% uptime SLA for paid plans
Browse through the Open Data Blend Datasets.
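As a rough illustration of how a dimensionally modelled dataset delivered as Gzipped CSV can be used, the sketch below joins a small fact table to a dimension table and counts rows per dimension value. It uses only the Python standard library and builds its sample files in memory, so nothing is downloaded. The table layouts and column names (`make_key`, `make`, `test_id`, `test_result`) are hypothetical and do not reflect the actual Open Data Blend schemas.

```python
import csv
import gzip
import io
from collections import Counter


def make_gzip_csv(rows):
    """Build an in-memory Gzipped CSV so the sketch runs without any downloads."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return gzip.compress(buffer.getvalue().encode("utf-8"))


def read_gzip_csv(data):
    """Decompress Gzipped CSV bytes and return the rows as dictionaries."""
    with gzip.open(io.BytesIO(data), mode="rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))


# Hypothetical dimension table: one row per vehicle make.
dimension_file = make_gzip_csv([
    ("make_key", "make"),
    (1, "FORD"),
    (2, "BMW"),
])

# Hypothetical fact table: one row per test, pointing at the dimension by key.
fact_file = make_gzip_csv([
    ("test_id", "make_key", "test_result"),
    (101, 1, "PASS"),
    (102, 1, "FAIL"),
    (103, 2, "PASS"),
])

dimension_rows = read_gzip_csv(dimension_file)
fact_rows = read_gzip_csv(fact_file)

# Star-schema join: look up each fact row's dimension attribute by its key.
make_by_key = {row["make_key"]: row["make"] for row in dimension_rows}
tests_per_make = dict(Counter(make_by_key[row["make_key"]] for row in fact_rows))
print(tests_per_make)
```

For files with millions of rows, a columnar format such as Parquet and a dataframe or query engine would normally be a better fit than row-by-row Python. The join pattern stays the same.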
Introducing Open Data Blend Analytics
The Open Data Blend Analytics service enables data analysts and data scientists to analyse our datasets at the speed of thought. Our business intelligence team have worked tirelessly to bring you a service that will significantly reduce your time to insights and business value. Open Data Blend Analytics surfaces our datasets through a rich semantic data model that can be connected to and analysed from BI tools like Excel, Power BI, and Tableau.
- Openly licensed data
- Predefined calculations including sums, counts, averages, and percentages
- Predefined drill-down hierarchies to support rapid report creation
- Support for BI client tools such as Excel, Power BI, and Tableau for drag-and-drop report building
- For each dataset, at least two years of history plus the latest available year to date
- Query the data model directly using Data Analysis Expressions (DAX) or Multidimensional Expressions (MDX) from client tools like DAX Studio 2 and SQL Server Management Studio 18
- 99.5% uptime SLA
Learn more about Open Data Blend Analytics.