
Analytics on Open Data


11th June 2021

By Michael A

Analytical Workloads

Analytical workloads can generally be divided into three main areas: data engineering, business intelligence, and advanced analytics.

Data Engineering

Data engineers are responsible for creating the data pipelines that enable data consumers, such as data analysts, data scientists, and machine learning engineers, to deliver insightful reports and machine learning or artificial intelligence models. Without some form of data engineering, getting significant value from complex or large data can quickly become an inefficient and overwhelming task.

The ideal scenario for a data engineer, when it comes to data acquisition, is to be provided with a frictionless and consistent bulk data API that can be used to quickly ingest the required data into a data lake or analytics platform. To support this scenario, the API must provide metadata about when the data was last updated and include versioned data file endpoints.
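As a minimal sketch of why that metadata matters, the Python below decides whether a dataset needs to be ingested again and picks the newest versioned file for a given format. The metadata layout, field names (`last_updated`, `files`, `version`), and URLs are hypothetical and chosen for illustration only; they do not describe any particular API.

```python
from datetime import datetime, timezone

# Hypothetical metadata document, shaped the way a bulk data API might
# describe a dataset. Field names and URLs are illustrative assumptions.
EXAMPLE_METADATA = {
    "name": "example-dataset",
    "last_updated": "2021-06-01T09:30:00+00:00",
    "files": [
        {"format": "parquet", "version": "2021-05-01",
         "url": "https://example.com/v/2021-05-01/data.parquet"},
        {"format": "parquet", "version": "2021-06-01",
         "url": "https://example.com/v/2021-06-01/data.parquet"},
        {"format": "csv", "version": "2021-06-01",
         "url": "https://example.com/v/2021-06-01/data.csv"},
    ],
}


def needs_refresh(metadata, last_ingested):
    """Return True when the dataset changed after the last ingestion.

    `last_ingested` is a timezone-aware datetime, or None if the dataset
    has never been ingested.
    """
    last_updated = datetime.fromisoformat(metadata["last_updated"])
    return last_ingested is None or last_updated > last_ingested


def latest_file_url(metadata, file_format):
    """Pick the newest versioned file endpoint for the requested format."""
    candidates = [f for f in metadata["files"] if f["format"] == file_format]
    if not candidates:
        raise ValueError(f"no files published in format {file_format!r}")
    # ISO-style version strings (YYYY-MM-DD) sort correctly as plain text.
    return max(candidates, key=lambda f: f["version"])["url"]


if __name__ == "__main__":
    previous_run = datetime(2021, 5, 15, tzinfo=timezone.utc)
    if needs_refresh(EXAMPLE_METADATA, previous_run):
        print("Download:", latest_file_url(EXAMPLE_METADATA, "parquet"))
```

With a check like this, a pipeline only downloads data when something has actually changed, and the versioned endpoint makes each ingestion reproducible.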

In addition to a bulk data API, the data would ideally be modelled for analytical workloads (i.e. as star schemas) and made available in one or more open data file formats that are optimised for interactive querying patterns, such as Apache ORC and Apache Parquet. Once ingested, queries could be executed directly against the data files using open source data lake query engines such as Apache Drill, Apache Spark, Dremio, and Trino. An experience of this nature would significantly reduce the time to data insights.
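To illustrate the kind of query a star schema makes easy, here is a small sketch that joins one fact table to two dimension tables and aggregates the result. It uses Python's built-in `sqlite3` module with an in-memory database purely as a stand-in for a data lake query engine reading ORC or Parquet files; the table names, column names, and rows are made up for the example.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny, made-up star schema: one fact table, two dimensions."""
    conn.executescript(
        """
        CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE fact_sales (date_key INTEGER, product_key INTEGER, quantity INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20210101, 2021, 1), (20210201, 2021, 2)],
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?)",
        [(1, "Books"), (2, "Games")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?)",
        [
            (20210101, 1, 3),
            (20210101, 1, 2),
            (20210101, 2, 5),
            (20210201, 1, 4),
            (20210201, 2, 1),
        ],
    )


def quantity_by_month_and_category(conn):
    """Join the fact table to its dimensions on surrogate keys and aggregate."""
    query = """
        SELECT d.year, d.month, p.category, SUM(f.quantity) AS total_quantity
        FROM fact_sales AS f
        JOIN dim_date AS d ON d.date_key = f.date_key
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY d.year, d.month, p.category
        ORDER BY d.year, d.month, p.category
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    for row in quantity_by_month_and_category(connection):
        print(row)
```

The same join-then-aggregate shape carries over to engines that query columnar files directly, because every dimension relates to the fact table through a single key.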

Business Intelligence

Data analysts are responsible for surfacing insights through business intelligence (BI) reports and dashboards. Before they can do this, they must first acquire the relevant domain knowledge and transition through different phases of data exploration and data understanding. The less friction that they face, the faster they can deliver the valuable, impactful, and actionable data insights that their organisation needs.

Recent trends have seen an increase in the number of data analysts who can transform, analyse, and report on data using languages like Python, R, and Julia. This elite group of data analysts can use these languages to query data in data lakes or on their local machines. Even so, most data analysts would arguably prefer to use BI tools like Excel, Power BI Desktop, and Tableau Desktop to analyse data and build interactive reports and dashboards for their organisation. This is primarily because these tools offer simple drag-and-drop user interfaces that make these tasks very efficient.

Advanced Analytics

Data scientists typically ask for data at the most granular level available. Like the data analyst, they must have or acquire the relevant domain knowledge and progress through phases of data exploration and data understanding. Once a sufficient level of data understanding has been achieved, a data scientist will transition to a data interpretation phase where they perform hypothesis testing. Their conclusions inform the choice of the most appropriate machine learning or artificial intelligence models for their business problem.

It is not uncommon to hear that data scientists spend over 80% of their time acquiring and preparing data, and less than 20% of their time performing actual data science work. A better scenario would be for nearly all of this time to be spent on the tasks that deliver real business value.



Use Our Open Data Services


Open Data Blend Datasets

Data engineers can use the Open Data Blend Dataset API to programmatically ingest our datasets into their analytics platform. They can choose to get the data in CSV, ORC, or Parquet format, or all three, depending on their downstream data consumption requirements. Each of our datasets is accompanied by details of when the data was last updated, the data type of each column, and a description of each column to provide additional context.

Data analysts can use the Open Data Blend Dataset UI to download data files in CSV, ORC, or Parquet format onto their local machines and analyse them in tools like RStudio, Azure Data Studio, Power BI Desktop, and Tableau Desktop.

Data scientists can use the Open Data Blend Dataset UI to download the data locally, or the Open Data Blend Dataset API to programmatically pull data into their analytics platform. Once the data has been acquired, they can use notebooks like Jupyter to explore the data, test their hypotheses, and train machine learning models. Because our datasets have already been optimised for analytical workloads, the data scientist only needs to perform trivial joins and light data transformations (e.g. feature engineering) to prepare the data for machine learning models.
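As a rough sketch of what a trivial join plus light feature engineering can look like, the Python below looks up each fact row's date in a date dimension and derives two simple features. The rows, column names, and key format are hypothetical and are not taken from any actual dataset.

```python
from datetime import date

# Hypothetical date dimension keyed by a surrogate date key (YYYYMMDD).
DIM_DATE = {
    20210605: date(2021, 6, 5),
    20210607: date(2021, 6, 7),
}

# Hypothetical fact rows that reference the date dimension.
FACT_ROWS = [
    {"date_key": 20210605, "quantity": 2},
    {"date_key": 20210607, "quantity": 1},
]


def add_date_features(fact_rows, dim_date):
    """Join each fact row to the date dimension and derive simple features."""
    enriched = []
    for row in fact_rows:
        calendar_date = dim_date[row["date_key"]]  # the "trivial join"
        enriched.append(
            {
                **row,
                "month": calendar_date.month,
                # weekday() returns 5 for Saturday and 6 for Sunday.
                "is_weekend": calendar_date.weekday() >= 5,
            }
        )
    return enriched


if __name__ == "__main__":
    for features in add_date_features(FACT_ROWS, DIM_DATE):
        print(features)
```

In practice the same step would more likely be written with a dataframe library, but the shape is the same: one key lookup per dimension, followed by a few derived columns.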

Experience our open data catalogue first hand.

Open Data Blend Analytics

Data analysts can connect to the Open Data Blend Analytics model from BI tools like Excel, Power BI Desktop, and Tableau Desktop, and dive straight into their data analysis. Because there is no need to download or model any of the data upfront, data analysts can begin creating reports and dashboards as soon as they have a good understanding of the data.

Like data analysts, data scientists can connect to the Open Data Blend Analytics model from any of the supported BI tools and start their exploratory data analysis (EDA). This means they can reach the required level of data understanding more quickly. After the EDA, data scientists can download the data files from the corresponding Open Data Blend Datasets and use languages like Python, R, and Julia to build and train their machine learning models.

Learn more about our interactive analytics service.

Follow Us and Stay Up to Date

Keep up to date with Open Data Blend by following us on Twitter and LinkedIn. Be among the first to know when there's something new.

Blog hero image by UX Indonesia on Unsplash.


Copyright © 2019-2023 Nimble Learn Ltd. All rights reserved unless otherwise stated. Company Registration Number 08637310. VAT Number 174 9728 60.

Open Data Blend®, the Open Data Blend® logo, Nimble Learn®, and the Nimble Learn® logo are registered trademarks of Nimble Learn Ltd. All other product names, logos, and brands are the property of their respective owners, and their use does not imply endorsement.
