
Analytics on Open Data


11th June 2021

By Michael A

Analytical Workloads

Analytical workloads can generally be divided into three main areas: data engineering, business intelligence, and advanced analytics.

Data Engineering

Data engineers are responsible for creating the data pipelines that enable data consumers, such as data analysts, data scientists, and machine learning engineers, to deliver insightful reports and machine learning or artificial intelligence models. It could easily be argued that, without some form of data engineering, getting significant value from complex or large data quickly becomes an inefficient and overwhelming task.

For a data engineer, the ideal data acquisition scenario is a frictionless and consistent bulk data API that can be used to quickly ingest the required data into a data lake or analytics platform. To support this scenario, the API must provide metadata about when the data was last updated and include versioned data file endpoints.
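
As a rough illustration of why last-updated metadata and versioned endpoints matter, the sketch below decides which data files an ingestion job needs to fetch. The metadata layout, the field names (`name`, `last_updated`, `url`), and the example URLs are all hypothetical and are not taken from any real API; a real pipeline would read them from the API's own metadata.

```python
from datetime import datetime, timezone


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp and normalise it to UTC."""
    ts = datetime.fromisoformat(value)
    if ts.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        ts = ts.replace(tzinfo=timezone.utc)
    return ts.astimezone(timezone.utc)


def files_to_ingest(metadata: dict, last_ingested: dict) -> list:
    """Return the URLs of files that are new or updated since the last ingest.

    `metadata` uses a hypothetical shape:
        {"files": [{"name": ..., "last_updated": ..., "url": ...}]}
    `last_ingested` maps a file name to the timestamp of its previous ingest.
    """
    pending = []
    for entry in metadata["files"]:
        previous = last_ingested.get(entry["name"])
        if previous is None or parse_ts(entry["last_updated"]) > parse_ts(previous):
            pending.append(entry["url"])
    return pending


# Hypothetical metadata, shaped as a bulk data API might return it.
metadata = {
    "files": [
        {"name": "fact-sales.parquet",
         "last_updated": "2021-06-01T00:00:00+00:00",
         "url": "https://example.com/v2/fact-sales.parquet"},
        {"name": "dim-date.parquet",
         "last_updated": "2021-01-01T00:00:00+00:00",
         "url": "https://example.com/v1/dim-date.parquet"},
        {"name": "dim-region.parquet",
         "last_updated": "2021-03-01T00:00:00+00:00",
         "url": "https://example.com/v1/dim-region.parquet"},
    ]
}

# State recorded by the previous pipeline run (also hypothetical).
last_ingested = {
    "fact-sales.parquet": "2021-05-01T00:00:00+00:00",
    "dim-date.parquet": "2021-01-01T00:00:00+00:00",
}

pending = files_to_ingest(metadata, last_ingested)
```

Here only the updated fact file and the previously unseen dimension file are selected, so an unchanged file is never downloaded twice. Because each URL points at a specific version, a rerun of the same job would fetch exactly the same bytes.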

In addition to a bulk data API, the data would ideally be modelled for analytical workloads (i.e. as star schemas) and made available in one or more open data file formats that are optimised for interactive querying patterns, such as Apache ORC and Apache Parquet. Once ingested, queries could be executed directly against the data files using open source data lake query engines such as Apache Drill, Apache Spark, Dremio, and Trino. An experience of this nature would significantly reduce the time to data insights.
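
To make the star schema idea concrete, here is a minimal sketch of the kind of query it enables. It uses Python's built-in `sqlite3` module purely as a stand-in for a data lake query engine, and the table names, column names, and values are invented for illustration. With an engine such as those listed above, the same style of SQL would typically run directly against the ORC or Parquet files.

```python
import sqlite3

# In-memory database standing in for a query engine over data files.
conn = sqlite3.connect(":memory:")

# A tiny, hypothetical star schema: one fact table and two dimension tables.
conn.executescript(
    """
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER);
    CREATE TABLE dim_region (region_key INTEGER PRIMARY KEY, region_name TEXT);
    CREATE TABLE fact_sales (date_key INTEGER, region_key INTEGER, amount REAL);

    INSERT INTO dim_date VALUES (20200101, 2020), (20210101, 2021);
    INSERT INTO dim_region VALUES (1, 'North'), (2, 'South');
    INSERT INTO fact_sales VALUES
        (20200101, 1, 10.0),
        (20200101, 1, 5.0),
        (20200101, 2, 7.5),
        (20210101, 2, 2.5);
    """
)

# A typical analytical query: join the fact table to its dimensions on
# their keys, then aggregate the measure by descriptive attributes.
query = """
    SELECT d.year, r.region_name, SUM(f.amount) AS total_amount
    FROM fact_sales AS f
    JOIN dim_date AS d ON d.date_key = f.date_key
    JOIN dim_region AS r ON r.region_key = f.region_key
    GROUP BY d.year, r.region_name
    ORDER BY d.year, r.region_name
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every join is a simple key lookup from the fact table to a dimension, which is what keeps queries over a star schema short and predictable.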

Business Intelligence

Data analysts are responsible for surfacing insights through business intelligence (BI) reports and dashboards. Before they can do this, they must first acquire the relevant domain knowledge and transition through different phases of data exploration and data understanding. The less friction they face, the faster they can deliver the valuable, impactful, and actionable data insights that their organisation needs.

Recent trends have seen an increase in the number of data analysts who can transform, analyse, and report on data using languages like Python, R, and Julia. This elite group of data analysts can use these languages to query data in data lakes or on their local machines. Arguably, though, most data analysts would still prefer to use BI tools like Excel, Power BI Desktop, and Tableau Desktop to analyse data and build interactive reports and dashboards for their organisation. This is primarily because these tools offer simple drag-and-drop user interfaces that make these tasks very efficient.

Advanced Analytics

Data scientists typically ask for data at the most granular level available. Like the data analyst, they must have or acquire the relevant domain knowledge and progress through phases of data exploration and data understanding. Once a sufficient level of data understanding has been achieved, a data scientist will transition to a data interpretation phase where they perform hypothesis testing. Their conclusions inform the choice of the most appropriate machine learning or artificial intelligence models for their business problem.

It is not uncommon to hear that data scientists spend over 80% of their time acquiring and preparing data, and less than 20% of their time performing actual data science work. A better scenario would be for nearly all of their time to be spent on the tasks that deliver real business value.



Use Our Open Data Services


Open Data Blend Datasets

Data engineers can use the Open Data Blend Dataset API to programmatically ingest our datasets into their analytics platform. They can choose to get the data in CSV, ORC, or Parquet format, or all three, depending on their downstream data consumption requirements. Each of our datasets is accompanied by details of when the data was last updated, the data type of each column, and a description of each column to provide some additional context.

Data analysts can use the Open Data Blend Dataset UI to download data files in CSV, ORC, or Parquet format onto their local machines and analyse them in tools like RStudio, Azure Data Studio, Power BI Desktop, and Tableau Desktop.

Data scientists can use the Open Data Blend Dataset UI to download the data locally, or the Open Data Blend Dataset API to programmatically pull data into their analytics platform. Once the data has been acquired, they can use notebooks like Jupyter to explore the data, test their hypotheses, and train machine learning models. Because our datasets have already been optimised for analytical workloads, the data scientist only needs to perform trivial joins and light data transformations (e.g. feature engineering) to prepare the data for machine learning models.
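
As a small, hedged illustration of what "trivial joins and light data transformations" can look like, the sketch below joins a fact table to one dimension and derives a few model features. The tables, column names, and values are hypothetical and do not come from any actual dataset, and plain Python structures are used only to keep the example self-contained.

```python
# Hypothetical dimension table, keyed by its surrogate key.
dim_vehicle = {
    1: {"fuel_type": "Petrol", "year_built": 2015},
    2: {"fuel_type": "Diesel", "year_built": 2010},
}

# Hypothetical fact table at the most granular level: one row per test.
fact_tests = [
    {"vehicle_key": 1, "test_year": 2020, "mileage": 50000, "passed": 1},
    {"vehicle_key": 2, "test_year": 2020, "mileage": 120000, "passed": 0},
]


def build_features(facts: list, dimension: dict) -> list:
    """Join each fact row to its dimension row and derive model features."""
    rows = []
    for fact in facts:
        # The join is a single key lookup because the data is a star schema.
        attrs = dimension[fact["vehicle_key"]]
        age = fact["test_year"] - attrs["year_built"]
        rows.append({
            "vehicle_age": age,
            # Guard against division by zero for vehicles built this year.
            "mileage_per_year": fact["mileage"] / max(age, 1),
            "is_diesel": int(attrs["fuel_type"] == "Diesel"),
            "label": fact["passed"],
        })
    return rows


features = build_features(fact_tests, dim_vehicle)
```

In a notebook, the same steps would more commonly be written with a data frame library such as pandas; the point is only that the join needs no cleaning or reshaping first.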

Experience our open data catalogue first hand.

Open Data Blend Analytics

Data analysts can connect to the Open Data Blend Analytics model from BI tools like Excel, Power BI Desktop, and Tableau Desktop, and dive straight into their data analysis. Because there is no need to download or model any of the data upfront, data analysts can begin creating reports and dashboards as soon as they have a good understanding of the data.

Like data analysts, data scientists can connect to the Open Data Blend Analytics model from any of the supported BI tools and start their exploratory data analysis (EDA). This means they can obtain the required level of data understanding at an accelerated rate. After the EDA, data scientists can download the data files from the corresponding Open Data Blend Datasets and use languages like Python, R, and Julia to build and train their machine learning models.

Learn more about our interactive analytics service.

Follow Us and Stay Up to Date

Follow us on X and LinkedIn to keep up to date with Open Data Blend, open data, and open-source data analytics technology news. Be among the first to know when there's something new.

Blog hero image by UX Indonesia on Unsplash.


Copyright © 2019-2025 Nimble Learn Ltd. All rights reserved unless otherwise stated. Company Registration Number 08637310. VAT Number 174 9728 60.

Open Data Blend®, the Open Data Blend® logo, Nimble Learn®, and the nimblelearn® logo are registered trademarks of Nimble Learn Ltd. All other product names, logos, and brands are the property of their respective owners, and their use does not imply endorsement.
