top of page
Search
Writer's pictureHendrik Speelman

A guide to shared datasets in Power BI

Hi there! In this blog post I elaborate more on the Power BI universe and how you can leverage the power of so called shared datasets. I intentionally did not add this topic to my blog post agenda, but when working for a lot of clients where you face the same problem, I thought it would come in handy to share my experience.


First of all, what is a dataset?

"A dataset is a collection of data that you import or connect to. Power BI lets you connect to and import all sorts of datasets and bring all of it together in one place. Datasets can also source data from dataflows. Datasets are associated with workspaces and a single dataset can be part of many workspaces." - Microsoft


In that sense, a dataset is a hub where all your data comes together to create your report. When you finished creating your report, you click 'publish' and your dataset and report are published to the Power BI Service. But what happens when you want to create more reports based on the exact same dataset throughout the organization, without copying the DAX measures all over again?


This is where the term "shared" dataset jumps in. This type of dataset is meant to be shared between multiple reports across the organization. This means people can create several reports on one single dataset.


What to take into account?

There are a few considerations to take into account when wanting to setup the shared dataset approach.

  1. Split dataset and report design to keep things clean

  2. Try to use the import mode as much as possible to evade capacity problems when multiple users runs their reports on your dataset

  3. Try to keep as much DAX logic (especially calculated columns) inside of your dataset

When following these three guidelines, you are all set to begin working with the shared dataset framework.


What is this framework?


The shared dataset framework can be a challenge to get at first, but when knowing the various concepts on who gets to do what, it can be really helpful for your organization that wants to elaborate on the self-service aspect of Power BI. It enables people to create their own reports, while datasets are mostly governed by IT or business champions.


Let's take a look: pay attention to the dotted lines which show the main aspect of working with shared datasets.

The process goes as follows:

  1. Dataset builder creates a dataset (which is a report) inside Power BI Desktop. You can see this as a connection file that will consolidate your sources. You can build additional tables here to group your measures and model your data to create relationships etc. The golden rule here is that this report may not contain any visuals. It only holds the connection to your datasource (import, direct query, dual, ...)

  2. The report is published to the Power BI Service. As you can see, Power BI publishes both the report and dataset as a separate entity.

  3. The data artist or report builder needs to create a report. The data of this report is contained in the dataset. The only thing that this person needs to do is to create a connection to the dataset which was already published in the Power BI Service. This connection is a live connection.

  4. When the data artist is done with the report development, the report is published and it is added in the lineage of the dataset. However, no additional dataset is created and one single version of the truth is kept in the shared dataset.

Pros versus cons of working with a shared dataset

PRO

CON

Keeps dataset and report apart from each other

A report is always published to the Power BI Service, which needs to be deleted afterwards.

Several reports can be created on one single dataset, which reduces maintenance cost on that dataset.

User is not able to create calculated columns, unless he creates a local dataset first

A single version of the truth can be created.

When additional data is needed and added to the dataset, it may impact other reports

Having a split in dataset and report design is a great way to make people grow in either Power Query or DAX

​

Conclusion

It really depends on your organizational needs if you need the shared dataset framework. However each time I work for a client that wants to roll out Power BI as a self-service BI tool, it seems that the shared datasets often win compared to having one dataset per report. One of the pros of working with this framework, is that it retains one single version of the truth, which is immediately a trigger for me to walk this path and providing as much guidance around this topic for my customer.

112 views0 comments

Recent Posts

See All

留言


bottom of page