26 October 2022

Choosing a data model for Qlik Sense

Share this message
Qlik data modelling - Bitmetric Friday Qlik Test Prep question

Every Friday at Bitmetric we’re posting a new Qlik certification practice question to our LinkedIn company page. Last Friday we asked the following Qlik Data Architect certification practice question about Qlik Sense data modelling:

Qlik data models question of the week

We got a lot of responses this time, showing a lot of good answers but also some discussions about the data models. And reading these discussions we do agree: The official answer should be: “it depends”. Which model is most efficient will always depend on the data provided or requirements of the project. However in case you find yourself sweating over a Qlik certification exam, you should always think “what does Qlik want to hear?“. And in that case:

The correct answer is C: A star schema

Since we have phrased the question as “generally speaking” we will not go into depth too much about data modelling, since that is a rabbit hole in itself. However to understand the answer, we need to understand that we are looking at dimensional modelling here and will explain why the star schema is most of the times the preferred model for Qlik.

Entity-Relationship Modelling

Back in the day when relational databases first started to appear Relational Database Management Systems made it easy to store, retrieve and modify the data from these databases. The type of data modelling used was the so called Entity-Relationship Modeling. This aims to Normalize all data. This is the process of removing all redundant data from the database and only storing useful, related data in separate tables. This process makes it very efficient in inserting and updating information in the database. On the downside however, we find that retrieving information from the ER-model databases goes from being less convenient to very difficult. For example a car-lease company could store car related dimensions completely normalized into separate manufacturer, model and fueltype tables. We can imagine that the bigger the dataset the bigger the query grows to retrieve information. Only to get all car information from a single car we need to go through three tables in this case.

Dimensional Modelling

In case Entity-Relationship Modelling becomes too complicated to query, dimensional modelling can offer a solution. A dimensional model consists of a single fact table. The fact table contains a compound primary key, with several other separate keys linking to dimension tables. Next to these keys all important metrics are stored in the fact table.

The dimension tables contain all data that gives context to the metrics stored in the fact table. These dimensions contain data that is often centered around certain levels which are all flattened (denormalized) to give context about a certain topic in the dataset. For example, if we take the database of a car-lease company, the car dimension could contain manufacturer, model and fueltype all in the same dimensional table. This makes that querying information on a single car only needs to go through one table instead of many different tables.

Different models

As we can see from the question asked that there are different types of dimensional models, which we will walk through in short fashion and see how these relate to Qlik. The biggest reasons to use dimensional modelling instead of ER-modelling are ease of use and ease of understanding. It is easier to understand for developers and users how the data is stored and retrieved and which types of data correlate with each other. We identify the following data models in dimensional modelling:

A snowflake schema

A snowflake schema is a partially denormalized schema. Visible in the fact that certain dimension tables are also linked to another table containing further information. With some fantasy the model looks like a snowflake under a microscope.

A star schema

The star schema consists of a single fact table, containing all the measures and the keys to the dimension tables. The biggest advantage being ease of use and understanding for the developers and users. The model is named star schema, since if you really squint your eyes it sort of resembles a star.

The biggest benefit of using dimensional modeling in Qlik applications is the increased response time. Qlik applications work faster when there are fewer links between the tables. For each link between the tables Qlik has to calculate all possibilities of the relationships, making more links increase this calculation time exponentially.

A single table

With the benefit of increased response time in mind, why wouldn’t a single table be the best solution? No links means no calculation time, right? As we stated in the beginning, the answer always is: “it depends”, the following diagram shows why a start schema is generally speaking considered the preferred model, since it offers the best balance between all various trade-offs.

That’s it for this week. And remember:

Are these questions too easy for you?

If these questions are too easy for you, then you may be ready for the Masters Summit for Qlik.

Masters Summit for Qlik New Orleans 2022

At the Masters Summit for Qlik, we take your Qlik skills to the next level in 3 days with deep dives about:

  • Qlik Data Modelling
  • Qlik Scripting
  • Set Analysis and Advanced Aggregations
  • Solution Architecture & Performance
  • Effective Visualizations
  • Performance Tuning
  • Qlik Sense integration

The courses are taught by Qlik veterans Rob Wunderlich, Oleg Troyanksy, Bill Lay, Nick Webster and Bitmetric’s Barry Harmsen. Together, this team brings almost 75 years of Qlik experience and has implemented hundreds of successful projects.

You can find more information and registration on the Masters Summit for Qlik website, or check out a photo impression of the previous edition in Madrid.

See you next week!

Friday Qlik Test Prep Solution

How can we help?

Feel free to contact us if you have any comments or questions.