25 January 2024

Qlik Optimized Load and Concatenation

Share this message
How to maintain optimized load in Qlik Sense when concatenating tables.

At Bitmetric we continuously hone our skills, and we like to help you do the same. That’s why we regularly post a new Qlik certification practice question to our LinkedIn company page. Last Friday we asked the following Qlik Data Architect certification practice question about optimized load and concatenation in Qlik Sense. Do you know the answer?

Qlik Sense Optimized Load  concatenating tables with the CONCATENATE LOAD statement.

Many (confidently) chose answer C, however, the correct answer was actually B.

The correct answer is B

Both tables in figure 2 load optimized, but only the first table in figure 1 loads optimized. See the logs from both load statements below (click for larger image).

Figure 1

Qlik Sense concatenate unoptimized load

Figure 2

Qlik Sense concatenate optimized load

As expected, the relative load time is significantly faster for the fully optimized load, though in absolute terms the difference is neglible.

Maintaining optimized load in Qlik Sense

We covered the basics of optimized load in our very first Friday Qlik Test Prep question on Qlik optimized load. To summarize, a load statement is optimized when you limit your operations to:

  • Renaming fields (using an alias). You can also load the same field twice under a different alias. This can be useful to create a separate key field.
  • Omitting fields by not including them in the LOAD statement
  • Use a single WHERE EXISTS, with a single parameter. So WHERE EXISTS([Country]) is OK, WHERE EXISTS([Country], [ISO Country Code]) is not.
  • JOIN, KEEP or CONCATENATE with another table
  • LOAD DISTINCT will also keep a load optimized. The DISTINCT part will be processed after the LOAD however, so you might still want to think twice before applying it to very large QVDs.

Neither LOAD statement breaks any of these rules, so there must be some additional factors to take into consideration. Let’s see what they are.

When concatenating tables, an additional rule applies: subsequently loaded tables must contain all the columns from the tables that were loaded before. However, previously loaded tables do not need to contain all columns from tables that are loaded later. Visually:

Optimized load maintained

Optimized load broken

So, to maintain an optimized load while concatenating tables, all we have to do is load and concatenate the tables in ascending order of width. Sounds simple, right? Unfortunately, it can get a bit more complex than that. Tables might not always share the same columns. A column that appears in one table, might be missing from another table, and vice versa.

Consider the scenario below. We first load Table 1 with three columns, to which we concatenate Table 2. As the second tables misses some columns from the first table (B and C), the resulting load will be unoptimized.

Concatenating tables in Qlik when the fields aren't the same breaks optimized load

To fix this, we need to add the missing columns to the second table. Theoretically, this ensures the load remains optimized.

Optimized concatenate load in Qlik Sense with blank columns

There’s a practical issue though: if we create the missing fields in the script with a null() function or an empty string, optimized load is broken. To solve this issue, we need to go upstream, to the moment when the QVD files are created.

A common approach is to add null values with the proper field name for each ‘overlapping’ column in the QVD, as shown in the script below.

Add null values to QVD

This solves the issue, but isn’t optimal. It can be hard to predict which QVDs will need to be combined in the future. Different QVDs may be combined in different ways in different apps, which means that we’ll potentially need to create a lot of empty fields, cluttering files and reducing ease of maintenance.

An alternative solution is to add a single null column to your QVDs by default (typically only the fact table QVDs). This column can then be flexibly aliassed for each column that needs to be created. No need to go back to the upstream app creating the QVD.

In the script below, the QVD contains an empty field called [Null] which is loaded as the (non-existing) columns [A] and [B].

Including a default null value in your Qlik Sense QVD to maintain optimized load during concatenation

The result: the optimized load is maintained during concatenation, with only a single additional column in the source QVD:

To wrap it up, by lining up your tables in the right order and adding either dummy columns or a universal null column, you’re not just keeping your load optimized; you’re playing it smart. This isn’t just about making your Qlik Sense work faster; it’s about making it work smarter. And as we’ve seen, with a bit of know-how and a few strategic moves, you can keep your data flowing smoothly, no matter how tricky the concatenation gets. So, go ahead, put these tips to the test in your next Qlik Sense project and watch the magic happen!

Take your Qlik skills to the next level!

Since 2013, the Masters Summit for Qlik is the premier advanced training for Qlik. Join us in Vienna and take your Qlik skills to the next level.

Join the team!

Do you want to work within a highly-skilled, informal team where craftsmanship, ingenuity, knowledge sharing and personal development are valued and encouraged? Check out our job openings.

Friday Qlik Test Prep Performance Script Solution

How can we help?

Barry has over 20 years experience as a Data & Analytics architect, developer, trainer and author. He will gladly help you with any questions you may have.