24 August 2022

How to clean data in Qlik?

Share this message

Every Friday at Bitmetric we’re posting a new Qlik certification practice question to our LinkedIn company page. Last Friday we asked the following Qlik Data Architect certification practice question:

Qlik data architect prep question

The correct answer is B: Map … Using …

Of course the obvious discrepancy in the data being loaded in this example was clear to everyone. There are not only three different ways in which the United States is being loaded, but Holland and The Netherlands are also (really) one and the same country. According to the question the solution should be found in the load script and we are looking for the best solution.

The best solution in this case being the answer the Qlik certification exam wants to hear: map using. However since this is by no means a bad solution and could very well be applicable to certain applications, ApplyMap() is also commonly used for ease of use and understanding in the script. So in perspective of maintainability of the script it might even be the better solution.

In the example we have several fields over multiple tables which have the same data quality issues. Therefor the best fix would be to use map using to cover all fields with a single function. Let’s dive in and see how this works.

First of we will need a mapping table. In the example below we have created an inline table, but in most real life scenario’s an attached or stored excel file with the mappings would be the solution to go for. A mapping table is a table consisting of two fields. The first field being the value we are looking for and the second being the new value we would like to have it replaced with. The load or select statement is being prefixed with mapping to tell Qlik this is a mapping table. A mapping table is automatically dropped at the end of the load.

In the mapping table being loaded in this example, we change all values to either ‘The Netherlands’ or ‘ The United States’.

Next up is to tell Qlik which fields we would like to change the values of and with the use of which mapping. This is done by map using. Please notice, that it is important to use this function before the tables are loaded. Qlik will evaluate map using after loading the table, but before storing. So before loading the different dimension tables in the example we add the following line to the script:

This then results in Qlik evaluating the values of these fields before storing the table and applying when necessary, resulting in a nice and cleansed data model:

Some other things to notice:

Unmap

The map function will keep running until the end of the script. This is stopped by using the unmap statement. But notice that the unmap syntax is unmap (fieldlist). So you have to specifically name the fields you want to stop being mapped, not the mapping table being used. You can also use a wildcard (*) to stop all fields from being mapped.

ApplyMap

Map using will only evaluate and change the value after loading the table and while it is being stored. If you want to make changes on an expression level it is better to use ApplyMap(), since with this function you have the option to give a default value if nothing is found. The syntax is as following:

ApplyMap('map_name', expression [ , default_mapping ] )

If you leave the default mapping empty, the results are the same as with map using. However, pay attention that if we use a default mapping with the mapping table we made in this example, the values for Belgium, Germany and France will be replaced by the default value. Let’s take the following expression for example:

ApplyMap('MAP_Country', Country, 'No value') AS Country

In map using Qlik will only change the values from the mapping table, but leave all others in place, since the whole table has already been loaded. However, ApplyMap() is already evaluating during the load of each row and it will replace values it cannot find by the default option. Looking back at the mapping table, we don’t have values for Belgium, Germany and France, so these will be loaded as ‘No value’.

That’s it for this week. And remember:

Are these questions too easy for you?

Masters Summit for Qlik 2022 - New Orleans and Madrid

If these questions are too easy for you, then you may be ready for the Masters Summit for Qlik.

At the Masters Summit for Qlik, we take your Qlik skills to the next level in 3 days with deep dives about:

  • Qlik Data Modelling
  • Qlik Scripting
  • Set Analysis and Advanced Aggregations
  • Solution Architecture & Performance
  • Effective Visualizations
  • Performance Tuning
  • Qlik Sense integration

The courses are taught by Qlik veterans Rob Wunderlich, Oleg Troyanksy, Bill Lay, Nick Webster and Bitmetric’s Barry Harmsen. Together, this team brings almost 75 years of Qlik experience and has implemented hundreds of successful projects.

You can find more information and registration on the Masters Summit for Qlik website.

See you next week!

How can we help?

Feel free to contact us if you have any comments or questions.