29 March 2024 Dealing with missing and late arriving values with OtherSymbol and other solutions Share this message At Bitmetric we continuously hone our skills, and we like to help you do the same. That’s why we regularly post a new Qlik certification practice question to our LinkedIn company page. Some time ago we asked the following Qlik Data Architect certification practice question about dealing with missing or late arriving values in your source data. This was somewhat of a trick question, as there are multiple (somewhat) correct answers. The correct answers are A and C, and in some cases even D Of course, the right solution often depends upon the context of the problem. Let’s take a look at all three of these possible solutions and see how they work, and in which cases they might be useful. But first, what problem are we solving? Facts without a corresponding dimension, what’s the problem? Consider the (simplified) data model below. When looking at the Salesperson and Sales tables, we see there are sales for SalespersonId 4 and 5 that don’t have a corresponding salesperson. After we load this data into Qlik, the issue becomes clear. When no data is selected, sales made by these missing SalespersonId’s is included in the total. These sales are excluded when all salespersons are selected. This can be confusing for users and lead to incorrect conclusions being drawn from the data. It’s advisable to do something about it. ℹ Intermezzo: what causes missing dimension values?There can be multiple causes for missing dimension values. A very common one being timing discrepancies known as early-arriving facts (or late-arriving dimensions), the data for the dimension table is simply lagging behind and will be loaded later. Another cause can be that sometimes the dimension value is just not available. For example, in a retail setting, some customers might pay with cash and the retailer has no further information about them. Now that we know what the issue is, let’s look at 3 ways we can resolve it. Remove non-corresponding values from the fact table The first possible solution is the one least likely to be correct. Instead of dealing with the missing dimension values, we simpy remove the facts that don’t have a corresponding dimension value: We can accomplish this using statements such as KEEP, or WHERE EXISTS. For example with the following script: Salesperson:RIGHT KEEP (Sales)LOAD * INLINE [ SalespersonId, Salesperson 1, Winnie the Pooh 2, Tigger 3, Piglet ]; Problem solved, right? Well… only if we can be absolutely sure that these facts aren’t needed. For example because it’s test data, or when the business rule states that only sales that can be attributed to a salesperson are considered actual sales. At Bitmetric, we don’t encounter this scenario frequently. Typically, it’s better to make it clear to the user that certain dimension values are missing, and to make them selectable for further analysis. We’ll look at two different ways to accomplish this next. Add missing values with WHERE NOT EXISTS The first approach is to add the missing values to the dimension table in the script. The steps to do this are: Copy all dimension keys in the dimension table to a temporary field, in a temporary table; Concatenate the dimension keys in the fact table to the dimension table, but only those keys that do not exist in the temporary field we created in the previous step. For this we use WHERE NOT Exists(). Drop the temporary table and field. In our example, the script looks like: Now the missing dimensions values are visible to the user, and selecting all values yields the same result as selecting no values: This solution works well, but there’s another solution that’s even lazier simpler. Let’s take a look at that next. The OtherSymbol value handling variable The second approach is to use the OtherSymbol value handling variable. This approach seems to be less well known than the previous approach, but is a lot easier to implement. The steps are: Set the OtherSymbol variable to a ‘dummy’ value that will be used as the key value for the missing dimension value Add the unknown value as the last record of the dimension table In our example, this results in the following script: Of course, in a typical scenario we would CONCATENATE the record to the existing dimension table. The result is shown below. All unknown salespersons are now mapped to the Unknown salesperson and are now selectable by the user. This works, but how? The OtherSymbol method works by adding values that exist in the same field in previously loaded tables, but haven’t been found yet in the table that is currently being loaded. this means that there are two things we need to keep in mind: The fact table must be loaded before the dimension table The unknown/OtherSymbol record need to be the last record of the dimension table Another thing to keep in mind is that OtherSymbol creates a row for each unknown value. Because only a single salesperson Unknown is show we might assume all sales are linked to a single record. In reality, the same record is created for each value: WHERE NOT Exists() or OtherSymbol You may wonder what the best method is better. As mentioned at the start of this post, it all depends on the context of the problem. To get the first question out of the way, performance between both methods is roughly equal. Besides performance, the pro’s and cons of both methods are: WHERE NOT Exists() ✅ Pro Well known and understood pattern More flexible when scripting, for example including the ID in the label ❌ Con More script OtherSymbol ✅ Pro Simple Less script ❌ Con Very specific load pattern, which leaves room for error Less flexible If pressed, at Bitmetric we’ll choose the WHERE NOT Exists() method over the OtherSymbol method. It’s good to be aware of both methods though, hopefully now you are too 😉 That’s it for this week, hope to see you again on a future blog post! More from the Bitmetric team Qlik Cloud Backup Protect your investment in Qlik with daily incremental backups stored in an encrypted environment with redundant storage. Available for as little as 2 Euro per day. Learn more. Masters Summit for Qlik The Masters Summit for Qlik provides the next step in your path to becoming a Qlik specialist. As an extra perk, Bitmetric has secured an over $450 discount just for you! Use code BITMETRIC at checkout. Read more here. Data Model Friday Qlik Test Prep Qlik Solution How can we help? Barry has over 20 years experience as a Data & Analytics architect, developer, trainer and author. He will gladly help you with any questions you may have. Call us Mail us 12 May 2025 Dagelijkse Qlik Ondersteuning: altijd hulp bij data analyse Bij Bitmetric snappen we hoe belangrijk het is dat je Qlik omgeving gewoon goed werkt. Of je nu dagelijks data analyseert of snel wat inzichten nodig hebt, je wilt dat alles soepel verloopt. Daarom staan onze experts elke dag voor je klaar. Van kleine vragen tot ingewikkelde uitdagingen, we helpen je graag verder, zodat jij je kunt focussen op wat echt telt! Qlik 25 April 2025 Game-Changer in Qlik: Set Analysis Now Works WITHOUT Using It’s Syntax! Discover Qlik Cloud’s latest feature that lets you apply object level filters without writing any set analysis syntax. A simpler and faster way to build dashboards, especially for non-technical users. Read more in this blog post. New Release Qlik 23 April 2025 When Everyone Has Different Numbers: Why Data Alignment Matters Different teams, different data, different results. This post explores how misaligned data leads to confusion, and how TimeXtender helps bring everyone back to the same page. TimeXtender
12 May 2025 Dagelijkse Qlik Ondersteuning: altijd hulp bij data analyse Bij Bitmetric snappen we hoe belangrijk het is dat je Qlik omgeving gewoon goed werkt. Of je nu dagelijks data analyseert of snel wat inzichten nodig hebt, je wilt dat alles soepel verloopt. Daarom staan onze experts elke dag voor je klaar. Van kleine vragen tot ingewikkelde uitdagingen, we helpen je graag verder, zodat jij je kunt focussen op wat echt telt! Qlik
25 April 2025 Game-Changer in Qlik: Set Analysis Now Works WITHOUT Using It’s Syntax! Discover Qlik Cloud’s latest feature that lets you apply object level filters without writing any set analysis syntax. A simpler and faster way to build dashboards, especially for non-technical users. Read more in this blog post. New Release Qlik
23 April 2025 When Everyone Has Different Numbers: Why Data Alignment Matters Different teams, different data, different results. This post explores how misaligned data leads to confusion, and how TimeXtender helps bring everyone back to the same page. TimeXtender