Every Friday at Bitmetric we post a new Qlik certification practice question on our LinkedIn company page. Last Friday we asked the following Qlik Data Architect certification practice question about subset ratios in Qlik Sense:

### The correct answer is C: 28% of the CustomerIDs have never placed an order

To validate the quality of a data model you have just created, the data model viewer is an important tool. One of the things you can check there is the **subset ratio**: the number of distinct values of a field present in the selected table, expressed as a percentage of the total number of distinct values of that field across the whole data model. To demonstrate this, see the following image:

In the image above we have selected the key field **CustomerID** in the **Sales table**. The bottom half of the screen shows that **CustomerID** has 100 distinct values across all tables in the model, not just in the selected table; this is shown as *Total distinct values*. Within the **Sales table** alone there are 72 distinct values, shown as *Present distinct values*. So of the 100 distinct **CustomerIDs** in the model, 72 occur in the **Sales table**. Dividing the two gives the **subset ratio**:

Present distinct values / Total distinct values = 72 / 100 = 0.72, or 72%
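The same calculation can be sketched outside Qlik. The Python snippet below is only an illustration: the function name `subset_ratio` and the sample IDs are hypothetical and simply mirror the 72-out-of-100 example above, they do not come from Qlik itself.

```python
def subset_ratio(table_values, all_tables_values):
    """Share of the model's distinct key values that are present in one table."""
    present = set(table_values)              # present distinct values
    total = set().union(*all_tables_values)  # total distinct values in the model
    return len(present) / len(total)


# Hypothetical data: 100 customers, of which 72 appear in the sales rows.
customer_ids = list(range(1, 101))
sales_ids = [cid for cid in range(1, 73) for _ in range(3)]  # keys repeat per order

sales_ratio = subset_ratio(sales_ids, [customer_ids, sales_ids])
print(f"{sales_ratio:.0%}")  # 72%
```

Note that repeated keys in the sales rows do not change the result, because only distinct values are counted.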

In this model, a total of 100 distinct values means that the **Customer table** must hold all 100 distinct **CustomerIDs**. A look at the data model viewer confirms this:

There are 100 distinct values present in the **Customer table**. Subtracting the 72% of the **Sales table** from the total of 100% leaves 28% of the CustomerIDs in the model (in this case all present in the **Customer table**) that have never placed an order.
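As a rough cross-check of this reasoning, the customers without orders are the set difference between the two tables. The sets below are made-up stand-ins for the key values of the Customer and Sales tables, chosen to match the numbers in the example.

```python
# Hypothetical key values, chosen to match the example above.
customer_table = set(range(1, 101))  # 100 distinct CustomerIDs
sales_table = set(range(1, 73))      # 72 distinct CustomerIDs with at least one order

all_ids = customer_table | sales_table          # every CustomerID in the model
never_ordered = customer_table - sales_table    # customers that never appear in Sales

share_never_ordered = len(never_ordered) / len(all_ids)
print(len(never_ordered), f"{share_never_ordered:.0%}")  # 28 28%
```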

### Other things to keep in mind

Some other things to keep in mind about the subset ratio:

- What if the subset ratio of the dimension table is also lower than 100%?

If the subset ratio of the **Customer table** had also been below 100%, there would be a discrepancy between the **Customer table** and the **Sales table**: each table would contain values that are not present in the other. For a fact table a subset ratio below 100% is not uncommon, but for a dimension table, like the **Customer table** in the example, it means you should take a closer look at the data in the model. If, for example, the subset ratio of the Customer table had been 90%, then 10% of the distinct **CustomerIDs** in the model would occur only in the Sales table, without a matching **CustomerID** in the Customer table.
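To make the 90% scenario concrete, here is a small sketch with invented key values. The orphaned keys are the ones that appear in the Sales table but have no counterpart in the Customer table; the ID ranges are assumptions picked only so the Customer table ends up at 90%.

```python
# Hypothetical key values: the Customer table is missing IDs 91..100.
customer_table = set(range(1, 91))   # 90 distinct CustomerIDs
sales_table = set(range(20, 101))    # 81 distinct CustomerIDs, some unknown to Customer

all_ids = customer_table | sales_table
customer_ratio = len(customer_table) / len(all_ids)

# Keys in the fact table without a matching row in the dimension table.
orphans = sorted(sales_table - customer_table)

print(f"{customer_ratio:.0%}", len(orphans))  # 90% 10
```

Listing the orphaned keys like this is usually the first step in finding out why the dimension table is incomplete.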

- What if the total of the subset ratios is 100%?

If the subset ratios of all tables add up to exactly 100%, there are no matching values between the tables. Good luck 😉
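For two tables this follows directly from counting: the ratios add up to 100% only when the distinct values of both tables together equal the total, which leaves no room for shared values. A minimal sketch with hypothetical key values:

```python
# Hypothetical key values with no overlap at all between the two tables.
customer_table = {1, 2, 3, 4, 5, 6}
sales_table = {7, 8, 9, 10}

total = len(customer_table | sales_table)   # 10 distinct values in the model
customer_ratio = len(customer_table) / total
sales_ratio = len(sales_table) / total

# Compare counts rather than floats: the ratios sum to 100% exactly when
# the per-table distinct counts add up to the total distinct count.
ratios_sum_to_100 = len(customer_table) + len(sales_table) == total
shared_keys = customer_table & sales_table

print(ratios_sum_to_100, shared_keys)  # True set()
```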

That’s it for this week. See you next Friday?
