Thursday, December 24, 2015

Chart Adjusts to Whatever Fieldnames are Loaded

I’m often called on to validate a new data file from a client or to examine a new data extract. My first step is to load data into a copy of Steve Dark’s Data Profiler (search on quickintelligence data profiler). I’ve made a few modifications to Steve’s original document – here are a couple of the most useful.

I added a line to the loadscript that uses preceding load to add a record number to the rows being loaded. My line is added between the table label and the first line of the regular Load statement. It looks like this:

DATAFILE:  //table label
Load *, recno() as zrecno;  //this is the preceding load line
Load   //this is the beginning of your "regular" Load statement

For the second set of modifications, I added a new sheet or tab which I named Table.
On the new tab, I added a multibox and then I added, not fieldnames, but 20 expressions. You can add any number you like but 20 seemed like plenty. These are mostly copy and paste so they are not a lot of typing and you can start by copying the expression shown in this blog and pasting it into your document. The first multibox expression looks like this:

[$(=only({1<$FieldNo={1}>} $Field))]

That expression evaluates to the fieldname corresponding to field #1. Then I add a second expression to the multibox:

[$(=only({1<$FieldNo={2}>} $Field))]

You can see that in the second expression, I changed $FieldNo={1} to $FieldNo={2} so that the second expression evaluates to the fieldname corresponding to field #2. I did that for each of the 20 expressions so that the 20th expression in the multibox contains $FieldNo={20}

Then, I added a straight table chart. I used zrecno as the dimension. Then, I added 20 expressions. For the first expression, I typed this:

[$(=only({1<$FieldNo={1}>} $Field))]

And for the label, I typed this:

$(=only({1<$FieldNo={1}>} [$Field]))

Again, the rest of the expressions are similar, each time changing the $FieldNo={1} to be 2 for the second expression and its label, $FieldNo={3} for the third expression and its label and so on for each of the 20 expressions. Click OK when you’re done.

Now, you have a multibox and a straight table chart that appear as though you typed in the 20 fieldnames. But, these objects will adjust themselves to whatever fieldnames are loaded. If you load a file with less than 20 fieldnames, the “extra” ones will just be null.

You can copy and paste these objects into other QlikView documents and they will adjust themselves to whatever the first 20 fieldnames are in the document (see important info in first blog comment below). But, I think these are most useful in a document where different fieldnames can be loaded each time.

I checked Dropdown Select for all columns in the Presentation tab to make it easier for the document user to make selections on the chart. 

In order that we don’t burn up all of the memory on the computer, I added a calculation condition to the straight table chart with this expression =if(count(zrecno) < 200000,1,0) so that the chart will refuse to appear if there are more than 200,000 rows. I customized the error message for the calculation condition to say, “There’s more than 200k rows. You have to make some selections to reduce the amount of data in this chart.” A calculation condition like this is a good idea whenever you have an object that might try to show more data than is possible with the available memory.


Tuesday, December 1, 2015

Easy to Add Multivariate Analysis Feature

Here is a technique that I’ve used several times over the past year to quickly add a multivariate analysis feature to a document. We have numerous documents that have a multibox for selections and usually a graph and at least one chart; usually a straight table with a cycle group for the dimension. I usually start with one of these existing documents.
This feature uses QlikView alternate states. I normally steer people away from using alternate states in documents used by business users as it can be confusing. But, in this technique, it seems to be easily used and understood.

The first step, is to add an alternate state to the document. Using the menu, choose Settings->Document Properties, then choose the General tab and click the Alternate States… button (if you don’t see the button then you may be using an older version that doesn’t have the Alternate States functionality). On the Alternate States dialog box that appears, click Add and then simply type a  B  and then click OK. This defines an alternate state in the document named B. It’s not a very descriptive name but it is perfect for this technique.

Now, we’re going to duplicate the multibox. Right-click on the multibox and choose Copy to Clipboard -> Object. Then, from the menu choose Edit->Paste. QlikView will paste the new copy right on top of the old one so you need to drag the copy of the multibox off to the right. This is easiest if the multibox properties Show Caption option is checked so you can click on the title bar at the top of the multibox and drag it. Make the caption of the original multibox say Selections:   and make the caption on the new copy of the multibox say  B Selections:   It may help to make the caption font on the new copy in italics.

Easy, so far? Now, right-click on the new copy multibox and choose Properties. Choose the General tab and look for the Alternate State field at the top. Click the little down arrow on the Alternate State box and click to choose B and then click OK. You now have two multiboxes: one that controls the usual selections and one that controls an entirely different set of selections in alternate state B.

Here's the part that makes it useful. In your charts, you probably have a column something like this:   Sum( Sales )  with the label Sales on it. Add a new column, label it Sales B, and put it right next to the original column. For the expression, type  Sum({B} Sales) 
The syntax is the same as for set analysis. But, this is simpler, just use the {B} inside the argument for any aggregation function and it makes the aggregation use the selections that are defined in your B Selections multibox. No need to change the chart dimensions or change anything in the load script.

Now, you can compare the column for Sales with the Sales B column with the numbers side by side. I assume the fields most useful for the data were already defined in the multibox. So, you can change the B selections to easily compare, for example, June sales next to March sales; or compare tier 1 customers versus tier 2 customers; or profit from the 1kg size versus the 500g size, or inventory turns at the Memphis warehouse versus turns at the combined California warehouses or total cost of the cheesy-beans business versus costs with administrative costs unselected,… etc.

As quickly as you can add a new field to the two multiboxes, you can have a side by side comparison of data for different selections on that field.

I’ve found that graphs that use this technique should use a similar but noticeably different appearance for the two expressions. Using the example from above, a graph of Sales could use a red line in the line graph and use a dotted red line for Sales B.

You can use color in the chart to point out the biggest differences between the regular aggregation and the B aggregation. I’ve sometimes used the Colormix wizard to color the rows with the biggest differences in bright yellow and the smaller differences with less bright shades of yellow (the Colormix wizard is described in the manual and it is well worth your time to become familiar with it).
It would be just as easy to also add C Selections but, so far, I haven’t found a case where it adds value.