Saturday, March 14, 2009

Browse-for-File Macro Button

Several times I’ve created QlikView reports that must read data from a file but the name of the file can vary and must be supplied by the report user. Often this is a file that will be used by the loadscript so the user first clicks a macro button that runs a browse-for-file dialog and then the user clicks the standard Reload button.

Here’s an example of a macro that borrows the browse file dialog method from the Microsoft Excel object model. This macro should work on any PC that has MS Excel installed (it does not actually start Excel). You can copy the code from here and paste it into your QlikView report module code:

Sub Browse_for_File
' Get the file pathname
Set oXL=CreateObject("Excel.Application")
f_name=oXL.GetOpenFilename("All Files (*.*),*.*",,"Select file",False)
If f_name="False" then
  'user cancelled out of dialog box
  Set oXL=nothing
  Exit sub
End If
'store file pathname in the file_pathname variable
Set oXL=nothing
End Sub

This macro stores the file pathname for the file selected by the report user into a document variable named file_pathname (create the new document variable before running the macro). When I create a report that uses this macro I usually also put an input box in the report that shows the file pathname stored in the variable and allows the user an alternative method for specifying the pathname.

A QlikView document example of this technique, named BIG_Text_File.qvw, is available here – Sorry about the ads and popups – just click on the specific file download.

Monday, March 9, 2009

Finding Missing Dates With a Chart

Yesterday I was reviewing a report of sales and shipment data for 2007 and 2008. The data was very detailed with order numbers and quantities and shipdates. I wanted to know if any shipdates were missing (dates for which no shipping occurred) or if there were really shipments for every day in 2007 and 2008. This is how I looked for any missing dates:

I created a straight table chart with SHIPDATE as the dimension. I created two expressions. One expression, labeled Missing Date, looked like this:

If(Not IsNull(Above(SHIPDATE)),If(SHIPDATE-Above(SHIPDATE)<>1,Date(Above(SHIPDATE)+1)))

And the other expression, labeled Missing Date Count, looked like this:


The Chart Above() function lets the expression look at a value from the row above. Since the chart is sorted by SHIPDATE, the SHIPDATE value in the current row should be one day greater than the row above. If it isn’t then there is at least one missing date. The check on IsNull is required because there is no row above for the first row on the chart.

The chart worked ok. I tried it out on a sample of the data where I had intentionally excluded two dates from loading in the loadscript and it worked fine.
The total line for the Missing Date Count tells me if there are any missing dates. With this method, though, you have to scroll through down through the chart to find the specific missing dates. You can’t sort the chart to bring missing dates to the top because that would interfere with the Above() function in the expression (I disabled the interactive sort capability for this chart to make sure nobody tried changing the sorted order). The other drawback is that a gap of several consecutive missing dates only shows the one missing date that begins the gap (the Missing Date Count though shows a count of how many dates are missing in that gap).

This expression works in a chart with more dimensions too. For example, the dimensions might be ship-to-country and shipdate. Then the chart would show missing shipping dates (or dates where no shipping occurred) by ship-to-country. It only makes sense though to look for missing shipping dates within data where there is an assumption about shipping every day or every week day in order that your assumptions are tested versus the data.

I know that there are methods that could be used in the loadscript to look for missing dates and there are sql queries that can look for missing members of a sequence in the database. But, if anyone knows of a better method for finding missing dates using only QlikView sheet objects I’d be interested to know how.

Sunday, March 8, 2009

Using QlikView to Look at a Big Text File

This week I was loading data from a very large text file into a database and had some problems with some of the records. I needed to look at the group of records in the file that was causing the problem but the file was big, over 12 million records, too big for Notepad or Wordpad. I decided to load it into QlikView – not the most elegant use for QlikView but setting it up was fast and it worked out better than I hoped.

I put these lines into the loadscript:

Load RecNo() as recordno,
Len(@1) as recordlen,
@1 as RECORD
FROM E:\Data\bigtextfile.dat (ansi, txt, delimiter is '`', no labels)
where RecNo()<1000000;>

That RecNo() function in the Where clause limited the load to the first million records. I tried loading all 12 million rows but on my laptop with 2GB of memory the loaded report was pretty sluggish. I defined the backtick character as a delimiter even though it doesn’t appear in the data so I could load the entire record by calling it field @1.

The main tab on the report itself was a table box for all three fields and list boxes for each of the three fields. Selecting by recordno let me look at specific sections of the file. The list box for RECORD was very useful – all I had to do was type a few characters of the data I was looking for and hit the Enter key to select it and QlikView quickly showed me what I needed from the file.

It worked so well that I added a browse-for-the-file macro and changed the load statement to use variables that are defined with input boxes on the report to make it a useful, general purpose report for whenever I need to examine a text file. An example of this document, named BIG_Text_File.qvw, is available here – Sorry about the ads and popups – just click on the specific file download.