IntroductionThe following blog post is based on a classical reporting task most of the people working in BI got frequently: Besides all the nice dashboards you create in Tableau, from time to time people will approach you with a request for a “data extract” – reports that typically looking like cross tables. Ideally, the extract comes as an Excel file with each crosstab in a separate sheet ready to be analyzed with the favorite tool of the requester. How will you do that in a programmatic way with Tableau? Extracting data out of the Tableau “ecosystem” to process them in another tool is clearly not part of the philosophy of the Tableau guys. They would argue that you should give those people access to Tableau so that they can do their analysis within their tools. Although this is a good point they miss some important things – first Tableau is expensive compared to solutions that come for free or solutions that are already part of the company’s software portfolio (for example Excel with free add-ons like Power Query, Power View & Power Pivot). Given that often there is a no extra or only a small fixed budget, not everybody will get a full copy of Tableau. Second, changing the standard tool for data analysis and visualization (in a broad sense) within an organization needs a lot of budget, time and power. Procedures and templates needs to be changed, people need to be convinced, trained and accompanied. Even with an impressive and sometimes magical tool like Tableau it is not for granted that this project will be a success down to the smallest corner of your company. And that’s why sometimes we need to use Tableau for purposes that are not in its focus – and extracting data out of a workbook into an excel file seems to be such a thing.
Problem statementImagine you have a workbook containing of different sheets all in the form of cross tables. You can see a simple example in the screenshot below (and find a link for downloading the workbook at the end of the posting): If you want to extract all those crosstabs into one Excel file, you have to download them manually (as “Crosstab”) and the move them into one Excel file one-by-one. This is time-consuming & error-prone (e.g. typing mistakes when naming sheets, order of sheets in excel, default size of cells may result in unreadable content) especially if you have to do it multiple times and/or for several reports. Sadly ‘tabcmd’ is not a help here as it exports csv in key-value format instead of maintaining the crosstab layout (see this tableau forum post for what is going on – and this one for voting up an idea convincing Tableau to integrate the crosstab export into tabcmd). That said, I will show a way using free third-party tools to export all three views into on Excel (plus: basic formatting of values is preserved).
Solution OverviewI got the basic idea for how to extract crosstabs from a forum posting given by Christopher Cannata “Hello everyone, I was tasked with a project to allow these crosstab CSV files to be generated automatically each morning for our company. Unfortunately tabcmd was not capable of allowing us to automate the process but I was able to develop a test suite in Selenium IDE call it from a batch script on a daily basis.”. Using a software testing framework for web applications like Selenium to automate the csv extraction process is a hell of a good idea for the first part of our question. And it turned out that PowerShell is a good approach for the second part. PowerShell is simple and yet powerful and avoids a lot of problems with encoding, formats and data types. I ran into these problems when I tried reading the raw csv and exporting it as excel using several custom libraries for Python. But as I use Python for most tasks regarding scripting and automation around Tableau, I took it again in this example to orchestrate the whole workflow. Before we go deeper into the different steps, the picture below shows the whole story:
Extracting Crosstabs from Tableau Server using SeleniumSelenium is a software testing framework for web applications. As such, it can record user actions and afterwards “replay” those activities again and again on new versions of the app, testing if those action still lead to the expected result. We just need the capabilities to record and automatically run those macros (called ‘test cases’ – with a set of test cases called ‘test suite’). Here, a test case mimics user actions to download a single csv file as crosstab with all the necessary steps like – login into Tableau Server, select and view the right workbook and downloading the csv file. The installation is easy and requires two tools – the Selenium IDE for recording test cases (installed as a plugin for Firefox) and the Selenium server for running test suites. The later one is just a jar file, which is started from command line. Now start your Firefox browser and you should see a new icon for starting the developer GUI. The small red icon on the right side shows that Selenium is now recording every step you are doing in Firefox. Type in your Tableau Server URL, login into Tableau Server, open your report and download the crosstab as csv. As you interact with Tableau Server you will see that every step show up in the central window of your Selenium IDE. After you finished downloading the first csv, log out and stop the recoding process. Then save you test case. You can now test if everything works fine by starting the automatic “replay”. If not, change commands or add new ones. It is difficult to give a general template about how to download a crosstab from Tableau Server because the whole workflow and order of commands depends heavily on your profile details like start page, page layout, language and so on. Nevertheless I attached an example test case at the end of the post that should give some guidance on how to fine-tune a test case and provide some general tips:
- It helps, if you change the configurations of your Firefox profile or use dedicated one for this task – see here for a list of things that need to be adjusted.
- Set a dedicated download folder in your Firefox profile – I found no way to tell Selenium how to go through the download manager dialog.
- Selenium will not recognize that you sometimes wait for Tableau to load a workbook. You have to insert separate “pause” commands later within your test case to give Tableau time for processing before Selenium will take the next step. The length of the pause needs to be set by you depending on how long Tableau needs for loading plus some buffer.
- Create a separate test case for every view in Tableau that should be downloaded – that means your last step within a test case should be to sign out from Tableau Server.
- Test cases and test suites are stored as plain html files and can therefore easily edited using a text editor.
- -htmlSuite “*firefox”
- -t trustAllSSLCertificates (to avoid certificate checking errors)