
At work I use Tableau to visualize data and to let users filter the data they are looking at. The users don’t need to program. They can do everything with the mouse, which is great for people who don’t know exactly how to work with data. All the typical data wrangling is hidden below the glamorous surface.

Unfortunately for me, these dashboards are also created by mouse clicking. When you’re used to programming, it’s a little annoying to click here and there to do something you could do with five lines of code…

An example is changing the datasources of a Tableau workbook. Let’s say we’ve created a workbook with lots of worksheets, which are combined into several dashboards.

Now we want to apply all these calculations and visualizations to another set of datasources with the same structure but different content.

## The manual way (which is the standard Tableau-way!)

The standard way to replace the datasources is to add the new datasources to the workbook.

Then you have to switch the datasource of each worksheet by selecting the datasource and changing it via Data -> Replace datasource.

After changing each datasource manually, the blending relationships might be broken, so you have to fix them manually, too. To do so, right-click the field connecting the datasources and choose Replace References…

Then you can remove the old datasources.

I think that’s really painful and tedious. As an R-user I prefer a more painless way.

## The Document-API

Tableau has published a Python package called tableaudocumentapi. The documentation can be found here.

Unfortunately the documentation is very short and incomplete. According to the docs, you change a datasource (with its connections) like this:

```python
from tableaudocumentapi import Workbook

sourceWB = Workbook('WorkbookToUpdate.twb')

sourceWB.datasources[0].connections[0].server = "MY-NEW-SERVER"
sourceWB.datasources[0].connections[0].dbname = "NEW-DATABASE"
sourceWB.datasources[0].connections[0].username = "benl"

sourceWB.datasources[0].connections[1].server = "MY-NEW-SERVER"
sourceWB.datasources[0].connections[1].dbname = "NEW-DATABASE"
sourceWB.datasources[0].connections[1].username = "benl"

sourceWB.save()
```

The code above only works for remote datasources. If you’re using local CSV files, the result is an error.

Even when you’re using a datasource hosted on Tableau Online, the above code is not sufficient! If you run it, the datasource(s) will be changed, but the name will not.

There’s an attribute sourceWB.datasources[0].caption which should be changed, too.

But even then you can find references to the old datasource in the workbook file. You also have to call sourceWB.datasources[0].clear_repository_location().
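Since a .twb file is plain XML, one way to look for such leftovers is to scan it for attribute values that still mention the old datasource. The following is only a minimal sketch: the function name `find_leftover_references`, the name "OLD-SOURCE" and the tiny sample document are made up for illustration, and a real workbook is far larger and may also carry references outside of attribute values, which this sketch does not check.

```python
import xml.etree.ElementTree as ET


def find_leftover_references(root, old_name):
    """Return (tag, attribute, value) triples whose attribute value mentions old_name."""
    hits = []
    for element in root.iter():
        for attribute, value in element.attrib.items():
            if old_name in value:
                hits.append((element.tag, attribute, value))
    return hits


# Hypothetical, heavily simplified workbook content (not a real .twb file).
sample = """<workbook>
  <datasources>
    <datasource caption="MY-NEW-SOURCE" name="sqlproxy.1">
      <repository-location id="OLD-SOURCE" path="/datasources" />
      <connection server="MY-NEW-SERVER" dbname="NEW-DATABASE" username="benl" />
    </datasource>
  </datasources>
</workbook>"""

# Print every attribute that still points at the old datasource.
for hit in find_leftover_references(ET.fromstring(sample), "OLD-SOURCE"):
    print(hit)
```

For a real workbook you would pass `ET.parse('WorkbookToUpdate.twb').getroot()` instead of the sample string. An empty result only means that no attribute value contains the old name; it does not prove the workbook is clean.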

So I’m guessing (or hoping?) that the following code is sufficient:

```python
from tableaudocumentapi import Workbook

sourceWB = Workbook('WorkbookToUpdate.twb')

sourceWB.datasources[0].clear_repository_location()
sourceWB.datasources[0].caption = "MY-NEW-SOURCE"

sourceWB.datasources[0].connections[0].server = "MY-NEW-SERVER"
sourceWB.datasources[0].connections[0].dbname = "NEW-DATABASE"
sourceWB.datasources[0].connections[0].username = "benl"

sourceWB.datasources[0].connections[1].server = "MY-NEW-SERVER"
sourceWB.datasources[0].connections[1].dbname = "NEW-DATABASE"
sourceWB.datasources[0].connections[1].username = "benl"

sourceWB.save()
```
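The hard-coded indices in the code above assume exactly one datasource with two connections. For workbooks with a different layout, the same steps could be wrapped in a loop. This is a sketch under assumptions: the helper `retarget_workbook` and its `captions` mapping are hypothetical names of mine, not part of the Document API, and the helper uses only the attributes and the method already shown above, so it is no more proven to be sufficient than the original code.

```python
def retarget_workbook(workbook, server, dbname, username, captions=None):
    """Apply the steps shown above to every datasource and every connection.

    `workbook` is expected to behave like a tableaudocumentapi Workbook.
    `captions` is an optional (hypothetical) mapping of old caption -> new caption.
    """
    captions = captions or {}
    for datasource in workbook.datasources:
        # Drop the stored reference to the old published datasource.
        datasource.clear_repository_location()
        # Rename the datasource only if a new caption was supplied for it.
        if datasource.caption in captions:
            datasource.caption = captions[datasource.caption]
        # Point each connection at the new server and database.
        for connection in datasource.connections:
            connection.server = server
            connection.dbname = dbname
            connection.username = username
    return workbook


# Possible usage ("OLD-SOURCE" is a made-up caption):
#
#   from tableaudocumentapi import Workbook
#   sourceWB = Workbook('WorkbookToUpdate.twb')
#   retarget_workbook(sourceWB, "MY-NEW-SERVER", "NEW-DATABASE", "benl",
#                     captions={"OLD-SOURCE": "MY-NEW-SOURCE"})
#   sourceWB.save()
```

Passing the workbook in as an argument keeps the helper independent of how the file was opened, and nothing is written to disk until you call save() yourself.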

## Conclusion

The Document-API is a nice idea for automating the deployment of workbooks to several different datasources. Unfortunately its status seems to be “pre-alpha”. The last commit (besides changes to copyright notices) is over two years old.

So if you know of a workflow for changing the datasources of workbooks that really works, feel free to contact me.