Going from zero to R-Analytics with your team

[This article was first published on R-Analytics, and kindly contributed to R-bloggers].

Before continuing with the posts about how to do things with R, I have decided to describe how I led the creation of an analytics team starting from zero.

My only intention here is for this information to be useful for companies looking to create their analytics team.

Well, first things first: the people for your team.

How many people you need will depend on your company's size and on how much money the company wants to invest in creating the analytics team, but you must have at least 2 developers. Just ensure they are real developers.

Next, you need to define the software to use.

Database: for this I suggest starting small, say with MariaDB, or SQL Server if you want to pay for a license. Just ensure you can run analytic functions, which some people call window functions; they are very useful when doing analytics. Please read this article to learn more about window functions: https://dzone.com/articles/mysql-8-vs-mariadb-comparison-of-window-functions
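To give an idea of what a window function does, here is a minimal sketch in Python. It uses the in-memory SQLite database from the standard library only as a stand-in for MariaDB or SQL Server (that is an assumption for the sake of a runnable example, and it needs SQLite 3.25 or newer). The `sales` table and its values are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MariaDB / SQL Server here (assumption);
# window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, month INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("north", 1, 100.0),
        ("north", 2, 150.0),
        ("north", 3, 120.0),
        ("south", 1, 80.0),
        ("south", 2, 90.0),
    ],
)

# Each row keeps its own detail and also gets a per-region running total
# and a rank by amount; no GROUP BY collapses the rows.
query = """
SELECT region,
       month,
       amount,
       SUM(amount) OVER (PARTITION BY region ORDER BY month) AS running_total,
       RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS amount_rank
FROM sales
ORDER BY region, month
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The same `OVER (PARTITION BY ... ORDER BY ...)` syntax should work in any database with window function support, which is why it is worth checking for before you choose one.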

ETL software: unless you want to pay for a license from IBM, Microsoft or another vendor, my suggestion is to use a simple language, as we did with PowerShell, and create your own basic libraries, such as:

– SQL functions to read or write data to your database.
– Log functions to write to and delete from the log text files.
– E-mail functions to send simple e-mails or e-mails with attachments.

The point here is to create all the basic functions an ETL software needs.
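As a rough idea of how small these libraries can start, here is a sketch of the three kinds of helpers. It is written in Python rather than PowerShell and uses SQLite from the standard library as a stand-in database, so it is an illustration under those assumptions and not the code we actually used. All function names (`read_rows`, `write_rows`, `write_log`, `delete_old_logs`, `build_email`) are hypothetical.

```python
import sqlite3
import time
from datetime import datetime
from email.message import EmailMessage
from pathlib import Path


def read_rows(conn, sql, params=()):
    """SQL function: run a SELECT and return all rows."""
    return conn.execute(sql, params).fetchall()


def write_rows(conn, sql, rows):
    """SQL function: run a parameterized INSERT/UPDATE for many rows."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(sql, rows)


def write_log(log_path, message):
    """Log function: append one timestamped line to a text log file."""
    stamp = datetime.now().isoformat(timespec="seconds")
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(f"{stamp} {message}\n")


def delete_old_logs(log_dir, max_age_days):
    """Log function: delete *.log files older than max_age_days."""
    cutoff = time.time() - max_age_days * 86400
    removed = []
    for path in Path(log_dir).glob("*.log"):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed


def build_email(sender, recipient, subject, body, attachment=None):
    """E-mail function: build a message, optionally with one attachment.

    attachment is a (filename, bytes) tuple. Sending is left out on purpose;
    it could be done with smtplib against your own mail server.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    if attachment is not None:
        filename, data = attachment
        msg.add_attachment(
            data, maintype="application", subtype="octet-stream", filename=filename
        )
    return msg


# Small usage example with an in-memory database (hypothetical table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total REAL)")
write_rows(conn, "INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])
print(read_rows(conn, "SELECT COUNT(*), SUM(total) FROM orders"))
```

Every ETL script can then import the same few functions, so database access, logging and notifications behave the same way everywhere.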

One more important reason why I like PowerShell is that we have Visual Studio Code, a very useful tool for editing and debugging PowerShell scripts.

Besides the ETL software, it is important to build the database structure used to configure the ETLs. For this I designed something called AMS (Automatic Monitoring System); it holds the ETL configuration and logs, and with the help of the team a web tool was created to configure each ETL we were creating.

Dashboard software: I like Tableau, but you can also get open source software: https://opensource.com/business/16/11/open-source-dashboard-tools-visualizing-data
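The actual AMS design is not shown in this post, so the following is only a guess at the general idea: one table that says which ETLs exist and whether they are enabled, and one table that records what happened on each run. The table names (`etl_config`, `etl_log`), columns, job names and the `run_enabled_etls` function are all hypothetical, and SQLite again stands in for the real database.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE etl_config (
        etl_name TEXT PRIMARY KEY,
        enabled  INTEGER NOT NULL DEFAULT 1
    );
    CREATE TABLE etl_log (
        etl_name TEXT NOT NULL,
        run_at   TEXT NOT NULL,
        status   TEXT NOT NULL,
        detail   TEXT
    );
    """
)
conn.executemany(
    "INSERT INTO etl_config VALUES (?, ?)",
    [("daily_sales", 1), ("broken_feed", 1), ("old_report", 0)],
)


def run_enabled_etls(conn, jobs):
    """Run every enabled ETL and write one log row per run."""
    # Read the names first so we are not iterating a cursor while inserting.
    names = [
        row[0]
        for row in conn.execute(
            "SELECT etl_name FROM etl_config WHERE enabled = 1 ORDER BY etl_name"
        )
    ]
    for name in names:
        try:
            rows_loaded = jobs[name]()
            status, detail = "OK", f"{rows_loaded} rows"
        except Exception as exc:  # a failed ETL is logged, not fatal
            status, detail = "ERROR", str(exc)
        run_at = datetime.now().isoformat(timespec="seconds")
        conn.execute(
            "INSERT INTO etl_log VALUES (?, ?, ?, ?)",
            (name, run_at, status, detail),
        )
    conn.commit()


def load_daily_sales():
    return 42  # pretend 42 rows were loaded


def load_broken_feed():
    raise RuntimeError("source unavailable")


jobs = {
    "daily_sales": load_daily_sales,
    "broken_feed": load_broken_feed,
    "old_report": lambda: 0,  # disabled in etl_config, so never called
}
run_enabled_etls(conn, jobs)
log = conn.execute(
    "SELECT etl_name, status, detail FROM etl_log ORDER BY etl_name"
).fetchall()
for entry in log:
    print(entry)
```

Keeping the configuration in tables is what makes a web tool on top of it possible: the tool only has to edit rows, and the log table gives you something to monitor.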

Documentation: you will need at least 2 kinds of documents:

– A basic spreadsheet document to create an activities plan; you really do not need anything complex, nor do you need to pay for special software.
– A Word or LibreOffice document template to create requirement specifications from the users. It is very important to have this document, because it lets everybody see the same thing. If you do not create a requirement specification, there will be confusion about what really needs to be developed.

At this point we have all the elements needed for a basic business intelligence team; this is the first part. For this first part, if your developers are good, they will not need any paid training: putting all of the above to work requires only very basic knowledge that the developers can learn by themselves.

Now, for each project, you need to involve a key user from the company's operations and an analyst, the person who serves as the interface between operations and your analytics team.

Converting your Business Intelligence team to an Advanced Analytics team.

Learning how to do advanced analysis on your data using mathematical algorithms requires specific training, not only to understand the basic operation of the algorithm to be used in each scenario, but also to know how to create models and how to do the different kinds of analysis required before modeling. That is why there are now new positions in analytics teams, such as data engineer, data scientist, etc.

The first thing is to get good training. We used promidat.com; these are people with good knowledge and experience in training. The site is in Spanish, but if you require it, they can give you the training in English.

It is important to have more than one person taking the training at the same time because of the feedback they can receive from each other.

You can use one of the many training sites like DataCamp, which is good, but nothing is the same as official training.

It is important that, at the end of the training, your team creates real projects that are useful for the company, not just projects to pass the training level. By doing this, you ensure that everything learned is applied to practical cases in the company.

Fulfilling this strategy took 2 and a half years because nobody taught me how to do it; I had to figure out the strategies to follow and guide the team in the right direction. I think that, knowing the strategy, it would take maybe 1.5 years.

Enjoy it!

Carlos Kassab
