# Processing Data from a Statistica Worksheet Using R

August 29, 2012
(This article was first published on Data and Analysis with R, at Work, and kindly contributed to R-bloggers)

Context: I work with data from non-profit organizations, so a big concern in many of my analyses is whether, and by how much, people's donations change from one year to the next. One of the things I normally like to do in my analyses is get a value for each person that represents how much their yearly donations are increasing or decreasing on average over 5 years (a simple slope from the regression of their giving values on the years that they gave). It was pretty simple and quick to do this in R for previous projects, so there was no hassle there. Now that we have Statistica in the office, my supervisor wants me to use it for our current project.
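For reference, the "simple slope" here is the ordinary least-squares slope. The symbols below are introduced just for illustration: $x_t$ is the year and $y_t$ is the amount a person gave in that year, with $\bar{x}$ and $\bar{y}$ their means over the years considered.

$$
b = \frac{\sum_{t} (x_t - \bar{x})(y_t - \bar{y})}{\sum_{t} (x_t - \bar{x})^2}
$$

Assuming a person has a giving value for each of five consecutive years, the centered years are $-2, -1, 0, 1, 2$ and the denominator is $10$, so the slope reduces to

$$
b = \frac{-2y_1 - y_2 + y_4 + 2y_5}{10}.
$$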

Problem: I was looking for a way to do the above slope calculation in Statistica for each row of a dataset of roughly 82,000 rows, and could not find one.

Solution: As I mentioned in my last post, it’s possible to feed your Statistica dataset into R using the statconn DCOM server, so that you can process and analyze it in R and then output your results back into Statistica. So, I fed my dataset of 82,000 rows and 264 columns into R, and used some code I had written previously to calculate 5-year giving slopes for each row in the dataset and to output a new worksheet with the newly calculated slopes column. Although the code is pretty simple, the entire process seemed to take about 5 minutes, which was unbearably slow! It’s a pretty important part of my analysis, so going without it isn’t an option.
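The code itself isn't shown in this post, so the following is only a minimal sketch of what a per-row slope calculation might look like in R, not the code that was actually run. The data frame `giving`, its column names, and the `years` vector are all hypothetical, and the sketch assumes every person has a giving value for each of the five years. It shows two ways to get the same slopes: fitting `lm()` once per row, and a closed-form matrix version that skips the per-row model fitting.

```r
# Hypothetical example data: one row per donor, one column per year of giving.
giving <- data.frame(
  y2008 = c(100, 50,  0),
  y2009 = c(120, 50, 10),
  y2010 = c(140, 50, 20),
  y2011 = c(160, 50, 30),
  y2012 = c(180, 50, 40)
)
years <- 2008:2012          # assumed: the same five years for every donor
giving_mat <- as.matrix(giving)

# Approach 1: fit a simple regression for each row and keep the slope.
slope_lm <- apply(giving_mat, 1, function(y) coef(lm(y ~ years))[[2]])

# Approach 2: closed-form least-squares slope for all rows at once.
# Because the centered years sum to zero, the numerator of the slope
# is just each row of giving values multiplied by the centered years.
x_centered <- years - mean(years)
slope_fast <- as.vector(giving_mat %*% x_centered) / sum(x_centered^2)

# Attach the result as a new column, as one would before writing the
# data back out.
giving$slope <- slope_fast
```

On a dataset with tens of thousands of rows, the second approach would likely be noticeably quicker than calling `lm()` per row, though it says nothing about how much of the 5 minutes went to moving the data between Statistica and R in the first place.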

I sent an email to one of the Statistica support guys, so hopefully they have a way of doing this kind of data processing natively, instead of having to wait all that time for the data to be processed through R.
