Context: I work with data from non-profit organizations, so a big concern in many of my analyses is whether, and by how much, people's donations change from one year to the next. One thing I normally like to do is compute a value for each person that represents how much their yearly donations are increasing or decreasing on average over 5 years (a simple slope from regressing their giving amounts on the years in which they gave). This was simple and quick to do in R for previous projects, so there was no hassle there. Now that we have Statistica in the office, my supervisor wants me to use it for our current project.
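To make the slope idea concrete, here is a minimal sketch of the calculation. It is written in Python purely for illustration and is not the R code I used. The function name `giving_slope`, the donor labels, and the giving amounts are all made up, and it assumes each person has exactly one giving value per year for 5 consecutive years.

```python
from statistics import mean


def giving_slope(amounts, years=None):
    """Ordinary least-squares slope of yearly giving regressed on year.

    amounts: giving totals, one per year, in chronological order.
    years:   matching year values; defaults to 1, 2, ..., len(amounts).
    """
    if years is None:
        years = range(1, len(amounts) + 1)
    years = list(years)
    if len(years) != len(amounts):
        raise ValueError("years and amounts must have the same length")

    year_mean = mean(years)
    amount_mean = mean(amounts)

    # slope = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean) ** 2)
    sxy = sum((x - year_mean) * (y - amount_mean)
              for x, y in zip(years, amounts))
    sxx = sum((x - year_mean) ** 2 for x in years)
    return sxy / sxx


# Hypothetical donors, each with 5 years of giving (illustrative values only).
donors = {
    "donor_a": [100, 120, 140, 160, 180],   # giving more each year
    "donor_b": [500, 400, 300, 200, 100],   # giving less each year
    "donor_c": [50, 50, 50, 50, 50],        # flat
}

# One slope per donor, i.e. one new value per row of the dataset.
slopes = {donor: giving_slope(gifts) for donor, gifts in donors.items()}
```

A positive slope means the person's giving is growing by roughly that amount per year, a negative slope means it is shrinking, and a slope near zero means it is flat.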
Problem: I looked for a way to do the above slope calculation in Statistica for each row of a dataset of roughly 82,000 rows, and could not find one.
Solution: As I mentioned in my last post, it’s possible to feed your Statistica dataset into R using the statconn DCOM server, so that you can process and analyze it in R and then send your results back to Statistica. So, I fed my dataset of 82,000 rows and 264 columns into R, and used some code I had written previously to calculate 5-year giving slopes for each row in the dataset and to output a new worksheet containing the newly calculated slopes column. Although the code is pretty simple, the entire process seemed to take about 5 minutes, which was unbearably slow! It’s a pretty important part of my analysis, so going without it isn’t an option.
I've emailed one of the Statistica support guys. Hopefully they have a way of doing this kind of data processing natively, so that I don't have to wait all that time for the data to be processed through R.