**Adventures in Statistical Computing**, and kindly contributed to R-bloggers)

So it’s been a week since I started down this path. I worked most of this out over last weekend, went to a conference, had a hectic week at work, and then realized I had lost my work. Gah.

I’ll be posting my general thoughts on R later. Mostly it seems to be a neat language. Lots of ways to do things. The ability to create output seems limited. I played with a number of things trying to create rich HTML output like I did with SAS. R2HTML might be what I need; I couldn’t get it to work.

So here is what I have:

require(fImport)

require(PerformanceAnalytics)

These two packages seem to do a lot of what I need. PerformanceAnalytics has a wealth of charting tools for financial data.

#Function to load stock data into a Time Series object

importSeries = function (symbol,from,to) {

  #Read data from Yahoo! Finance

  input = yahooSeries(symbol,from=from,to=to)

  #Character Strings for Column Names

  adjClose = paste(symbol,".Adj.Close",sep="")

  inputReturn = paste(symbol,".Return",sep="")

  CReturn = paste(symbol,".CReturn",sep="")

  #Calculate the Returns and put it on the time series

  input.Return = returns(input[,adjClose])

  colnames(input.Return)[1] = inputReturn

  input = merge(input,input.Return)

  #Calculate the cumulative return and put it on the time series

  input.first = input[,adjClose][1]

  input.CReturn = fapply(input[,adjClose],FUN=function(x) log(x) - log(input.first))

  colnames(input.CReturn)[1] = CReturn

  input = merge(input,input.CReturn)

  #Deleting things (not sure I need to do this, but I can’t not delete things if

  #given a way to…

  rm(input.first,input.Return,input.CReturn,adjClose,inputReturn,CReturn)

  #Return the timeseries

  return(input)

}

I learned a lot about data handling in R putting this function together.
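The two return columns that importSeries() adds can be checked by hand on a toy price vector. This is only a sketch in base R: it assumes the returns are log (continuously compounded) returns, the prices are made up, and none of the variable names come from fImport.

```r
# Made-up adjusted closing prices, for illustration only
price <- c(100, 102, 101, 105)

# Daily log return: log(P_t) - log(P_{t-1}); the first day has no prior price
logReturn <- c(NA, diff(log(price)))

# Cumulative log return relative to the first price
cReturn <- log(price) - log(price[1])

# The cumulative return is the running sum of the daily log returns
sameThing <- isTRUE(all.equal(cReturn[-1], cumsum(logReturn[-1])))

print(data.frame(price, logReturn, cReturn))
print(sameThing)
```

Under that assumption, the cumulative column is just the running total of the daily column, which is a handy sanity check on the merged series.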

#Load SPY data

spy = importSeries("spy",from="2010-01-01",to="2011-10-22")

#Load Google data

goog = importSeries("goog",from="2010-01-01",to="2011-10-22")

#merge the time series

merged = merge(spy,goog)

Nothing fancy here. The merge() function is nice, but I have no idea how to do anything other than the “full” join that it defaults to. If anyone knows of a good tutorial on more advanced SQL-style joins, please let me know.
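For plain data frames, base R's merge() covers the usual SQL joins through its all, all.x, and all.y arguments. The sketch below uses two made-up data frames rather than the time series objects above, so it says nothing about which arguments the time series version of merge() accepts.

```r
# Two small data frames keyed by date (made-up values, illustration only)
spy_df <- data.frame(
  date = as.Date(c("2011-10-17", "2011-10-18", "2011-10-19")),
  spy.Return = c(0.010, -0.020, 0.005)
)
goog_df <- data.frame(
  date = as.Date(c("2011-10-18", "2011-10-19", "2011-10-20")),
  goog.Return = c(0.015, -0.010, 0.020)
)

inner <- merge(spy_df, goog_df, by = "date")                # inner join: dates in both
left  <- merge(spy_df, goog_df, by = "date", all.x = TRUE)  # left join: every spy_df date
right <- merge(spy_df, goog_df, by = "date", all.y = TRUE)  # right join: every goog_df date
full  <- merge(spy_df, goog_df, by = "date", all = TRUE)    # full outer join

print(inner)
print(full)
```

Converting a time series to a data frame with an explicit date column is one way to get at these join types, at the cost of leaving the time series class.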

#Chart the Cumulative Returns

png("c:\\temp\\Returns_r.png")

chart.CumReturns(merged[,c("spy.Return","goog.Return"),drop=FALSE],

  main="Total Returns SPY vs Google",

  legend.loc="topleft")

dev.off()

#Create the Correlation plot

png("c:\\temp\\Corr.png")

chart.Correlation(merged[,c("spy.Return","goog.Return")],histogram=TRUE,pch="+")

dev.off()

First, chart.CumReturns() produces a nice graph. Better than I was able to do with plot().

Second, chart.Correlation() also gives neat output. I would really like to find a comparable method to produce the alpha ellipses that I did in SAS.

Third, I cannot find a good method that is comparable to PROC CORR. Can I get good output with correlation, covariance, mean, std, etc. all together? Please let me know.
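One way to approximate the PROC CORR listing is to assemble the pieces from base R: per-column simple statistics, then cor(), cov(), and cor.test(). This is a rough sketch on simulated returns (the numbers are made up and the variable names are only for illustration), not a drop-in equivalent of the SAS procedure.

```r
set.seed(42)  # simulated returns, for illustration only
n <- 250
spy.Return  <- rnorm(n, mean = 0, sd = 0.01)
goog.Return <- 0.9 * spy.Return + rnorm(n, mean = 0, sd = 0.01)
rets <- data.frame(spy.Return, goog.Return)

# Simple statistics, one row per variable
simple_stats <- data.frame(
  n    = sapply(rets, function(x) sum(!is.na(x))),
  mean = sapply(rets, mean, na.rm = TRUE),
  sd   = sapply(rets, sd, na.rm = TRUE),
  min  = sapply(rets, min, na.rm = TRUE),
  max  = sapply(rets, max, na.rm = TRUE)
)

# Correlation and covariance matrices, skipping missing pairs
cor_mat <- cor(rets, use = "pairwise.complete.obs")
cov_mat <- cov(rets, use = "pairwise.complete.obs")

# Pearson correlation test, which also reports a p-value
ct <- cor.test(rets$spy.Return, rets$goog.Return)

print(simple_stats)
print(cor_mat)
print(cov_mat)
print(ct$p.value)
```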

#Regress Google on SPY

reg = lm(merged[,"goog.Return"]~merged[,"spy.Return"])

#Create the confidence interval

newx = merged[,"spy.Return"]

prd = predict(reg,newdata=newx,interval="confidence",level=.95, type="response")

#Print the Regression Summary

summary(reg)

Linear regression seems pretty easy. It took me a while to decipher the R help to figure out the confidence interval stuff. Again, if there is a way to produce a rich set of output from a regression like SAS and PROC REG, please show me.
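Some of what PROC REG prints can be pulled out of a fitted lm object piece by piece. The sketch below uses simulated returns in a plain data frame (made-up numbers, illustrative names) and names the columns in the formula, which is what lets predict() line up the columns of newdata with the model terms.

```r
set.seed(1)  # simulated returns, for illustration only
spy.Return  <- rnorm(200, mean = 0, sd = 0.01)
goog.Return <- spy.Return + rnorm(200, mean = 0, sd = 0.01)
rets <- data.frame(spy.Return, goog.Return)

# Columns are named in the formula instead of being indexed inside it
fit <- lm(goog.Return ~ spy.Return, data = rets)

coefs <- summary(fit)$coefficients   # estimates, std. errors, t values, p-values
ci    <- confint(fit, level = 0.95)  # confidence intervals for the coefficients
aov_t <- anova(fit)                  # analysis-of-variance table

# Confidence interval for the fitted mean at chosen x values
newx <- data.frame(spy.Return = c(-0.02, 0, 0.02))
prd  <- predict(fit, newdata = newx, interval = "confidence", level = 0.95)

print(coefs)
print(ci)
print(aov_t)
print(prd)
```

The prd matrix has columns fit, lwr, and upr, with one row per row of newx.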

Here is the R output:

Call:
lm(formula = merged[, "goog.Return"] ~ merged[, "spy.Return"])

Residuals:
      Min        1Q    Median        3Q       Max
-0.089348 -0.005702 -0.000083  0.005513  0.116929

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)            -0.0003841  0.0006424  -0.598     0.55
merged[, "spy.Return"]  0.9641218  0.0509346  18.929   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.0137 on 453 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared: 0.4416, Adjusted R-squared: 0.4404
F-statistic: 358.3 on 1 and 453 DF, p-value: < 2.2e-16

This matches SAS. Not exactly, but very closely. That’s good.

#Chart the regression

png("c:\\temp\\Regression.png")

chart.Regression(merged[,"goog.Return",drop=FALSE],

  merged[,"spy.Return",drop=FALSE],

  fit=c("linear"),

  main="Google ~ SPY",

  xlab="SPY Return",

  ylab="Google Return")

#add the confidence interval

lines(newx$spy.Return,prd[,2],col="Red",lty=2)

lines(newx$spy.Return,prd[,3],col="Red",lty=2)

dev.off()

This uses chart.Regression() from PerformanceAnalytics. The fit interval looks suspect. Maybe I did something wrong.
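One possible cause, and this is a guess rather than a confirmed diagnosis: lines() connects points in the order they are given, so interval bounds computed at x values in date order get drawn as a zigzag. The sketch below uses simulated returns and base graphics (not chart.Regression()) and evaluates the interval on a sorted grid of x values instead. The file name is hypothetical, and pdf() is used only because it needs no extra system libraries; png() works the same way.

```r
set.seed(2)  # simulated returns, for illustration only
rets <- data.frame(spy.Return = rnorm(200, mean = 0, sd = 0.01))
rets$goog.Return <- 0.95 * rets$spy.Return + rnorm(200, mean = 0, sd = 0.01)

fit <- lm(goog.Return ~ spy.Return, data = rets)

# Evaluate the interval on a sorted grid so each bound is one smooth curve
grid <- data.frame(
  spy.Return = seq(min(rets$spy.Return), max(rets$spy.Return), length.out = 50)
)
band <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)

out_file <- file.path(tempdir(), "Regression_sketch.pdf")  # hypothetical file name
pdf(out_file)
plot(rets$spy.Return, rets$goog.Return,
     main = "Google ~ SPY (simulated)",
     xlab = "SPY Return", ylab = "Google Return")
abline(fit)                                                  # fitted line
lines(grid$spy.Return, band[, "lwr"], col = "red", lty = 2)  # lower bound
lines(grid$spy.Return, band[, "upr"], col = "red", lty = 2)  # upper bound
dev.off()
```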
