# Learning R: Project 1, Part 2

October 30, 2011
So it's been a week since I started down this path.  I worked most of this out over last weekend, went to a conference, had a hectic week at work, and then realized I had lost my work.  Gah.

I'll be posting my general thoughts on R later.  Mostly it seems to be a neat language, with lots of ways to do things.  The ability to create output seems limited, though.  I played with a number of things trying to create rich HTML output like I did with SAS.  R2HTML might be what I need, but I couldn't get it to work.

So here is what I have:

require(fImport)
require(PerformanceAnalytics)

These two packages seem to do a lot of what I need. PerformanceAnalytics has a wealth of charting tools for financial data.

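#Function to load stock data into a Time Series object
importSeries = function (symbol,from,to) {

#Read the data from Yahoo! Finance
input = yahooSeries(symbol,from=from,to=to)

#Character Strings for Column Names
#(assumes yahooSeries() names the adjusted close column <symbol>.Adj.Close)
adjClose = paste(symbol,".Adj.Close",sep="")
inputReturn = paste(symbol,".Return",sep="")
CReturn = paste(symbol,".CReturn",sep="")

#Calculate the Returns and put it on the time series
#(returns() defaults to continuous, i.e. log, returns)
input.Return = returns(input[,adjClose])
colnames(input.Return)[1] = inputReturn
input = merge(input,input.Return)

#Calculate the cumulative return and put it on the time series
#(log price relative to the first adjusted close in the series)
input.first = as.numeric(input[1,adjClose])
input.CReturn = fapply(input[,adjClose],FUN=function(x) log(x) - log(input.first))
colnames(input.CReturn)[1] = CReturn
input = merge(input,input.CReturn)

#Deleting things (not sure I need to do this, but I can't not delete things if
# given a way to...)
rm(input.Return,input.CReturn,input.first)

#Return the timeseries
return(input)

}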
I learned a lot about data handling in R putting this function together.
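
To make the two calculations inside the function concrete, here is a minimal base R sketch on a made-up price vector. The prices are invented for illustration, and it uses plain numeric vectors rather than time series objects. It shows the log return between consecutive prices and the cumulative log return relative to the first price, which is what the .Return and .CReturn columns are meant to hold.

```r
# Hypothetical adjusted closing prices, oldest first (made-up numbers)
prices <- c(100, 102, 101, 105)

# Log (continuously compounded) return between consecutive prices;
# the first observation has no prior price, so it is NA
log_return <- c(NA, diff(log(prices)))

# Cumulative log return relative to the first price
cum_return <- log(prices) - log(prices[1])

# The same quantity obtained as the running sum of the log returns
running_sum <- cumsum(c(0, diff(log(prices))))

print(data.frame(prices, log_return, cum_return, running_sum))
```

The last two columns agree, which is a handy sanity check on the cumulative return column.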

spy = importSeries("spy",from="2010-01-01",to="2011-10-22")
goog = importSeries("goog",from="2010-01-01",to="2011-10-22")

#merge the time series
merged = merge(spy,goog)
Nothing fancy here.  The merge() function is nice, but I have no idea how to do anything but the "full" join that it defaults to here.  If anyone knows of a good tutorial on doing more advanced SQL-style joins, please let me know.
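
On the SQL-style join question, here is a small sketch using base R's merge() on ordinary data frames, where the all, all.x and all.y arguments select the join type. The two tables are made up for illustration. This is the data frame method only; whether the time series merge() used above accepts the same arguments is not something this sketch assumes.

```r
# Two small made-up tables keyed by date (hypothetical data)
a <- data.frame(date = c("2011-10-17", "2011-10-18", "2011-10-19"),
                spy  = c(0.01, -0.02, 0.005))
b <- data.frame(date = c("2011-10-18", "2011-10-19", "2011-10-20"),
                goog = c(-0.01, 0.02, 0.03))

inner <- merge(a, b, by = "date")                # INNER JOIN: dates in both tables
left  <- merge(a, b, by = "date", all.x = TRUE)  # LEFT JOIN: every row of a
right <- merge(a, b, by = "date", all.y = TRUE)  # RIGHT JOIN: every row of b
full  <- merge(a, b, by = "date", all = TRUE)    # FULL OUTER JOIN: every date

print(inner)
print(left)
print(right)
print(full)
```

Rows without a match on the other side get NA in the missing columns, much like NULLs in an SQL outer join.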

#Chart the Cumulative Returns
png("c:\\temp\\Returns_r.png")
chart.CumReturns(merged[,c("spy.Return","goog.Return"),drop=FALSE],
legend.loc="topleft")
dev.off()

#Create the Correlation plot
png("c:\\temp\\Corr.png")
chart.Correlation(merged[,c("spy.Return","goog.Return")],histogram=TRUE,pch="+")
dev.off()
First, chart.CumReturns() produces a nice graph, better than I was able to do with plot().

Second, chart.Correlation() also gives a neat output. I would really like to find a comparable method to produce the alpha ellipses that I did in SAS.

Third, I cannot find a good method that is comparable to PROC CORR. Can I get a single good output with correlation, covariance, mean, standard deviation, etc.? Please, let me know.
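
As a partial answer, here is a rough sketch that assembles PROC CORR-style pieces from base R functions: simple statistics per variable, the covariance and correlation matrices, and a significance test for one pair. The data are simulated stand-ins for the two return columns, and the object names are made up for illustration. It is not a drop-in replacement for PROC CORR.

```r
# Simulated daily returns standing in for spy.Return and goog.Return
set.seed(42)
spy  <- rnorm(250, mean = 0, sd = 0.01)
goog <- 0.9 * spy + rnorm(250, mean = 0, sd = 0.01)
rets <- data.frame(spy = spy, goog = goog)

# Simple statistics, one row per variable
simple_stats <- data.frame(
  N       = sapply(rets, function(x) sum(!is.na(x))),
  Mean    = sapply(rets, mean, na.rm = TRUE),
  StdDev  = sapply(rets, sd, na.rm = TRUE),
  Minimum = sapply(rets, min, na.rm = TRUE),
  Maximum = sapply(rets, max, na.rm = TRUE)
)

# Covariance and correlation matrices, dropping rows with missing values
cov_matrix <- cov(rets, use = "complete.obs")
cor_matrix <- cor(rets, use = "complete.obs")

# Test of zero correlation for one pair, with a confidence interval
pair_test <- cor.test(rets$spy, rets$goog)

print(simple_stats)
print(cov_matrix)
print(cor_matrix)
print(pair_test)
```

The use = "complete.obs" argument matters with real return series, because the first return of each series is missing.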
reg = lm(merged[,"goog.Return"]~merged[,"spy.Return"])

#Create the confidence interval
newx = merged[,"spy.Return"]
prd = predict(reg,newdata=newx,interval="confidence",level=.95, type="response")

#Print the Regression Summary
summary(reg)
Linear regression seems pretty easy. It took me a while to decipher the R help to figure out the confidence interval stuff. Again, if there is a way to produce a rich set of output from a regression, like SAS does with PROC REG, please show me.
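
Some of what PROC REG prints can be pulled out of a fitted lm object piece by piece. The sketch below uses simulated data and made-up object names, and covers only the coefficient table, coefficient confidence limits and the ANOVA table. It is an illustration under those assumptions, not a full PROC REG equivalent.

```r
# Simulated returns standing in for the merged spy/goog series
set.seed(1)
spy  <- rnorm(250, sd = 0.01)
goog <- 0.9 * spy + rnorm(250, sd = 0.01)
rets <- data.frame(spy = spy, goog = goog)

fit <- lm(goog ~ spy, data = rets)

# Estimates, standard errors, t values and p-values as a matrix
coef_table <- coef(summary(fit))

# 95% confidence limits for the intercept and slope
coef_ci <- confint(fit, level = 0.95)

# Analysis of variance table (model and residual sums of squares, F test)
anova_tab <- anova(fit)

print(coef_table)
print(coef_ci)
print(anova_tab)
```

Calling plot(fit) afterwards draws the standard residual diagnostic plots.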

Here is the R output from summary(reg):
Call:
lm(formula = merged[, "goog.Return"] ~ merged[, "spy.Return"])

Residuals:
      Min        1Q    Median        3Q       Max
-0.089348 -0.005702 -0.000083  0.005513  0.116929

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)            -0.0003841  0.0006424  -0.598     0.55
merged[, "spy.Return"]  0.9641218  0.0509346  18.929   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.0137 on 453 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared: 0.4416, Adjusted R-squared: 0.4404
F-statistic: 358.3 on 1 and 453 DF, p-value: < 2.2e-16
This matches the SAS results. The numbers are not identical, but they are very close.  That's good.
#Chart the regression
png("c:\\temp\\Regression.png")
chart.Regression(merged[,"goog.Return",drop=FALSE],
merged[,"spy.Return",drop=FALSE],
fit=c("linear"),
xlab="SPY Return")

#Add the 95% confidence bounds to the chart
lines(newx$spy.Return,prd[,2],col="Red",lty=2)
lines(newx$spy.Return,prd[,3],col="Red",lty=2)
dev.off()
This uses chart.Regression() from PerformanceAnalytics. The fit interval looks suspect. Maybe I did something wrong.
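
Why the interval looked off can't be determined from the post alone, so the following is a guess rather than a diagnosis. Three things are worth checking. predict() matches newdata by the variable names in the formula, so a formula written with plain column names and a data argument is easier to predict from. lines() connects points in the order given, so the x values should be sorted. And a confidence interval for the mean is much narrower than a prediction interval for individual observations. The sketch below illustrates all three on simulated data, with made-up object names and a PDF written to a temporary directory.

```r
# Simulated returns standing in for the merged spy/goog series
set.seed(7)
spy  <- rnorm(250, sd = 0.01)
goog <- 0.9 * spy + rnorm(250, sd = 0.01)
rets <- data.frame(spy = spy, goog = goog)

# Fit with named columns so predict() can match the columns of newdata
fit <- lm(goog ~ spy, data = rets)

# Evenly spaced, sorted grid of x values to predict over
grid <- data.frame(spy = seq(min(rets$spy), max(rets$spy), length.out = 50))

# Band for the fitted mean versus band for individual new observations
conf_band <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)
pred_band <- predict(fit, newdata = grid, interval = "prediction", level = 0.95)

# Draw the scatter, the fitted line and both bands
out_file <- file.path(tempdir(), "regression_bands.pdf")
pdf(out_file)
plot(rets$spy, rets$goog, pch = "+", xlab = "SPY Return", ylab = "GOOG Return")
abline(fit)
lines(grid$spy, conf_band[, "lwr"], col = "red", lty = 2)
lines(grid$spy, conf_band[, "upr"], col = "red", lty = 2)
lines(grid$spy, pred_band[, "lwr"], col = "blue", lty = 3)
lines(grid$spy, pred_band[, "upr"], col = "blue", lty = 3)
dev.off()
```

With a few hundred observations the red confidence band hugs the fitted line, while the blue prediction band covers most of the points. If the wider band is what was expected, interval="prediction" is the setting to try.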
