Learning R: Project 1, Part 2
[This article was first published on Adventures in Statistical Computing, and kindly contributed to R-bloggers].
So it’s been a week since I started down this path. I worked most of this out over last weekend, went to a conference, had a hectic week at work, and then realized I had lost my work. Gah.
I’ll be posting my general thoughts on R later. Mostly it seems to be a neat language. Lots of ways to do things. The ability to create output seems limited. I played with a number of things trying to create rich HTML output like I did with SAS. R2HTML might be what I need; I couldn’t get it to work.
So here is what I have:
require(fImport)
require(PerformanceAnalytics)
These two packages seem to do a lot of what I need. PerformanceAnalytics has a wealth of charting tools for financial data.
I learned a lot about data handling in R putting this function together.
#Function to load stock data into a Time Series object
importSeries = function (symbol,from,to) {
    #Read data from Yahoo! Finance
    input = yahooSeries(symbol,from=from,to=to)
    #Character Strings for Column Names
    adjClose = paste(symbol,".Adj.Close",sep="")
    inputReturn = paste(symbol,".Return",sep="")
    CReturn = paste(symbol,".CReturn",sep="")
    #Calculate the Returns and put it on the time series
    input.Return = returns(input[,adjClose])
    colnames(input.Return)[1] = inputReturn
    input = merge(input,input.Return)
    #Calculate the cumulative return and put it on the time series
    input.first = input[,adjClose][1]
    input.CReturn = fapply(input[,adjClose],FUN=function(x) log(x) - log(input.first))
    colnames(input.CReturn)[1] = CReturn
    input = merge(input,input.CReturn)
    #Deleting things (not sure I need to do this, but I can't not delete things if
    # given a way to…
    rm(input.first,input.Return,input.CReturn,adjClose,inputReturn,CReturn)
    #Return the timeseries
    return(input)
}
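The two derived columns can be checked on a toy vector. This is a minimal base-R sketch, not the timeSeries code above: the prices are made up, and it assumes the returns are continuously compounded (log) returns, which is what the log(x) - log(first) formula for the cumulative return implies.

```r
# Made-up adjusted closing prices, for illustration only
adj_close <- c(100, 102, 101, 105, 110)

# Log returns; the first one is undefined because there is no prior price
log_ret <- c(NA, diff(log(adj_close)))

# Cumulative log return relative to the first price (same formula as above)
cum_ret <- log(adj_close) - log(adj_close[1])

# The same series as a running sum of the log returns (NA treated as 0)
cum_ret_alt <- cumsum(ifelse(is.na(log_ret), 0, log_ret))

print(all.equal(cum_ret, cum_ret_alt))
```

The leading NA in the return column is probably also why the regression later reports one observation deleted due to missingness.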
#Load SPY data
spy = importSeries("spy",from="2010-01-01",to="2011-10-22")
#Load Google data
goog = importSeries("goog",from="2010-01-01",to="2011-10-22")
#merge the time series
merged = merge(spy,goog)
Nothing fancy here. The merge() function is nice, but I have no idea how to do anything but the “full” join that it defaults to. If anyone knows of a good tutorial on doing more advanced SQL-style joins, please let me know.
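On the join question: for plain data frames, base R's merge() covers the usual SQL joins through its all, all.x and all.y arguments. The sketch below uses two small made-up tables (spy_df and goog_df are hypothetical names). Whether the timeSeries method of merge() accepts the same arguments is not something this example shows, so converting to a data frame first may be needed.

```r
# Two small made-up tables keyed by date, for illustration only
spy_df  <- data.frame(date = c("2011-10-18", "2011-10-19", "2011-10-20"),
                      spy.Return = c(0.010, -0.004, 0.002))
goog_df <- data.frame(date = c("2011-10-19", "2011-10-20", "2011-10-21"),
                      goog.Return = c(0.003, 0.007, -0.001))

inner <- merge(spy_df, goog_df, by = "date")                # INNER JOIN
left  <- merge(spy_df, goog_df, by = "date", all.x = TRUE)  # LEFT JOIN
right <- merge(spy_df, goog_df, by = "date", all.y = TRUE)  # RIGHT JOIN
full  <- merge(spy_df, goog_df, by = "date", all = TRUE)    # FULL OUTER JOIN

print(inner)
print(left)
print(right)
print(full)
```

Rows without a match on the other side get NA in the missing columns, much like NULLs in SQL.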
#Chart the Cumulative Returns
png("c:\\temp\\Returns_r.png")
chart.CumReturns(merged[,c("spy.Return","goog.Return"),drop=FALSE],
    main="Total Returns SPY vs Google",
    legend.loc="topleft")
dev.off()
#Create the Correlation plot
png("c:\\temp\\Corr.png")
chart.Correlation(merged[,c("spy.Return","goog.Return")],histogram=TRUE,pch="+")
dev.off()
First, the chart.CumReturns() produces a nice graph. Better than I was able to do with plot().
Second, chart.Correlation() also gives neat output. I would really like to find a comparable method to produce the alpha ellipses that I did in SAS.
Third, I cannot find a good method that is comparable to PROC CORR. Can I get good output with correlation, covariance, mean, std, etc. all together? Please let me know.
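One way to get close to PROC CORR is to assemble the pieces from base R yourself. The sketch below is only that, a sketch: the returns are simulated rather than real market data, and the layout is a guess at what is useful, not a reproduction of the SAS report.

```r
set.seed(42)  # simulated returns, not real market data
spy_ret  <- rnorm(250, mean = 0.0004, sd = 0.012)
goog_ret <- 0.95 * spy_ret + rnorm(250, sd = 0.014)
rets <- data.frame(spy.Return = spy_ret, goog.Return = goog_ret)

# Simple statistics, one row per variable
simple_stats <- data.frame(
  N    = sapply(rets, function(x) sum(!is.na(x))),
  Mean = sapply(rets, mean, na.rm = TRUE),
  Std  = sapply(rets, sd, na.rm = TRUE),
  Min  = sapply(rets, min, na.rm = TRUE),
  Max  = sapply(rets, max, na.rm = TRUE)
)

# Covariance and correlation matrices, dropping incomplete rows
cov_mat <- cov(rets, use = "complete.obs")
cor_mat <- cor(rets, use = "complete.obs")

# p-value for the test that the correlation is zero
cor_p <- cor.test(rets$spy.Return, rets$goog.Return)$p.value

print(simple_stats)
print(cov_mat)
print(cor_mat)
print(cor_p)
```

Wrapping these calls in a small function would give one call per report, at the cost of maintaining it yourself.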
Linear regression seems pretty easy. It took me a while to decipher the R help to figure out the confidence interval stuff. Again, if there is a way to produce a rich set of output from a regression like SAS and PROC REG, please show me.
#Regress Google on SPY
reg = lm(merged[,"goog.Return"]~merged[,"spy.Return"])
#Create the confidence interval
newx = merged[,"spy.Return"]
prd = predict(reg,newdata=newx,interval="confidence",level=.95, type="response")
#Print the Regression Summary
summary(reg)
Here is the R output. It matches SAS. It’s not exact, but very close. That’s good.
Call:
lm(formula = merged[, "goog.Return"] ~ merged[, "spy.Return"])

Residuals:
      Min        1Q    Median        3Q       Max
-0.089348 -0.005702 -0.000083  0.005513  0.116929

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)            -0.0003841  0.0006424  -0.598     0.55
merged[, "spy.Return"]  0.9641218  0.0509346  18.929   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.0137 on 453 degrees of freedom
  (1 observation deleted due to missingness)
Multiple R-squared: 0.4416, Adjusted R-squared: 0.4404
F-statistic: 358.3 on 1 and 453 DF, p-value: < 2.2e-16
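On the confidence interval step, one thing that may simplify matters is fitting the model with named columns and a data= argument. As far as I can tell, predict() looks up the predictor by name in newdata, which it cannot do when the formula is written as merged[, "spy.Return"]. The sketch below uses simulated returns in a plain data frame, so the names rets, fit and new_spy are illustrative only.

```r
set.seed(1)  # simulated returns, not real market data
rets <- data.frame(spy.Return = rnorm(200, sd = 0.012))
rets$goog.Return <- 0.96 * rets$spy.Return + rnorm(200, sd = 0.014)

# Column names in the formula plus data= let predict() match newdata by name
fit <- lm(goog.Return ~ spy.Return, data = rets)

# 95% confidence intervals for the intercept and slope
ci_coef <- confint(fit, level = 0.95)

# 95% confidence interval for the fitted mean at chosen SPY returns
new_spy <- data.frame(spy.Return = c(-0.02, 0, 0.02))
ci_mean <- predict(fit, newdata = new_spy,
                   interval = "confidence", level = 0.95)

print(ci_coef)
print(ci_mean)
```

predict() returns a matrix with columns fit, lwr and upr, one row per row of newdata.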
#Chart the regression
png("c:\\temp\\Regression.png")
chart.Regression(merged[,"goog.Return",drop=FALSE],
    merged[,"spy.Return",drop=FALSE],
    fit=c("linear"),
    main="Google ~ SPY",
    xlab="SPY Return",
    ylab="Google Return")
#add the confidence interval
lines(newx$spy.Return,prd[,2],col="Red",lty=2)
lines(newx$spy.Return,prd[,3],col="Red",lty=2)
dev.off()
Using the chart.Regression() from PerformanceAnalytics. The fit interval looks suspect. Maybe I did something wrong.
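One possible reason the interval looks odd, which can't be confirmed without the original data: lines() joins points in the order they are given, and daily returns arrive in date order, not sorted by SPY return. Predicting on an evenly spaced, increasing grid avoids that. This sketch uses simulated data and base plot() instead of chart.Regression(), and writes a PDF to a temporary file (a hypothetical location) where the code above uses png().

```r
set.seed(1)  # simulated returns, not real market data
rets <- data.frame(spy.Return = rnorm(200, sd = 0.012))
rets$goog.Return <- 0.96 * rets$spy.Return + rnorm(200, sd = 0.014)
fit <- lm(goog.Return ~ spy.Return, data = rets)

# Evenly spaced, increasing x values so lines() traces the band left to right
grid <- data.frame(spy.Return = seq(min(rets$spy.Return),
                                    max(rets$spy.Return),
                                    length.out = 100))
band <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)

out_file <- tempfile(fileext = ".pdf")  # hypothetical output location
pdf(out_file)
plot(rets$spy.Return, rets$goog.Return,
     main = "Google ~ SPY", xlab = "SPY Return", ylab = "Google Return")
abline(fit)                                                   # fitted line
lines(grid$spy.Return, band[, "lwr"], col = "red", lty = 2)   # lower bound
lines(grid$spy.Return, band[, "upr"], col = "red", lty = 2)   # upper bound
dev.off()
```

The band is narrowest near the mean SPY return and widens toward the extremes, which is the expected shape for a confidence interval on the fitted mean.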