(This article was first published on **mickey mouse models**, and kindly contributed to R-bloggers)

I’ll begin with a familiar image:

That plot shows the closing values of the S&P 500 index from 1990 until today. It’s a useful representation: at a glance, you can tell when the market rose and fell. That said, it has a problem: we’re looking at absolute movements in the index, when what we really care about is relative changes. A 10-point move matters far more when the index is low than when it is high, and we don’t want the early 1990s to be dwarfed by the 2000s, so let’s take a look at daily logarithmic returns:

Much better! I find this plot extremely interesting. Each point shows the logarithmic returns from holding the index close-to-close until the next trading day. The return volatility appears to vary dramatically: the mid-1990s and mid-2000s were calm; 2008 was extremely turbulent. I didn’t see that in the first plot, but now it’s quite clear.
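As a quick illustration of what a logarithmic return is, here is a minimal sketch on made-up prices. The prices and the names `close`, `simple.ret` and `log.ret` are assumptions for this example only; they are not S&P 500 data.

```r
# Hypothetical closing prices (illustration only, not real index data):
# a 10% rise followed by a 10% fall
close <- c(100, 110, 99)

# Simple returns: (p1 - p0) / p0
simple.ret <- diff(close) / head(close, -1)

# Logarithmic returns: r = ln(p1) - ln(p0)
log.ret <- diff(log(close))

# The two are linked by r = ln(1 + simple return)
print(all.equal(log.ret, log1p(simple.ret)))

# Log returns add across days: their sum is the log return
# over the whole holding period
print(all.equal(sum(log.ret), log(close[3] / close[1])))
```

The additivity in the last line is what makes a running mean of log returns easy to interpret: the mean over a window is the whole-window log return divided by the number of days.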

Let’s continue with a plot of the 200-day running mean of logarithmic returns. If you bought the index on a given day, what average logarithmic returns would you have seen across the next 200 trading days? Look below:

Are the S&P 500 returns i.i.d.? At a glance, I would say no, since the mean and variance both appear to change across time. I’m sure this can be tested statistically, but I would much prefer to look at these plots!
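For readers who do want a statistical check, one common option is a Ljung-Box test on squared returns, which looks for the kind of volatility clustering visible in the plot. The sketch below is only an assumption about how such a check might look; it runs on simulated series rather than the S&P 500 data, and it addresses changing variance only, not a changing mean.

```r
set.seed(1)  # arbitrary seed so the simulation is reproducible

# Simulated daily returns (assumed for illustration, not S&P 500 data)
n <- 1000
iid.ret    <- rnorm(n, mean = 0, sd = 0.01)          # constant volatility
regime.ret <- c(rnorm(n / 2, mean = 0, sd = 0.005),  # calm period ...
                rnorm(n / 2, mean = 0, sd = 0.03))   # ... then a turbulent one

# Ljung-Box test on squared returns: if returns were i.i.d., squared returns
# would show no autocorrelation, so a small p-value is evidence against i.i.d.
iid.test    <- Box.test(iid.ret^2,    lag = 10, type = "Ljung-Box")
regime.test <- Box.test(regime.ret^2, lag = 10, type = "Ljung-Box")

print(iid.test$p.value)
print(regime.test$p.value)
```

The same call could be applied to a real return series after dropping any `NA` values.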

Here’s my code; suggestions and comments welcome.

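# Yahoo Finance URL for daily S&P 500 (^GSPC) data, 1950-01-03 to 2011-08-04
# (the a= and d= month parameters are zero-based).
# Note: this ichart endpoint may no longer be available, so the download may fail.
url.str <- sprintf("%s?s=^GSPC&d=7&e=4&f=2011&g=d&a=0&b=3&c=1950",
                   "http://ichart.finance.yahoo.com/table.csv")
# Stop with a clear message if the download fails, rather than carrying NA forward
df <- tryCatch(read.csv(url(url.str)),
               error = function(e) stop("Could not download data: ",
                                        conditionMessage(e)))
names(df) <- tolower(names(df))
df$date <- as.Date(df$date)
df <- df[order(df$date), ]  # sort oldest first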

# Plot prices, daily log returns & running mean of log returns since 1990
start.date <- "1990-01-01"
dev.new(width=12, height=6)
plot(subset(df, date >= start.date)[ , c("date", "close")], type="l",
     main="S&P 500", xlab="", col="tomato")
mtext(sprintf("Closing prices since %s", start.date))
mtext(sprintf("Created on %s", Sys.Date()), side=1, line=3, cex=0.60)
savePlot("s&p_500_prices.png")

# Logarithmic returns: p0 * e^r = p1 ; r = ln(p1) - ln(p0)
# Forward-looking: each row holds the return from that day's close to the
# next trading day's close, so the last row is NA
df$return <- c(diff(log(df$close)), NA)
dev.new(width=12, height=6)
plot(subset(df, date >= start.date)[ , c("date", "return")], type="p",
     main="S&P 500", xlab="", col=rgb(0, 75, 0, 75, maxColorValue=255))
mtext(sprintf("Logarithmic close-to-close returns since %s", start.date))
mtext(sprintf("Created on %s", Sys.Date()), side=1, line=3, cex=0.6)
savePlot("s&p_500_daily_returns.png")

# Running average of logarithmic returns over the next `lag` trading days
lag <- 200
# Differencing the cumulative sum at distance `lag` gives each window's sum;
# multiplying by 1 / lag gives the mean. Pad with NA where no full window remains.
df$mean.return <- c(diff(c(0, cumsum(df$return)), lag=lag),
                    rep(NA, (lag - 1))) * (1 / lag)
dev.new(width=12, height=6)
plot(subset(df, date >= start.date)[ , c("date", "mean.return")],
     type="l", main="S&P 500", xlab="", col="darkred",
     ylab=sprintf("%s-day mean return", lag))
mtext(sprintf("%s-day mean of logarithmic returns since %s",
              lag, start.date))
mtext(sprintf("Created on %s", Sys.Date()), side=1, line=3, cex=0.6)
abline(h=0, lty=2)  # horizontal reference line at zero
savePlot(sprintf("s&p_500_%s_day_mean_returns.png", lag))
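The running-mean calculation above relies on a cumulative-sum trick that can be hard to read at first. The following is a minimal sketch of the same idea on a toy vector, checked against a direct `mean()` call. The helper name `forward.mean` and the toy values are hypothetical and not part of the original code.

```r
# Hypothetical helper: mean of x[i], ..., x[i + window - 1] for each i,
# with NA where fewer than `window` values remain
forward.mean <- function(x, window) {
  stopifnot(window >= 1, window <= length(x))
  # The sum over a window is a difference of cumulative sums
  window.sums <- diff(c(0, cumsum(x)), lag = window)
  c(window.sums / window, rep(NA, window - 1))
}

# Toy daily log returns (illustration only)
x <- c(0.01, -0.02, 0.03, 0.00, 0.01)
running <- forward.mean(x, window = 3)
print(running)

# Compare the first window with a direct calculation
print(all.equal(running[1], mean(x[1:3])))
```

Using `cumsum` means each window costs one subtraction instead of a full re-summation, which is convenient for long daily series.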
