Ok, R is very well-considered in certain respects, but there are also some things that annoy me... This time it's scoping...

This post is the introduction to a series that will illustrate how to backtest the same strategy in Excel and R. The impetus for this series started with this tweet by Jared Woodard at Condor Options. After Soren Macbeth introduced us, Jare...

A simple challenge in Le Monde this week: find the group of four primes such that any sum of three terms in the group is prime and the overall sum is minimised. Here is a quick exploration by simulation, using the schoolmath package (with its imperfections): A=primes(start=1,end=53) lengthA=length(A) res=4*53 for (t in 1:10^4){ B=sample(A,4,prob=1/(1:lengthA)) sto=is.prim(sum(B))
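The same puzzle can also be checked exhaustively rather than by simulation. The following is a minimal sketch in base R, not the post's own code: it swaps the schoolmath functions for a hypothetical helper `is_prime()`, assumes the four primes are distinct, and keeps the post's upper bound of 53.

```r
# Hypothetical helper: trial-division primality test (base R only)
is_prime <- function(n) n > 1 && all(n %% seq_len(floor(sqrt(n)))[-1] != 0)

# Candidate primes up to 53, the bound used in the post
primes <- Filter(is_prime, 1:53)

# Every group of four distinct primes, one group per column
quads <- combn(primes, 4)

# sum(q) - q gives the four sums of three terms; all must be prime
ok <- apply(quads, 2, function(q) all(vapply(sum(q) - q, is_prime, logical(1))))
valid <- quads[, ok, drop = FALSE]

# Keep the valid group with the smallest overall sum
best <- valid[, which.min(colSums(valid))]
best        # 5 7 17 19
sum(best)   # 48
```

With only 1,820 groups to test, the exhaustive search finishes instantly, so the random sampling is not needed for a bound this small.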

For students planning to attend the annual worldwide R user conference, useR! 2011, travel grants are available to help defray the cost of attending the conference in the UK. CRISM is offering bursaries for accommodation and conference fees, and Revolu...

In his detailed research on the RSI(2) indicator, MarketSci emphasized several times that the contrarian strategies based on the RSI(2) indicator didn’t start working until the 80s. I remembered this observation recently when I observed another interesting anomaly … In statistics, an important initial step in studying time series data is to consider the autocorrelation
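As a rough illustration of that first step, the sketch below computes the sample autocorrelation of a simulated series with `acf()` from the stats package. The series is made-up AR(1) data with an assumed coefficient of 0.6, not market data from the post.

```r
set.seed(1)

# Simulated AR(1) series: x[t] = 0.6 * x[t-1] + e[t] (illustrative data only)
e <- rnorm(500)
x <- as.numeric(stats::filter(e, filter = 0.6, method = "recursive"))

# Sample autocorrelation at lags 0 to 5, without drawing the plot
ac <- acf(x, lag.max = 5, plot = FALSE)
rho <- as.numeric(ac$acf)

rho[1]  # lag 0, always 1
rho[2]  # lag 1, close to the assumed coefficient of 0.6
```

Dropping `plot = FALSE` draws the usual correlogram, which is often the quicker way to spot persistence or mean reversion in a return series.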

Had a mental block today trying to figure out how to get the indices of columns in a data frame given their names. Simple task but difficult to search Google for an answer. Thanks to jashapiro, Matt, and Vince for giving me a heads up on the which() fu...
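For reference, here is a minimal sketch of the lookup, using a made-up data frame `df` and a made-up vector of names `wanted`. It shows `which()` next to `match()`, which differ in the order of the result.

```r
# Made-up data frame for illustration
df <- data.frame(id = 1:3, height = c(1.6, 1.7, 1.8), weight = c(60, 70, 80))
wanted <- c("weight", "id")

# which(): indices in the order the columns appear in the data frame
idx_which <- which(names(df) %in% wanted)   # 1 3

# match(): indices in the order the names were requested
idx_match <- match(wanted, names(df))       # 3 1
```

`match()` returns `NA` for a name that is not a column, whereas the `which()` form silently drops it, so the choice depends on whether a missing name should be noticed.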

As we dig deeper into the Stata or R debate, a few questions have come up. Question 1: One of the things Stata does well is the way it constructs new variables (see example below). How to do this in R? We can rewrite it as-is using for loops in R...

I would like to thank Tal Galili for establishing and maintaining the blog aggregator at R-bloggers. This site has been added to their directory and new posts which are tagged with R will now appear on their feed. http://www.r-bloggers.com/ In part a, I presented a series of barplots which showed that the plurality of police

My friend Michael Bommarito has been doing the data community quite a service, capturing and sharing all of the traffic on Twitter related to the Iranian protests. Specifically, he has all of the tweets containing the #25bahman hashtag, and made them available for anyone to download. I am unable to resist the temptation to explore a

Something like this probably already exists in an R package somewhere out there, but I needed a function to summarize how much missing data I have in each variable of a data frame in R. Pass a data frame to this function and for each variable it'll give you the number of missing values, the total N, and the...
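A function along those lines might look like the sketch below. It is not the post's own code: the name `missing_summary()` and the column names are hypothetical, and the proportion column is an assumption about what the truncated description goes on to list.

```r
# Hypothetical helper: per-variable summary of missing values in a data frame
missing_summary <- function(df) {
  # Count NA values in each column
  n_missing <- unname(vapply(df, function(x) sum(is.na(x)), integer(1)))
  data.frame(
    variable     = names(df),
    n_missing    = n_missing,
    n_total      = nrow(df),
    prop_missing = n_missing / nrow(df),  # assumed third statistic
    row.names    = NULL
  )
}

# Made-up example data
d <- data.frame(a = c(1, NA, 3, NA), b = c("x", "y", NA, "z"), c = 1:4)
res <- missing_summary(d)
res
```

For this example data, `a` has 2 of 4 values missing, `b` has 1 of 4, and `c` has none.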

A new version 1.1.2 of Conrad Sanderson's Armadillo templated C++ library for linear algebra came out a couple of days ago. This has now been wrapped into a new version 0.2.12 of RcppArmadillo, our Rcpp-based integration into R. The short NEWS fil...

The author of the ggplot2 graphics package for R, Hadley Wickham, is looking for feedback from ggplot2 users. If you've used ggplot2, fill out his short survey at the link below. WuFoo: ggplot2 survey

Twitter played a significant role in the recent protests in Iran, with protesters communicating via tweets marked with the #25bahman hashtag (February 14 in the Persian calendar) to plan and rally for the demonstration. Michael Bommarito downloaded all such tweets and plotted their frequency over time using R's ggplot2 library: Not surprisingly, the activity peaked on February 14. The...

TCRUG will be having a meeting TONIGHT (2/16) at 5:30 PM. We will meet in ROOM 29 in Willey Hall. Willey Hall is located on the West Bank of the Minneapolis campus. See the Google map at http://goo.gl/tnRnU. Erik Iverson will be giving a talk ...

Buried in the London Datastore are the population estimates for each of the London Boroughs between 2001 and 2030. They predict a declining population for most boroughs with the exception of a few to the east. I was surprised by this general decline and also the numbers involved; I expected larger changes from one year to ...

In recent months, there has been a series of high-profile incidents in the United States where police officers were killed. While such events are unfortunate, the data suggests that it is extremely rare for an officer to be harmed or killed while on duty. In this post, I examine whether there are significant regional

Getting more into mixed models, I’ve been playing around with both nlme::lme and lme4::lmer. http://tolstoy.newcastle.edu.au/R/e2/help/06/10/3345.html was quite a good post at explaining the differences, which from what I gather is largely performance based when using crossed or partially crossed models. In the models I am tinkering with at the moment I am noticing differences in