On a recent flight I was bored waiting for the plane to land and I tried out the electronic sudoku game that they had offered. I found the game surprisingly interesting as I realized that it is far more entertaining when you cannot use paper or p...

R package developer (and R-bloggers editor) Tal Galili just published the answers to a question many R users have asked: which are the most popular R packages? He wrote some R code to rank the top 100 packages by number of downloads. Here's the top 10: The source data are the download logs from the RStudio CRAN mirror, whose...

Latent class analysis is a tool for identifying groups within multivariate categorical data, such as responses on a Likert scale. These groups are known as latent classes. As a simple comparison, the method resembles k-means multivariate cluster analysis. There are several key differences between the

Interval estimation of the population mean can be computed from functions of the following R packages: stats - contains the t.test; TeachingDemos - contains the z.test; and BSDA - contains the zsum.test and tsum.test. The t.test of the stats package is a ...

Interval estimation of the population mean can be computed from the functions of the following R packages: stats - contains the t.test; TeachingDemos - contains the z.test; BSDA - contains the zsum.test and tsum.test. The t.test of the stats package is a stud...

John Chambers, the creator of the S language from which R is derived, said in one of his books, Software for Data Analysis: Programming with R: ...This places an obligation on all creators of software to program in such a way that the computations ca...

In this post, I use a Shiny app in R to determine the best possible players to pick in a fantasy football auction draft. The app takes projections from FantasyPros, The post Win Your Fantasy Football Auction Draft: Calculate the Optimal Players to Draft with this Shiny App in R appeared first on Fantasy Football Analytics.

Bill Grosso presented a fascinating webinar about the video gaming industry today, Knowing How People are Playing Your Game Gives You the Winning Hand. He described how over the past three years, game studios have switched from viewing analytics as a primarily descriptive tool to deploying modern data collection practices, machine learning toolkits, and statistical methods to gain a...

Note: This is a full translation of a Portuguese version. In many different types of experiments, with one or more treatments, one of the most widely used statistical methods is analysis of variance, or simply ANOVA. The simplest ANOVA can be called “one way” or “single-classification” and involves the analysis of data sampled from The post ANOVA...

(This article was first published on R-statistics blog, and kindly contributed to R-bloggers) What are the top 100 (most downloaded) R packages in 2013? Thanks to the recent release by RStudio of their “0-cloud” CRAN log files (which do not include downloads from the primary CRAN mirror or any of the 88 other CRAN mirrors), we can now answer this question...

Google's prediction API offers a blackbox way of doing some prediction. They had advertised an R package, but it doesn't seem to work with the new version of the prediction API or their OAuth2 authentication mechanism. So, in an effort to check out the...

This article from my other blog may be of interest to readers of this blog: http://seriousstats.wordpress.com/2013/04/18/using-multilevel-models-to-get-accurate-inferences-for-repeated-measures-anova-designs/

A rather dull puzzle this week: Show that, for any integer y, (√3-1)^{2y}+(√3+1)^{2y} is an integer multiple of a power of two. I just have to apply Newton’s binomial theorem to obtain the result. What’s the point?! Filed under: Books, Kids, R Tagged: Binomial theorem, Isaac Newton, Le Monde, mathematical puzzle
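For the puzzle above, the binomial argument can be sketched as follows. This is only one way to write it out, assuming y is a non-negative integer; the names S_y and N_y are notation introduced here for illustration and do not come from the original post.

```latex
\[\begin{aligned}
(\sqrt{3}\pm 1)^{2} &= 4 \pm 2\sqrt{3} = 2\,(2 \pm \sqrt{3}), \\
S_y &= (\sqrt{3}-1)^{2y} + (\sqrt{3}+1)^{2y}
     = 2^{y}\left[(2-\sqrt{3})^{y} + (2+\sqrt{3})^{y}\right], \\
(2+\sqrt{3})^{y} + (2-\sqrt{3})^{y}
    &= \sum_{k=0}^{y} \binom{y}{k}\, 2^{\,y-k}\, (\sqrt{3})^{k}\left[1 + (-1)^{k}\right]
     = 2 \sum_{j=0}^{\lfloor y/2 \rfloor} \binom{y}{2j}\, 2^{\,y-2j}\, 3^{\,j}, \\
S_y &= 2^{\,y+1} N_y, \qquad
N_y = \sum_{j=0}^{\lfloor y/2 \rfloor} \binom{y}{2j}\, 2^{\,y-2j}\, 3^{\,j} \in \mathbb{Z}.
\end{aligned}\]
```

The odd powers of √3 cancel in the sum, so only integer terms remain. As a quick check, y = 1 gives S_1 = 8 = 2^2 · 2 and y = 2 gives S_2 = 56 = 2^3 · 7.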

Another maintenance release of inline is now on CRAN and is already included in Debian. This release was triggered by a change in the development version of R which removed an argument to package.skeleton(). The complete NEWS entry is below. Chan...

How does overdispersion of infections affect the behavior of the multiple-infection model? I redefine the model to account for overdispersion, assuming the same overdispersion occurs in both age classes. The parameter varies inversely with the degree of overdispersion. Again, the classes are demographically identical, and infection affects mortality but not growth: \[\begin{aligned} \frac{dJ}{dt}...

by Joseph Rickert. Quandl.com, the open source website for financial data, made rapid progress earlier this year in becoming an R-friendly source for financial time series data. Tammer Kamel, Quandl’s founder, introduced the site on the Revolutions blog in late February as a “search engine” for numerical data and explained how Quandl’s “Q-bot” can take data from almost any...

It has taken a long time, but cran2deb4ubuntu has been updated for R 3.0.1. Over 1000 R packages are available as .deb files (with dependencies) for Ubuntu 13.04 (raring), 12.10 (quantal) and 12.04 (precise). These packages can be found at the c2d4u PPA. Instructions on how to install the PPA can be found on this...

Since tonight kicks off Game 1 of the Stanley Cup Finals, I thought it would be fun to do a very quick and dirty cluster analysis of the league based on regular season performance. Tonight, the Chicago Blackhawks square off against my hometown team, the Boston Bruins. Even though it was a lockout-shortened season, the

We typically start with the data matrix, a rectangular array of rows and columns. If we type its name on the R command line, it will show itself. But the data matrix is hard to read, even when there are not many rows or columns. The heat map is a visual alternative. All you need is the R function...

I recently entered the Kaggle Titanic learning competition for fun and to see where my out-of-the-box use of random forest would rank me (303 out of 5,882). It was interesting to see that much of the scoring differentiation came from score imputation, that is, filling in missing values based on other data. For example, we might have

Image by Jan Zander. Our mantra here at Quandl is making data easy to find and easy to use. Following that goal, we (and subsequently the community) have created packages that integrate Quandl’s API into a number of software platforms. Today we’ll take a look at R. R is a free statistical computing language created