EDIT: I am aware that some browsers fail to load the GitHub code below. I will try to improve this as soon as possible. Until then it may work better on www.joesdatadiner.com than at https://www.r-bloggers.com/. Finally, all the code is availa...

It used to be that one of the first decisions to make when learning to program was between compiled (e.g. C or FORTRAN) and interpreted (e.g. Python) languages. In my opinion, these days one would have to be a masochist to learn with a compiled language: the extra compilation time and obscure errors are

Here are answers to some more overflow questions asked during our webinar about SciDB-R. (First batch of answers is here.) Can you advise us on hardware configurations for SciDB? Yes. See this blog post. Can you compare HDFS and SciDB? A comparison is d...

Google Summer of Code has now opened for student applications, and the R Project has once again been selected as a mentoring organization. I’ve discussed before that a variety of mentors have proposed a number of projects for students to work on during this summer, but I wanted to emphasize some points about the schedule. The deadline for

By Joseph Rickert. Baseball fans have been serious about statistics since Karl Pearson was a young man (although I doubt that Karl followed the game). It is not clear, though, exactly when baseball statisticians moved from doing descriptive stats into predictive analytics. In his book Super Crunchers, Ian Ayres credits Bill James of Baseball Abstract fame for getting this...

In addition to its weekly mathematics puzzles, Le Monde is now publishing a series of popularisation books on mathematics, under the patronage of Cédric Villani. Jean-Michel Marin brought me two from the series, one on the golden number and one on Pythagoras’ theorem. (This is actually a translation of a series published by El Pais

I promised in Episode 166 that I’d review this beer with a bit more detail than my usual quick spiel on the show, so allow me to present: End of the World Midnight Wheat! An ale brewed with “midnight wheat, chocolate malt, chili and spice” from Shock Top (aka Anheuser-Busch). First and foremost a special

The deadline for my book on R is fast approaching, so naturally I’m in full procrastination mode. So much so that I’ve spent this evening creating a brainfuck interpreter for R. brainfuck is a very simple programming language: you get an array of 30000 bytes, an index, and just eight commands. You move the

We are pleased to announce that Revolution R Enterprise Release 6.2 is available to new subscribers today. This new software release from Revolution Analytics includes a number of key new features: Support for open source R 2.15.3, the latest stable release of R. Since Release 2.14.2, the R Project has added 89 new features, 11 performance enhancements and 139...

We share our opinion that = should be preferred to the more standard <- for assignment in R. This is from a draft of the appendix of our upcoming book. This has the risk of becoming an R version of JavaScript’s semicolon controversy, but here you have it. R has five common assignment operators: “=”,

In loss forecasting, it is often necessary to disaggregate annual losses into each quarter. The simplest method to convert a low-frequency time series to a high-frequency one is interpolation, such as the one implemented in the EXPAND procedure of SAS/ETS. In the example below, there is a series of annual loss projections from 2013 through 2016.
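As a rough illustration of the interpolation idea only, the sketch below uses base R's `spline()` to evaluate a curve through annual figures at quarterly time points. The loss values are hypothetical placeholders (not the projections from the original post), each annual figure is assumed to sit at the start of its year on the time axis, and nothing here attempts to reproduce the EXPAND procedure itself.

```r
# Hypothetical annual loss projections for 2013-2016 (made-up numbers,
# used purely for illustration)
years  <- 2013:2016
annual <- c(100, 120, 150, 170)

# Quarterly time points spanning the same range (2013.00, 2013.25, ...)
quarters <- seq(2013, 2016, by = 0.25)

# Natural cubic spline through the annual points, evaluated at each quarter
quarterly <- spline(x = years, y = annual, xout = quarters, method = "natural")

# Tabulate the interpolated series
result <- data.frame(time = quarterly$x, loss = round(quarterly$y, 2))
print(result)
```

Note that this interpolates levels: the curve passes through the annual values, but the quarterly figures are not constrained to sum to the annual totals, which a true disaggregation of losses would usually require.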

If you didn't manage to catch Coursera's Data Analysis course, don't despair. Instructor Jeff Leek has made the course videos available on YouTube, which you can review at your leisure to learn how to plan, carry out, and communicate analyses of real data sets with R. (The course assumes you already have familiarity with R, so if you're new...

Here are my slides from a short introductory seminar on R (essentially going through part I of the R tutorial) last week. As magic lantern pictures go, they’re hideously ugly, but they were mostly there for future reference. Most of the seminar was spent showing RStudio. This Friday, we’ll practice some uses of qplot and make

Earlier this month we blogged about Harvard Professors Gary King and Stuart Shieber providing advice to graduate students about open access, dissertations, and journal publishing. We also mentioned some of the great initiatives that facilitate open access publishing in the statistics community, like the Journal of Statistical Software (JSS), The R Journal and arxiv.org. The ...

For some cryptic reason I needed a function that calculates function values on sliding windows of a vector. Googling around soon brought me to ‘rollapply’, which when I tested it seems to be a very versatile function. However, I wanted to code my own version just for vector purposes in the hope that it may

Our 5th Cologne R user group meeting was the best attended meeting so far, with 20 members finding their way to the Institute of Sociology for two talks by Diego de Castillo on shiny and Stephan Holtmeier on cluster analysis, followed by beer and schnitzel at the Lux, a gastropub nearby. Shiny: Diego gave an overview of...

Some users had trouble installing the WRS package from R-Forge. Here’s a method that should work automatically and fail-safe:
    # first: install dependent packages
    install.packages(c("MASS", "akima", "robustbase"))
    # second: install suggested packages
    install.packages(c("cobs", "robust", "mgcv", "scatterplot3d", "quantreg", "rrcov", "lars", "pwr", "trimcluster", "parallel", "mc2d", "psych", "Rfit"))
    # third: install WRS
    install.packages("WRS", repos="http://R-Forge.R-project.org",

In an earlier post, I introduced the golden section search method – a modification of the bisection method for numerical optimization that saves computation time by using the golden ratio to set its test points. This post contains the R function that implements this method, the R functions that contain the 3 functions that were

Introduction: The first algorithm that I learned for root-finding in my undergraduate numerical analysis class (MACM 316 at Simon Fraser University) was the bisection method. It’s very intuitive and easy to implement in any programming language (I was using MATLAB at the time). The bisection method can be easily adapted for optimizing 1-dimensional functions with