My R year

December 23, 2012

End-of-year posts are corny but, what the heck, I think I can let myself delve into corniness once a year. The following code gives a snapshot of what R was for me in 2012, and how I used it:

outside.packages.2012 <- list(used.the.most = c('asreml', 'ggplot2'),
                              largest.use.decline = c('MASS', 'lattice'),
                              same.use = c('MCMCglmm', 'lme4'),
                              would.like.use.more = 'JAGS')

Data Import Efficiency – A Case in R

December 23, 2012

Below is an R snippet comparing the efficiency of data import from CSV, SQLite, and HDF5. As in the Python case posted yesterday, HDF5 shows the highest efficiency.

Binary Classification – A Comparison of “Titanic” Proportions Between Logistic Regression, Random Forests, and Conditional Trees

December 23, 2012

Now that I’m on my winter break, I’ve been taking a little bit of time to read up on some modelling techniques that I’ve never used before. Two such techniques are Random Forests and Conditional Trees. Since both can be used …

Measuring the Gerrymander with spatstat

December 23, 2012

Well, to be specific, I mean measuring district compactness (a very interesting subject, see these three articles for starters). There are myriad ways of measuring the “oddness” of a shape, including a comparison of the area of the district to its circumcircle, the moment of inertia of the shape, the probability that a path connecting...

R for Dummies – De Vries and Meys (2012)

December 22, 2012

The for Dummies series has been around since 1991. (A bit of trivia: DOS for Dummies was the first title.) I’ve owned a few books in the series and have been adequately impressed with most of them, but when I learned there was an R for Dummies I was immediately skeptical. Possibly I was skeptical because R has a...

Joining 2 R data sets with different column names

December 22, 2012

Joining or merging two data sets is one of the most common tasks in preparing and analysing data. In fact, a Google search returns 253 million results. However, most examples assume that the columns you want to merge by have the same names in both data sets, which is often not the case. For example:
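The post's own example is cut off in this excerpt. As a stand-in, here is a minimal sketch using base R's merge() with its by.x and by.y arguments; the data frames and column names (customers, orders, id, customer_id) are made up for illustration.

```r
# Hypothetical data: the key is called "id" in one set and "customer_id" in the other
customers <- data.frame(id = c(1, 2, 3), name = c("Ann", "Bob", "Cy"))
orders <- data.frame(customer_id = c(1, 1, 3), amount = c(10, 25, 40))

# by.x names the key column in the first data frame, by.y the one in the second
joined <- merge(customers, orders, by.x = "id", by.y = "customer_id")
print(joined)
```

By default merge() keeps only rows with a match in both data sets, and the key column in the result takes the name given in by.x.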

Convert OpenStreetMap Objects to KML with R

December 22, 2012

A quick geo-tip: With the osmar and maptools packages you can easily pull an OpenStreetMap object and convert it to KML, like below (thanks to adibender for helping out on SO). I found the relation ID by googling for it (www.google.at/search?q=openstreetmap+relation+innsbruck).

# get OSM data
library(osmar)
library(maptools)
innsbruck sp_innsbruck
# convert to KML
for( i in seq_along(sp_innsbruck) ) { ...

RcppClassic 0.9.3

December 22, 2012

Yesterday's release of Rcpp 0.10.2 required a small change to RcppClassic, the package supporting the deprecated older classic Rcpp API defined in the earlier 2005 to 2006 releases. So version 0.9.3 of RcppClassic is now on CRAN. There is no new user...

Visualizing Principal Components

December 22, 2012

Principal Component Analysis (PCA) is a procedure that converts observations into linearly uncorrelated variables called principal components (Wikipedia). PCA is a useful descriptive tool for examining your data. Today I will show how to find and visualize principal components. Let’s look at the components of the Dow Jones Industrial Average index over 2012. First,

Basics of Histograms

December 22, 2012

Histograms are used very often in public health to show the distributions of your independent and dependent variables.  Although the basic command for histograms (hist()) in R is simple, getting your histogram to look exactly like you want takes g...

Get the party started

December 22, 2012

Have you already used trees or random forests to model a relationship between a response and some covariates? Then you might like conditional trees, which are implemented in the party package. In contrast to the CART (Classification and Regression ...

The definitive guide to plotting confidence intervals in R

December 22, 2012

Here at is.R(), we have produced countless posts that feature plots with confidence intervals, but apparently none of those are easy to find with Google. So, today, for the purposes of SEO, we’ve put “plotting confidence intervals” in the title of our post. We also cannot resist an earnest plea from our...

Chocolate and nobel prize – a true story?

December 22, 2012

Few of us can resist chocolate, but the real question is: should we even try to resist it? The image is CC by Tasumi1968. As a dark chocolate addict I was relieved to see Messerli's ecological study on chocolate consumption and the...

Learn to use R for FREE with Coursera

December 21, 2012

Coursera is offering free courses about R, among other interesting subjects. The first one, on the application of R in financial econometrics, is happening this week (but you can still enroll). Two more courses starting in January 2013 are more about using R to analyse data. The differences between the two are

Simple data simulator for the 2PL model

December 21, 2012

The function: This is a very simple data simulator for a 2PL model. It is just meant to get you started; from here it is easy to add function parameters for indicating item locations and slopes or person distribution characteristics. The function accepts on...
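The post's function is not shown in this excerpt. The sketch below is one possible way to write such a simulator, not the author's code. The function name sim_2pl, its arguments, and the distributions chosen for abilities, slopes and locations are all assumptions made for illustration. It uses the usual 2PL response probability P(X = 1) = 1 / (1 + exp(-a * (theta - b))).

```r
# Hypothetical helper, not the post's function: simulate 0/1 responses under a 2PL model
sim_2pl <- function(n_persons, n_items, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  theta <- rnorm(n_persons)      # person abilities (assumed standard normal)
  a <- runif(n_items, 0.5, 2)    # item slopes (assumed range)
  b <- rnorm(n_items)            # item locations (assumed standard normal)
  # a_j * (theta_i - b_j) for every person i (rows) and item j (columns)
  eta <- sweep(outer(theta, b, "-"), 2, a, "*")
  p <- plogis(eta)               # logistic response probabilities
  # Draw each response by comparing its probability with a uniform draw
  x <- matrix(as.integer(runif(length(p)) < p), nrow = n_persons)
  list(responses = x, theta = theta, a = a, b = b)
}

sim <- sim_2pl(n_persons = 200, n_items = 10, seed = 1)
dim(sim$responses)
```

Fixing every slope at 1 turns this into a Rasch-type simulator.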

Rcpp 0.10.2

December 21, 2012

Release 0.10.2 of Rcpp provides the second update to the 0.10.* series, and has arrived on CRAN and in Debian. It brings another great set of enhancements and extensions, building on the recent 0.10.0 and 0.10.1 releases. The new Rcpp attributes were rewritten to not require Rcpp modules (as we encountered an issue with exceptions on Windows when built this...

R for inquisition

December 21, 2012

A post on high-dimensional arrays by @isomorphisms reminded me of APL and, more generally, of matrix languages, which took me back to inquisitive computing: computing not in the sense of software engineering, or databases, or formats, but of learning by poking problems through a computer. I like languages not because I can get a job

Create optical illusions with R

December 21, 2012

I love optical illusions (like this and this and these), not just because they're fun, but also beca...

A simple web application using Rook

December 21, 2012

by Ben Ogorek

I'm grateful to Rook for helping me, a simple statistician, learn a few fundamentals of web technology. For R web application development, there are increasingly polished methods available (most notably Shiny), but you can build one...

Generating a non-homogeneous Poisson process

December 21, 2012

Consider a Poisson process $(N_t)_{t\geq 0}$, with non-homogeneous intensity $\lambda(t)$. Here, we consider a deterministic function, not a stochastic intensity. Define the cumulated intensity $\Lambda(t)=\int_0^t \lambda(u)\,du$, in the sense that the number of events that occurred between time $s$ and $t$ is a random variable that is Poisson distributed with parameter $\Lambda(t)-\Lambda(s)$. For example, consider here a cyclical Poisson process, with intensity lambda=function(x) 100*(sin(x*pi)+1). To compute...
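The excerpt stops before showing how the process is generated. One common approach is thinning, sketched below; it is not necessarily the method the post goes on to use. The function name sim_nhpp, the horizon of 10 time units, and the bound lambda_max = 200 (the maximum of the intensity above) are choices made for this illustration.

```r
lambda <- function(x) 100 * (sin(x * pi) + 1)   # cyclical intensity from the text

# Thinning: simulate a homogeneous process at rate lambda_max, then keep each
# candidate time t with probability lambda(t) / lambda_max
sim_nhpp <- function(lambda, t_end, lambda_max) {
  n <- rpois(1, lambda_max * t_end)             # number of candidate events on [0, t_end]
  candidates <- sort(runif(n, 0, t_end))        # candidate times are uniform given n
  keep <- runif(n) < lambda(candidates) / lambda_max
  candidates[keep]
}

set.seed(1)
events <- sim_nhpp(lambda, t_end = 10, lambda_max = 200)
length(events)                                  # number of simulated events
```

Thinning only requires an upper bound on the intensity, so it avoids inverting the cumulated intensity. A loose bound still works but wastes more candidate draws.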

Computing an empirical pFDR in R

December 21, 2012

The positive false discovery rate (pFDR) has become a classical procedure to test for false positives. It is one of my favourites because it relies on a re-sampling approach. I base my implementation on John Storey's PNAS paper and the technical report he published with Rob Tibshirani while at Stanford (I find the technical report...

Working with geographical Data. Part 1: Simple National Infomaps

December 21, 2012

There is a popular expression in my country, “Gastar polvora en chimangos”, which translates into English as “spending gunpowder on chimangos”. The chimango is a kind of bird whose meat is useless to humans. So “spending gunpowder on chimangos” …

Beautiful network diagrams with ggplot2

December 21, 2012

I don’t usually like describing my own work as “beautiful,” but with your permission I will make an exception today. There have been some requests for scripts illustrating the plotting of network diagrams with ggplot2, and today (for the winter solstice) we’re bringing you a really nice-looking way of doing just that. In fact, this Gist...

Y2K38: Our Own Mayan Calendar…Again

December 21, 2012

It’s not quite the end of the world as we know it. We made it through December 21, 2012 unscathed. It’s not going to be the last time we will make it through such a pseudo-calamity. After all, we have built our own end of the world before (e.g. Y2K). Next up: January 19, 2038.
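The January 19, 2038 date comes from signed 32-bit time counters, which can hold at most 2^31 - 1 seconds since January 1, 1970 (UTC). A quick check in base R, with variable names chosen for this sketch:

```r
# Largest value a signed 32-bit integer can hold
max_int32 <- 2^31 - 1

# Interpret it as seconds since the Unix epoch, in UTC
rollover <- as.POSIXct(max_int32, origin = "1970-01-01", tz = "UTC")
stamp <- format(rollover, "%Y-%m-%d %H:%M:%S", tz = "UTC")
stamp   # "2038-01-19 03:14:07"
```

One second later, a signed 32-bit counter overflows.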

Italian BioR Day: leave a comment

December 21, 2012

The Italian BioR Day took place on November 30, and almost sixty R enthusiasts joined the event! Thanks to all the participants, and a special thanks to the speakers who shared their knowledge with us and to the Parco Tecnologico …

Simple data simulator for the Rasch Model

December 21, 2012

The function: This is a very simple data simulator for the Rasch Model. It is just meant to get you started; from here it is easy to add function parameters for indicating item locations or person distribution characteristics. The function accepts only two p...

Removing Records by Duplicate Values in R – An Efficiency Comparison

December 20, 2012

After posting “Removing Records by Duplicate Values” yesterday, I had an interesting exchange with my friend Jeffrey Allard tonight about how to code this in R: either a combination of order() and duplicated(), or sqldf(). Afterward, I did a simple efficiency comparison between the two methods, shown below. The comparison result is pretty self-explanatory. In terms
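The comparison itself is not included in this excerpt. As an illustration of the order() plus duplicated() approach only, here is a minimal sketch on made-up data. It assumes the goal is to keep, for each id, the record with the largest value; that rule is an assumption, not something the excerpt states.

```r
# Hypothetical data: several records per id
df <- data.frame(id = c(1, 2, 1, 3, 2), value = c(5, 7, 9, 1, 3))

# Sort so that the record to keep comes first within each id ...
sorted <- df[order(df$id, -df$value), ]
# ... then drop every later row whose id has already been seen
deduped <- sorted[!duplicated(sorted$id), ]
print(deduped)
```

duplicated() flags the second and later occurrences of each id, so the sort order decides which record survives.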
