Le Monde puzzle [#827]

July 2, 2013
Back to R (!) for the current Le Monde puzzle: Given an unknown permutation of the set {1,…,6}, written on the faces of a cube, there exists a sequence of summits such that increasing by one unit the three numbers of the faces sharing the successive summits in the sequence leads to identical values over

Scaling the R ecosystem: Possible Directions for Improving Dependency Versioning

July 2, 2013
A paper published today in The R Journal discusses a fundamental limitation affecting the reliability and reproducibility of R code. It explains how the lack of dependency versioning causes R-based applications to break down, Sweave documents to stop working and CRAN to hit scaling problems. The paper suggests several solutions inspired by other open-source communities that could ...

A Brief Look at Mixture Discriminant Analysis

July 2, 2013
Lately, I have been working with finite mixture models for my postdoctoral work on data-driven automated gating. Although I had barely scratched the surface with mixture models in the classroom, I am becoming increasingly comfortable with them. With this in mind, I wanted to explore their application to classification because there are times when a single class is clearly made up of...

Parse arguments of an R script

July 2, 2013
R can also be used as a scripting tool. We just need to add a shebang in the first line of a file (script): #!/usr/bin/Rscript and then the R code should follow. Often we want to pass arguments to such a script, which can be collected in the script by the c...
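A minimal sketch of the idea, assuming base R's commandArgs() is what collects the arguments (the excerpt is cut off before naming the function, so this is an assumption, not necessarily the post's approach). The helper parse_args and the --key=value convention are hypothetical and chosen only for illustration.

```r
#!/usr/bin/Rscript
# Hypothetical helper (not from the original post): turn "--key=value"
# strings into a named list. By default it reads the arguments that
# follow the script name on the command line.
parse_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  # Keep only arguments of the form --key=value; ignore anything else
  kv <- grep("^--[^=]+=", args, value = TRUE)
  keys <- sub("^--([^=]+)=.*$", "\\1", kv)
  values <- sub("^--[^=]+=", "", kv)
  as.list(setNames(values, keys))
}

opts <- parse_args()
print(opts)
```

Saved as, say, script.R (a made-up file name), this could be run as Rscript script.R --n=5 --name=x. All values arrive as character strings, so numeric options would still need as.numeric().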

Access individual elements of a row while using the apply function on your dataframe (or “applying down while thinking across”)

July 2, 2013
The apply function in R is a huge work-horse for me across many projects.  My usage of it is pretty stereotypical.  Usually, I use it to make aggregations of a targeted group of columns for every row in a dataframe. … Continue reading →
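As a rough illustration of that usage (not the author's code), the sketch below aggregates a targeted group of columns for every row and then picks out individual elements of each row by name. The data frame and its column names are invented for the example.

```r
# Toy data frame; the column names are made up for illustration.
df <- data.frame(id = c("a", "b", "c"),
                 q1 = c(1, 4, 7),
                 q2 = c(2, 5, 8),
                 q3 = c(3, 6, 9))

# Aggregate a targeted group of columns for every row (MARGIN = 1).
target <- c("q1", "q2", "q3")
df$total <- apply(df[, target], 1, sum)

# Inside the function each row arrives as a named vector, so
# individual elements can be accessed by column name.
df$spread <- apply(df[, target], 1, function(row) row[["q3"]] - row[["q1"]])

print(df)
```

Restricting apply to the numeric columns matters here: passing the whole data frame, including the character id column, would coerce every row to character before the function sees it.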

July 2, 2013
Like your .bashrc, .vimrc, or many other dotfiles you may have in your home directory, your .Rprofile is sourced every time you start an R session. On Mac and Linux, this file is usually located in ~/.Rprofile. On Windows it's buried somewhere in the R...

There is definitely R in July

July 1, 2013
The useR!2013 conference in Albacete, Spain, will commence next Wednesday, 10 July, and on the day before Diego and I will give a googleVis tutorial. The following Monday, 15 July, the first R in Insurance event will take place at Cass Business School ...

Some Common Approaches for Analyzing Likert Scales and Other Categorical Data

July 1, 2013
Analyzing Likert scale responses really comes down to what you want to accomplish (e.g. are you trying to provide a formal report with probabilities, or are you simply trying to understand the data better?). Sometimes a couple of graphs are sufficient and a formal statistical test isn’t even necessary. However, with how easy it is

integral priors for binomial regression

July 1, 2013
Diego Salmerón and Juan Antonio Cano from Murcia, Spain (check the movie linked to the above photograph!), kindly included me in their recent integral prior paper, even though I mainly provided (constructive) criticism. The paper has just been arXived. A few years ago (2008 to be precise), we wrote together an integral prior paper, published

Using ESS-Remote

July 1, 2013
If you use R and ssh into other machines a lot, e.g. for doing some big data stuff on ec2, ess-remote is a great tool. Just use M-x ssh to ssh into the remote machine, then launch R. Now just M-x ess-remote and you can use the R process just like a local process! Productivity win. Also see

Maximum Entropy Bootstrap Rescale and Symmetrize

July 1, 2013
R code for changing scale without changing the mean, or for making a probability distribution symmetric. These are problems commonly encountered by R programmers. We provide code for both of these tasks in the context of the maximum entropy bootstrap (meboot) package in R.

OpenAnalytics @ UseR 2013: What’s on the Program?

July 1, 2013
OpenAnalytics is once more a proud sponsor of the yearly R User Conference and has sent a strong delegation to present some of its recent work. On Tuesday July 9 Tobias Verbeke and Stephan Wahlbrink give a pre-conference...

Power and sample size calculator for mitochondrial DNA association studies (Shiny)

July 1, 2013
The functions detailed in the piece of code below (in a Gist) have been useful to me when I had to calculate many possible scenarios of statistical power and sample size. The formulae were taken from the article of Samuels … Continue reading →

Web Analytics Visualization through ggplot2

July 1, 2013
During our last webinar, we covered some of the basic ideas behind ggplot2, the R Visualization package by Dr. Hadley Wickham. In this blog post I will walk through the example that I covered during the webinar. In order to carry out the examples yourself, you may download the dummy datasets from this link Creating

R and PostgreSQL – using RPostgreSQL and sqldf

July 1, 2013
PostgreSQL and R can often be used together for data analysis, with PostgreSQL as the database engine and R as the statistical tool. In this article you will learn how to access data stored in a PostgreSQL database and how to write the data back using RPostgreSQL an...

Monitoring an ETF Portfolio in R

July 1, 2013
Adam Duncan. Also available on R-bloggers.com. Some time ago, I read an interesting article about an interview with David Swensen, the renowned money manager from Yale University’s investment office. He’s quite famous and is considered by many to be the architect of the modern “endowment portfolio.” The point of the article was to suggest a way for ordinary...

analyze the united states decennial census public use microdata sample (pums) with r and monetdb

July 1, 2013
during his tenure as secretary of state, thomas jefferson oversaw the first american census way back in 1790.  some of my countrymen express pride that we're the oldest democracy, but my heart swells with the knowledge that we've got the world's o...

Exploratory Data Analysis – Kernel Density Estimation and Rug Plots on Ozone Data in New York and Ozonopolis

For the sake of brevity, this post has been created from the second half of a previous long post on kernel density estimation.  This second half focuses on constructing kernel density plots and rug plots in R.  The first half focused on the conceptual foundations of kernel density estimation. Introduction This post follows the recent

Using R to Produce Scalable Vector Graphics for the Web

June 30, 2013
Statistical software is normally used during the analysis stage of a project and a cleaned up static graphic is created for the presentation.  If the presentation is in web format then there are some considerations that are needed. The trick is to find ways to implement those graphs in that web format so the graph

How pqR makes programs faster by not doing things

June 30, 2013
One way my faster version of R, called pqR (see updated release of 2013-06-28), can speed up R programs is by not even doing some operations. This happens in statements like for (i in 1:1000000) ..., in subscripting expressions like v, and in logical expressions like any(v>0) or all(is.na(X)). This is done using pqR’s internal “variant result” mechanism, which is

June 30, 2013
Tomorrow (July 1, 2013), Google Reader will retire. I can imagine the shock this will be for the 5,821 followers of this site who use Google Reader to follow up on news and tutorials from the global R world. If you have been considering a more committed relationship with this site and didn’t know what to do –...

Faster calculation

June 30, 2013
Last week I decided to speed up my distribution fitting functions of two weeks ago. These were bold words. My try of Rcpp was a failure. Just plain optimization helped a bit better. Using the compiler package added a bit. (the compiler package does not...

Learning R: Parameter Fitting for Models Involving Differential Equations

June 30, 2013
MATLAB, Octave and Python seem to be the preferred tools for scientific and engineering analysis (especially analysis involving physical models with differential equations). However, as part of my experience learning R, I wanted to check out some … Continue reading →

An .EPS to .PDF converter (using LaTeX!)

June 30, 2013
I am about to go on a short holiday, so I was tidying the code lines I had scattered around before leaving… And I found this: a minimal EPS to PDF converter, which is barely a LaTeX template. It is … Continue reading →

R to GeoJSON

June 30, 2013
GitHub recently introduced the ability to render GeoJSON files on their site as maps, and has since added support for TopoJSON (an extension of GeoJSON that can be up to 80% smaller than GeoJSON), support for other file extensions (.topojson and .json), and the ability to embed the maps on other sites (so awesome). The underlying...

Shiny Server on CentOS

June 29, 2013
I’ve been enjoying working with Joe Cheng’s Shiny Server and wanted to create a quick step-by-step guide on installing it on an AWS CentOS EC2 instance, as the standard Shiny Server instructions assume the typical dependencies are installed: 1. Shiny’s instructions say to install libssl-dev (sudo yum install libssl-dev); here is the CentOS equivalent: sudo yum install openssl-devel

Reproducing R: Scripts, Documents, and Packages

June 28, 2013
I’m sharing the slides from the talk I’ll be giving at the Dallas R Users Group on creating R packages (and other techniques for reproducing R). I’ll introduce R scripts, reproducible R documents, and R packages. We’ll use the knitr, devtools, and roxygen2 packages in the examples. Download the slides here. If you’re unable to

Descending Text in Righthand Margin of R Graphics à la mtext

June 28, 2013
There was an R-help thread in January regarding text in the righthand margin of an R graphic, where the text should be rendered in reading order from top to bottom. The base R function mtext is used to plot text in the margin. But, mtext is only able to render text from left to right

rCharts Remake of NYT

June 28, 2013
For those wondering if I have forsaken finance, the answer is no. I just don’t think there is much to do in here besides watch and wait. So more d3 and R as I try to distract myself from doing something dumb in the markets. This time I used rCharts and slidify to recreate another...