## The EasyABC package for Approximate Bayesian Computation in R

December 2, 2012

A comment on a recent post gave me the motivation to try out the new EasyABC package for R, developed by Franck Jabot, Thierry Faure, Nicolas Dumoulin and maintained by Nicolas Dumoulin. Approximate Bayesian Computation (ABC) is a relatively new method that allows treating any stochastic model (IBM, stochastic population model, …) in a statistical…
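
To make the idea concrete, here is a minimal rejection-ABC sketch in base R. It does not use EasyABC's own functions; the toy model, the uniform prior, the summary statistic and the tolerance are all assumptions chosen for illustration.

```r
# Rejection ABC for the mean of a normal distribution (toy example, base R only).
set.seed(42)

# "Observed" data: simulated here with a true mean of 2 (an assumption for the demo).
observed <- rnorm(50, mean = 2, sd = 1)
obs_summary <- mean(observed)

# Stochastic model: simulates a data set for a given parameter and returns its summary.
simulate_summary <- function(theta) mean(rnorm(50, mean = theta, sd = 1))

n_sim <- 20000
tolerance <- 0.1

# 1. Draw candidate parameters from the prior (uniform on [-5, 5]).
prior_draws <- runif(n_sim, min = -5, max = 5)

# 2. Simulate the model once per candidate.
sim_summaries <- vapply(prior_draws, simulate_summary, numeric(1))

# 3. Keep candidates whose simulated summary lies close to the observed one.
accepted <- prior_draws[abs(sim_summaries - obs_summary) < tolerance]

# The accepted values approximate the posterior distribution of the mean.
length(accepted)
mean(accepted)
```

The likelihood is never evaluated; only the ability to simulate from the model is needed, which is why the approach suits stochastic models without a tractable likelihood.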

## Rolling means (and other functions) with zoo

December 2, 2012

The zoo package is designed for use with (potentially irregular) time series data. It is widely used for any number of applications, but among its most frequently useful functions are the roll* functions, such as rollmean, rollmedian, rollmax, rollapp...
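
As a rough illustration of what a rolling mean computes, the sketch below builds one by hand in base R and, only if zoo happens to be installed, compares it with `rollmean`. The data and the window width of 3 are made up for the example.

```r
# A rolling (moving) mean computed by hand, to show what zoo::rollmean returns.
x <- c(1, 3, 5, 7, 9, 11)
k <- 3  # window width (illustrative choice)

# Mean of each run of k consecutive values: windows 1-3, 2-4, 3-5, 4-6.
starts <- seq_len(length(x) - k + 1)
manual_roll <- vapply(starts, function(i) mean(x[i:(i + k - 1)]), numeric(1))
manual_roll

# If zoo is installed, rollmean() should give the same values for a plain vector.
if (requireNamespace("zoo", quietly = TRUE)) {
  zoo_roll <- zoo::rollmean(x, k)
  print(all.equal(as.numeric(zoo_roll), manual_roll))
}
```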

## Triangle tests

December 2, 2012

Introduction: A triangle test is a test beloved by sensory scientists for its simplicity and general use in detecting the presence of product differences. The principle is simple. Test subjects get served three samples. One of these contains A, two of these ...
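
If subjects are only guessing, each one picks the odd sample with probability 1/3, so results of this kind are commonly analysed with a one-sided binomial test. The sketch below shows that calculation in base R; the panel size and the number of correct answers are invented and do not come from the post.

```r
# Triangle test: each subject gets three samples and must pick the odd one out.
# With no perceptible difference, the chance of a correct pick is 1/3.
n_subjects <- 30  # hypothetical panel size
n_correct  <- 15  # hypothetical number of correct identifications

# One-sided exact binomial test: more correct picks than guessing predicts?
result <- binom.test(n_correct, n_subjects, p = 1/3, alternative = "greater")
result$p.value

# Same p-value from the binomial distribution directly: P(X >= n_correct).
p_direct <- pbinom(n_correct - 1, n_subjects, 1/3, lower.tail = FALSE)
p_direct
```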

## Financial Turbulence Example

December 1, 2012

Today, I want to highlight the Financial Turbulence Index idea introduced by Mark Kritzman and Yuanzhen Li in the paper "Skulls, Financial Turbulence, and Risk Management". Timely Portfolio did a great series of posts about Financial Turbulence: Part 1, Part 2, Part 3. As an example, I will compute Financial Turbulence for the equal weight index
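
Turbulence in this sense is a squared Mahalanobis distance: how far one period's vector of returns lies from the historical mean, scaled by the historical covariance. The sketch below shows that calculation with base R's `mahalanobis`. It uses simulated returns and full-sample estimates of the mean and covariance, which are assumptions of this example and may differ from the post's setup.

```r
# Turbulence as the squared Mahalanobis distance of each period's returns
# from their historical mean, given the historical covariance matrix.
set.seed(1)

# Simulated returns for 3 assets over 250 periods (stand-in data, not market data).
returns <- matrix(rnorm(250 * 3, mean = 0, sd = 0.01), ncol = 3)
colnames(returns) <- c("asset_a", "asset_b", "asset_c")

mu    <- colMeans(returns)  # mean return per asset
sigma <- cov(returns)       # sample covariance of returns

# mahalanobis() returns (y - mu)' Sigma^-1 (y - mu) for every row y.
turbulence <- mahalanobis(returns, center = mu, cov = sigma)

# Periods with the most unusual return patterns.
head(order(turbulence, decreasing = TRUE))
```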

## Pro Football Data

December 1, 2012

I’ve made the acquaintance of a group of data analysts here in the Triangle and have agreed to arrange an outing to see the Durham Bulls minor league baseball team. Because it’s for stat nerds and because I was curious, I went looking for some baseball data to analyze. I found loads of it here, but

## ExCytR Concept

December 1, 2012

The concept is to make a GUI that provides static and dynamic linking between data and its network representations. Static access will involve making networks based on data and metadata stored in some table or spreadsheet. Dynamic control will provide interactive access to network construction and annotation properties. Together, these will provide rapid generation of information-rich networks, based

## Using XML to grab tables from the web

December 1, 2012

We’re going to try, this December, to bring you an “Advent CalendaR,” for each of the 24 days leading up to Christmas. Each day, our hope is to unwrap a useful R package to show you a useful or interesting function inside! Today...

## 1 + 1 = 3, the proof in R

December 1, 2012

While talking with some friends the other day, one of them mentioned a supposedly famous quote from Einstein saying that 1 + 1 = 3. Despite my best efforts, Google couldn’t find that particular quote. Anyway, we were trying to understand how Einstein could have come up with that. Our answer was that the proposition “1 + 1 = 3”...

## Trading with Support Vector Machines (SVM)

November 30, 2012

Finally all the stars have aligned and I can confidently devote some time to back-testing new trading systems, and Support Vector Machines (SVM) are the new “toy” which is going to keep me busy for a while. SVMs are a well-known tool from the area of supervised Machine Learning, and they are used both

## Because it’s Friday: Evolution of a research paper about Reddit

November 30, 2012

Computer Science PhD student Tim Weninger wrote a 10-page paper for the World Wide Web conference looking at how Reddit users interact on the discussion pages of the social news site. During the process, he saved 463 revisions of the paper in a source-code control system. Then, he wrote a computer program to animate each revision of the paper....

## Real-Time Predictive Analytics with Big Data, and R

November 30, 2012

Can R be used for real-time applications? Absolutely! The key is in setting up a technology stack that can support real-time interactions with models developed in R ... and a clear understanding of what "real-time" really means, and its implications in the context of Big Data. I explained how this works in yesterday's webinar, Real-Time Predictive Analytics with Big...

## Lauren Yamane on Matrix Population Models in R

November 30, 2012

Last week in Davis R Users’ Group, Lauren Yamane showed us how she created and analyzed a stochastic age-structured population in R. Her examples are below. Her original scripts can be found as *.Rmd files here. A note to UC Davis students: This topic and others will be covered by Marissa Baskett and Sebastian Schreiber in...
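
For readers unfamiliar with the topic, the sketch below shows one common way to set up a stochastic age-structured model in base R: a Leslie matrix whose fecundities vary randomly from year to year. It is not taken from Lauren's scripts; the vital rates, the starting population and the log-normal noise are all hypothetical.

```r
# Age-structured population projected with a Leslie matrix (hypothetical vital rates).
set.seed(10)
survival  <- c(0.5, 0.8)     # survival from age class 1 to 2 and from 2 to 3
fecundity <- c(0, 1.5, 2.0)  # baseline offspring per individual in each age class

# First row holds fecundities; the sub-diagonal holds survival probabilities.
leslie <- function(fec) rbind(fec, cbind(diag(survival), 0))

n_years <- 50
pop <- matrix(0, nrow = 3, ncol = n_years)
pop[, 1] <- c(100, 50, 25)  # starting numbers per age class

for (t in 2:n_years) {
  # Environmental stochasticity: fecundity is scaled by a log-normal factor each year.
  fec_t <- fecundity * rlnorm(1, meanlog = 0, sdlog = 0.2)
  pop[, t] <- leslie(fec_t) %*% pop[, t - 1]
}

# Growth rate of the baseline matrix: its dominant eigenvalue.
lambda <- Re(eigen(leslie(fecundity))$values[1])
lambda
```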

## Data types part 4: Logical class

November 30, 2012

First, an update: A commenter has asked me to post my code so that it is easier to practice the examples I show here. It will take me a little bit of time to get all of my code for past posts well-documented and readable, but I have uploa...

## Finding a bright object

November 30, 2012

Finally, to return to the challenge I laid out in the first of this series on image manipulation in R: can we do anything as cool in R as can be done in Mathematica? Like, for example, this illustration of how to search images of the surface of Mars...

## edply: combining plyr and expand.grid

November 30, 2012

Here’s a code snippet I thought I’d share. Very often I find myself checking the output of a function f(a,b) for a lot of different values of a and b, which I then need to plot somehow. An example: here’s a function that computes the value of a sinusoidal function on a grid of points,
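
The pattern described here, evaluating f(a, b) over every combination of inputs, can be sketched in base R with `expand.grid` and `mapply`. This is a plain-R stand-in rather than the plyr-based `edply` from the post, and the function `f` and the input values are hypothetical.

```r
# Evaluate a two-argument function over every combination of its inputs.
# Hypothetical function standing in for f(a, b).
f <- function(a, b) sin(a * b)

# expand.grid() builds one row per (a, b) combination.
grid <- expand.grid(a = c(1, 2, 3), b = c(0, 0.5, 1))

# mapply() calls f once per row; the result is stored next to its inputs.
grid$value <- mapply(f, grid$a, grid$b)

head(grid)
```

Keeping inputs and outputs in one data frame means the result can be passed straight to a plotting function.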

## Another Way to Access R from Python – PypeR

November 29, 2012

Unlike RPy2, PypeR provides another simple way to access R from Python through pipes (http://www.jstatsoft.org/v35/c02/paper). This handy feature enables data analysts to do the data munging with Python and the statistical analysis with R by passing objects interactively between the two computing systems. Below is a simple demonstration of how to call R from within Python

## Earthquakes Over the Past 7 Days

November 29, 2012

This is a brief example using the maps package in R to highlight a source of data. This is real-time data and it comes from the U.S. Geological Survey. This shows the location of earthquakes with magnitude of at least 1.0 in the lower 48 states. library(maps) library(maptools) library(rgdal) eq = read.table(file="http://earthquake.usgs.gov/earthquakes/catalogs/eqs7day-M1.txt", fill=TRUE, sep=",", header=T) plot.new()

## 2012-11 Generating Animation Sequence Descriptions

November 29, 2012

This report describes the animaker package for generating descriptions of animation sequences. An animation sequence is composed by combining atomic animations in series to create sequence animations or in parallel to create track animations. Functions are provided for manipulating animation …

## The tools in an R package developer’s toolbox

November 29, 2012

Yihui Xie is the creator of several popular R packages, including knitr, animation and cranvas. In an interview with The Setup, he shares some of the software and hardware he uses in his day-to-day work, including (of course) R: For programming and data analysis, I primarily use R since I'm a statistician. I have created a bunch of R...

## Shiny is the new Cool

November 29, 2012

Several of you will probably have tried out the new Shiny package brought to the table by the RStudio guys. This is just what I have been looking for and to my mind could provide a quantum leap in the use of R. There have been other packages addressing the need for web user interactivity

## Sorting Within Lattice Graphics in R

November 29, 2012

Default: By default, lattice sorts the observations by the axis values, starting at the bottom left. For example, library(lattice) colors = c("#1B9E77", "#D95F02", "#7570B3") dotplot(rownames(mtcars) ~ mpg, data = mtcars, col = colors, pch = 1) produc...
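
One common way to change that order is to reorder the factor levels before plotting. The sketch below sorts the cars by `mpg` with base R's `reorder`; it is an illustration of the general technique and not necessarily the approach the post goes on to describe.

```r
# Order the cars by mpg instead of alphabetically before plotting.
car <- rownames(mtcars)

# reorder() sorts the factor levels by the mpg value attached to each car.
car_by_mpg <- reorder(factor(car), mtcars$mpg)

head(levels(car_by_mpg), 3)  # lowest-mpg cars come first

# Lattice draws factor levels from the bottom up, so the lowest mpg ends up at the bottom.
if (requireNamespace("lattice", quietly = TRUE)) {
  print(lattice::dotplot(car_by_mpg ~ mpg, data = mtcars, pch = 1))
}
```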

## bigglm on your big data set in open source R, it just works – similar as in SAS

In a recent post by Revolution Analytics (link & link) in which Revolution was benchmarking their closed source generalized linear model approach against SAS, Hadoop and open source R, they seemed to be pointing out that no 'easy' open source R solution exists for building a Poisson regression model on large datasets. This post is about...

## RStudio and Rcpp

November 29, 2012

Earlier this month a new version of the Rcpp package by Dirk Eddelbuettel and Romain François was released to CRAN, and today we’re excited to announce a new version of RStudio that integrates tightly with Rcpp. First, though, more about some exciting new features in Rcpp 0.10.1. This release includes Rcpp attributes, which are simple annotations that you add

## Save R objects, and other stuff

November 29, 2012

Yesterday, Christopher asked me how to store an R object, in order to save some time when working on a project. First, download the csv file for searches related to some keyword, via http://www.google.com/trends/, for instance “sunglasses”. Recall that csv files store tabular data (numbers and text) in plain-text form, with comma-separated values (which is where the term csv comes from). Even...
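
As a minimal sketch of storing and restoring an R object, the example below uses base R's `saveRDS` and `readRDS`. The data frame and the file location are placeholders, and the post itself may use a different mechanism such as `save` and `load`.

```r
# Cache an object on disk so an expensive step does not have to be repeated.
# The data frame below is a stand-in for the parsed search data.
searches <- data.frame(week = 1:3, interest = c(40, 55, 62))

cache_file <- file.path(tempdir(), "searches.rds")  # hypothetical file location

# saveRDS() writes a single R object; readRDS() restores it under any name.
saveRDS(searches, cache_file)
restored <- readRDS(cache_file)

# Check that the round trip preserved the data.
all.equal(searches, restored)
```

Unlike a csv file, the stored object keeps its column types and attributes, so nothing has to be re-parsed on the next run.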

## Confident package releases in R with crant

November 29, 2012

I recently released the new lambda.r package on CRAN for functional programming. This was my first new package in quite some …

## Hadley’s guide to high-performance R with Rcpp

November 28, 2012

Hadley Wickham has written a comprehensive tutorial for the Rcpp package, which makes it easy to create C++ code embedded in R programs. Hadley explains why you might want to do this in the introduction: Sometimes R code just isn't fast enough - you've used profiling to find the bottleneck, but there's simply no way to make the code...

## Hurricane Sandy Land Wind Speed and Kriging

November 28, 2012

NJ Hurricane Sandy Landfall Data: These data come from the National Climatic Data Center (NCDC). Using the above link will download all of the data collected by the NCDC on the day of Hurricane Sandy. The data can also be obtained directly from the source at http://cdo.ncdc.noaa.gov/qclcd/QCLCD. The purpose of this post is not a discussion

## So, What Are You? ..A Plant? ..An Animal? — Nope, I’m a Fungus!

November 28, 2012

Lately I had a list of about 1000 species names and I wanted to filter out only the plants as that is where I come from. I knew that Scott Chamberlain has put together the ritis package which obviously can do such things. However, I knew of ITIS before and was keen...

## Picking Lotto Numbers

November 28, 2012

There's not a lot you can do to increase your odds of winning the lottery tonight. With the PowerBall at \$500 million though, a lot of otherwise rational folks might be tempted into playing. For those of you newly tempted, it is important to remember a...
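
The arithmetic behind those odds is short enough to sketch in R. The game format used below (five white balls drawn from 59 plus one red ball drawn from 35) is an assumption of this example and is not stated in the excerpt.

```r
# Jackpot odds under an assumed format: 5 white balls from 59 plus 1 red ball from 35.
white_combinations <- choose(59, 5)  # ways to pick the white balls, order ignored
jackpot_tickets    <- white_combinations * 35
jackpot_tickets  # number of distinct tickets; one of them wins the jackpot

# Every combination is equally likely, so a random pick is as good as any other.
set.seed(2012)
pick <- list(white = sort(sample(1:59, 5)), red = sample(1:35, 1))
pick
```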