2311 search results for "map"

Resampling data in Hadoop with RHadoop

February 27, 2013

On Revolution Analytics partner Cloudera's blog, Uri Laserson has posted an excellent guide to resampling from a large data set in Hadoop. Resampling is an important step in fitting ensemble models (including random forests and other bagging techniques), and Uri provides a step-by-step guide to implementing resampling methods using RHadoop. He provides the complete map-reduce code in the R...
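
For readers unfamiliar with the term, resampling for bagging usually means drawing a bootstrap sample: n draws with replacement from n observations. The sketch below is a minimal in-memory illustration in plain C++, not the map-reduce code from the linked guide. The function name `bootstrap_sample` is hypothetical, and it assumes the data fits in a single vector, which is exactly the assumption a Hadoop-scale data set breaks.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: draw one bootstrap sample (same size as the input,
// sampled uniformly with replacement). Assumes the data fits in memory.
std::vector<double> bootstrap_sample(const std::vector<double>& data,
                                     std::mt19937& rng) {
    assert(!data.empty());  // an empty input has nothing to draw from
    std::uniform_int_distribution<std::size_t> pick(0, data.size() - 1);

    std::vector<double> sample;
    sample.reserve(data.size());
    for (std::size_t i = 0; i < data.size(); ++i) {
        sample.push_back(data[pick(rng)]);  // the same index may be drawn again
    }
    return sample;
}
```

Each model in a bagged ensemble would be fitted to its own sample drawn this way.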

Read more »

Installing Pandoc from R (on Windows) – using the {installr} package

February 27, 2013

The R blogger Rolf Fredheim recently wrote a great piece called “Reproducible research with R, Knitr, Pandoc and Word”, where he advocates for Pandoc as an essential part of a reproducible research workflow in R, in helping to turn documents …

Read more »

Fast factor generation with Rcpp

February 27, 2013

Recall that factors are really just integer vectors with ‘levels’, i.e., character labels that get mapped to each integer in the vector. How can we take an arbitrary character, integer, numeric, or logical vector and coerce it to a factor with Rcpp? It’s actually quite easy with Rcpp sugar: #include <Rcpp.h> using namespace Rcpp; template <int RTYPE> IntegerVector fast_factor_template( const Vector<RTYPE>& x )...
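
The idea that a factor is a vector of integer codes plus a table of labels can be sketched in standard C++ without Rcpp. This is an illustration of the data structure only, not the post's `fast_factor_template`: the `Factor` struct and the `make_factor` and `label_at` names are hypothetical, levels are ordered by plain byte-wise comparison (R's ordering depends on the locale), and missing values are not handled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an R factor: integer codes plus their labels.
struct Factor {
    std::vector<int> codes;           // 1-based index into levels, as in R
    std::vector<std::string> levels;  // sorted unique labels
};

// Build a factor from a character vector: collect the sorted unique values
// as levels, then map every element to the position of its level.
Factor make_factor(const std::vector<std::string>& x) {
    Factor f;
    f.levels = x;
    std::sort(f.levels.begin(), f.levels.end());
    f.levels.erase(std::unique(f.levels.begin(), f.levels.end()),
                   f.levels.end());

    f.codes.reserve(x.size());
    for (const std::string& value : x) {
        // Binary search works because levels is sorted and contains value.
        auto pos = std::lower_bound(f.levels.begin(), f.levels.end(), value);
        f.codes.push_back(static_cast<int>(pos - f.levels.begin()) + 1);
    }
    return f;
}

// Recover the label behind the i-th element (0-based position in codes).
const std::string& label_at(const Factor& f, std::size_t i) {
    assert(i < f.codes.size());
    return f.levels[static_cast<std::size_t>(f.codes[i]) - 1];
}
```

With the input b, a, c, a, the levels come out as a, b, c and the codes as 2, 1, 3, 1.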

Read more »

New ways to Hadoop with R

February 26, 2013

Today, there are two main ways to use Hadoop with R and big data:
1. Use the open-source rmr package to write map-reduce tasks in R (running within the Hadoop cluster - great for data distillation!)
2. Import data from Hadoop to a server running Revolution R Enterprise, via HBase, ODBC (for high-performance Hadoop/SQL interfaces), or streaming data direct...

Read more »

R/ggplot2 tip: aes_string

February 25, 2013

I’m a big fan of ggplot2. Recently, I ran into a situation which called for a useful feature that I had not used previously: aes_string. Imagine that you have data consisting of observations for several variables – let’s say A, B, C – where each observation is from one of two groups – call them

Read more »

Reproducible research with R, Knitr, Pandoc and Word

February 25, 2013

Add references and a style sheet. Below I briefly outline why Pandoc is an essential part of my research workflow, and demonstrate how to seamlessly integrate it with a bibliographic system and code written in R to produce high-quality Word or PDF documents. I also...

Read more »

Ten Things the Emacs Social Science Starter Kit gives you

February 24, 2013

I recently made some updates to the Emacs Social Science Starter Kit. I maintain the SSSK for my own convenience, but other people have found it useful as well. By now there are a lot of little bits and pieces in the kit, so I thought it might be usefu...

Read more »

Dynamic community occupancy modeling with R and JAGS

February 24, 2013

This post is intended to provide a simple example of how to construct and make inferences on a multi-species multi-year occupancy model using R, JAGS, and the ‘rjags’ package. This is not intended to be a standalone tutorial on dynamic community occupancy modeling. Useful primary literature references include MacKenzie et al. (2002), Kery and Royle (2007), Royle and Kery...

Read more »

Earthquakes in Netherlands

February 24, 2013

In the Netherlands we have natural gas. Unfortunately, extracting this gas seems to cause some quakes. As quakes go, they are not strong. However, our buildings are not made to resist quakes (before 1986 they were unheard of), so there is some damage. It is now predicted they could get stronger and more frequent. This caused a bit of a...

Read more »