Introducing the BH package

January 31, 2013

Earlier today a new package BH arrived on CRAN. Over the years, Jay Emerson, Michael Kane and I had numerous discussions about a basic Boost infrastructure package providing Boost headers for other CRAN packages (and yes, we are talking packages usin...

Read more »

Flowchart: How to learn survey analysis with R

January 31, 2013

In a recent talk to the DC R User Group, Anthony Damico presented the following handy flowchart for learning to do survey analysis with R (actually, it's a pretty good flowchart for learning R for any application). Since the links in the flowchart are not clickable, here are the resources:
Learn R by watching two-minute videos on http://twotorials.com
Read the “Getting Started...

Read more »

Data analysis approaches to modeling changes in primary metabolism

January 31, 2013

Read more »

Taking Expectations to the Next Level

January 31, 2013

Higher Expectations
I came across this post on Thursday and found it to be quite interesting. Clearly rental prices vary according to where you live. That isn't too surprising. I started thinking a bit more about it and thought that Boston and the nearby communities would have to...

Read more »

Using R: writing a table with odd lines (again)

January 31, 2013

Let’s look at my gff track headers again. Why not do it with plyr instead? d_ply splits the data frame by the feature column and applies a nameless function that writes subsets to the file (and returns nothing, hence the “_” in the name). This isn’t shorter or necessarily better, but it appeals to me.

Read more »

Using Line Segments to Compare Values in R

January 31, 2013

Sometimes you want to create a graph that will allow the viewer to see in one glance:
The original value of a variable
The new value of the variable
The change between old and new
One method I like to use to do this is using geom_segment and geom_poin...

Read more »

Scatterplot Matrices

January 31, 2013

Scatterplot matrices are a great way to roughly determine if you have a linear correlation between multiple variables. This is particularly helpful in pinpointing specific variables that might have similar correlations to your genomic or proteomic data. If you already have data with multiple variables, load it up as described here. If not, no worries

Read more »

How to install packages on R + screenshots

January 31, 2013

Have no fear, the screenshots are here! (For the original tutorial, click here)
Method 1 (less typing)
Part 1: Getting the Package onto Your Computer
Open R via your preferred method (icon on desktop, Start Menu, dock, etc.)
Click “Packages” in the top menu, then click “Install package(s)”.
Choose a mirror that is closest to your geographical location.
Now

Read more »

Soup up your R environment: how to install packages

January 31, 2013

Today we are going to make additions to our R environment in a common process called installing packages. The transition won’t be as long, drastic, or emotional as an episode of Extreme Makeover: Home Edition, but it does add more capabilities to your R environment. A package is a bunch of code combined and distributed

Read more »

Using SPARQL Query Libraries to Generate Simple Linked Data API Wrappers

January 31, 2013

A handful of open Linked Data have appeared through my feeds in the last couple of days, including (via RBloggers) SPARQL with R in less than 5 minutes, which shows how to query US data.gov Linked Data and then Leigh Dodds’ Brief Review of the Land Registry Linked Data. I was going to post a

Read more »

Sorting Numeric Vectors in C++ and R

January 31, 2013

Consider the problem of sorting all elements of a given vector in ascending order. We can simply use the function std::sort from the C++ STL.

    #include <Rcpp.h>
    using namespace Rcpp;

    // [[Rcpp::export]]
    NumericVector stl_sort(NumericVector x) {
        NumericVector y = clone(x);
        std::sort(y.begin(), y.end());
        return y;
    }

    library(rbenchmark)
    set.seed(123)
    z <- rnorm(100000)
    x <- rnorm(100)
    # check that stl_sort is the same as sort
    stopifnot(all.equal(stl_sort(x), sort(x)))
    #...

Read more »
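
The copy-then-sort pattern in the excerpt above can also be tried without R. The following is a minimal sketch, assuming a plain std::vector<double> in place of Rcpp's NumericVector; the helper name sorted_copy is hypothetical and does not come from the original post.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: returns an ascending-sorted copy and leaves the
// input untouched (the copy plays the role of clone(x) in the Rcpp version).
std::vector<double> sorted_copy(const std::vector<double>& x) {
    std::vector<double> y = x;        // copy, so the caller's data is not modified
    std::sort(y.begin(), y.end());    // ascending order via operator< by default
    return y;
}
```

Calling sorted_copy on a vector holding 3.0, -1.5 and 2.0 should give back -1.5, 2.0, 3.0 while the original keeps its order. Sorting a copy rather than sorting in place mirrors the excerpt's choice to keep the function free of side effects on its argument.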

Using Boost via the new BH package

January 31, 2013

Earlier today the new BH package arrived on CRAN. Over the years, Jay Emerson, Michael Kane and I had numerous discussions about a basic Boost infrastructure package providing Boost headers for other CRAN packages. JJ and Romain chipped in as well, and Jay finally took the lead by first creating a repo on...

Read more »

repmis: misc. tools for reproducible research in R

January 30, 2013

I've started to put together an R package called repmis. It has miscellaneous tools for reproducible research with R. The idea behind the package is to collate commands that simplify some of the common R code used within knitr-type reproducible research papers. It's still very much in the early stages of development and has two commands: LoadandCite:...

Read more »

R installation + screenshots

January 30, 2013

Feeling faint of heart without photos depicting what to do? No worries, here they are.
Go to the R website and click “Download R” under “Getting Started”.
Choose a place to download R. Even though we’re on the limitless and borderless interweb, choosing a location close to you helps speed things up.
Choose which R package to download based

Read more »

R users: Be counted in Rexer’s 2013 Data Miner Survey

January 30, 2013

Since 2007, Rexer Analytics has been conducting periodic surveys to measure the analytic behaviors, views and preferences of data miners and analytic professionals. In the last survey, conducted in 2011, more than 1300 analysts shared information about the data analysis software packages they use. (The results of all Rexer surveys are available free to anyone who requests them.) In...

Read more »

RcppArmadillo 0.3.6.2

January 30, 2013

A new Armadillo version 3.6.2 came out yesterday, and the corresponding RcppArmadillo version is now on CRAN. Changes are mostly incremental: Changes in RcppArmadillo version 0.3.6.2 (2013-01-29) Upgraded to Armadillo release Version 3.6.2 ...

Read more »

Maximize Your Expectations!

January 30, 2013

A Problem
A major problem in secondary data analysis is that you didn't get to decide what data was collected. Let's say you were interested in how many times a student has read the Twilight books. Specifically, you want to know how effective the ads for...

Read more »

F1Stats – Visually Comparing Qualifying and Grid Positions with Race Classification

January 30, 2013

Following the roundabout tour of F1Stats – A Prequel to Getting Started With Rank Correlations, here’s a walk through of my attempt to replicate the first part of A Tale of Two

Read more »

R finals

January 30, 2013

On the morning I returned from Varanasi and the ISBA meeting there, I had to give my R final exam (along with three of my colleagues in Paris-Dauphine). This year, the R course was completely in English, exam included, which means I can post it here as it may attract more interest than the French

Read more »

Modeling Residential Electricity Usage with R

January 30, 2013

Wow, I can’t believe it has been 11 months since my last blog posting! The next series of postings will be related to the retail energy field. Residential power usage is satisfying to model as it can be forecast fairly accurately with the right inputs. Partly as a consequence of deregulation there is now more data available than...

Read more »

Regression on categorical variables

January 30, 2013

This morning, Stéphane asked me a tricky question about extracting coefficients from a regression with categorical explanatory variates. More precisely, he asked me if it was possible to store the coefficients in a nice table, with information on the variable and the modality (those two pieces of information being in two different columns). Here is some code I did to produce the...

Read more »

Approaching the Zero Bound – Bonds

January 30, 2013

As bonds approach the artificial zero bound, where do we go next, especially after the record-setting +30% in 2011? The rolling 250-day total return has rarely gone negative since the inception of the Vanguard Funds VBMFX and VUSTX. I am int...

Read more »

The magic empty bracket

January 30, 2013

I have been working with R for some time now, but once in a while, basic functions catch my eye that I was not aware of… For some project I wanted to transform a correlation matrix into a covariance matrix. Now, since cor2cov does not exist, I thought about “reversing” the cov2cor function (stats:::cov2cor). Inside

Read more »

Speed up for loops in R

January 30, 2013

Are your for loops too slow in R? Are loops that should take seconds actually taking hours? As I found out recently, how you structure your code can make a huge difference in execution times. Fortunately, making a few small changes to your code can speed up these loops by several orders of

Read more »

R’s range and loop behaviour: Zero, One, NULL

January 30, 2013

One of the most common patterns in programming languages is the ability to iterate over a given set (usually a vector) by using 'for' loops. In most modern scripting languages, range operations are a built-in data structure and trivial to use with 'for' lo...

Read more »

Building a package in RStudio is actually very easy

January 30, 2013

So, you’ve written some code and you use it routinely. Now you’ve written some code and you’d like to use version control to ensure that development continues in a robust fashion. You do that and you use Github or something so that not only are changes tracked, but the general public receives the benefit of

Read more »

The three-dots construct in R

January 30, 2013

There is a mechanism that allows variability in the arguments given to R functions. Technically it is an ellipsis, but it is more commonly called “…”, dots, dot-dot-dot or three-dots.
Basics
The three-dots allows:
an arbitrary number and variety of arguments
passing arguments on to other functions
Arbitrary arguments
The two prime cases are the c and list
The post The...

Read more »

A shiny app to display the human body map dataset

January 30, 2013

There was quite a lot of buzz around when the guys from RStudio launched Shiny, a new web framework for R that promises to “make it super simple for R users like you to turn analyses into interactive web applications …

Read more »

Using Boost’s foreach macro

January 30, 2013

Boost provides a macro, BOOST_FOREACH, that allows us to easily iterate over elements in a container, similar to what we might do in R with sapply. In particular, it frees us from having to deal with iterators as we do with std::for_each and std::transform. The macro is also compatible with the objects exposed by Rcpp. Side note: C++11 has introduced...

Read more »
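
To illustrate the iterator bookkeeping that the excerpt above says BOOST_FOREACH avoids, here is a standard-library-only sketch. It does not use Boost or Rcpp: as an assumption made for illustration, it contrasts std::transform with explicit iterators against a C++11 range-based for loop, used here as a stand-in for macro-style element iteration. The function names square, squares_transform and squares_range_for are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical element-wise operation used by both variants.
double square(double v) { return v * v; }

// Iterator style: the caller passes begin/end iterators and an output iterator.
std::vector<double> squares_transform(const std::vector<double>& x) {
    std::vector<double> out;
    out.reserve(x.size());
    std::transform(x.begin(), x.end(), std::back_inserter(out), square);
    return out;
}

// Element style: the loop names each element directly, with no visible iterators.
std::vector<double> squares_range_for(const std::vector<double>& x) {
    std::vector<double> out;
    out.reserve(x.size());
    for (double v : x) {
        out.push_back(square(v));
    }
    return out;
}
```

Both functions return the same result for any input; the difference is only in how much of the traversal machinery the caller has to spell out, which is the convenience the excerpt attributes to BOOST_FOREACH.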
