Blog Archives

Trying to reduce the memory overhead when using mclapply

November 14, 2013

I am currently trying to understand how to reduce the memory used by mclapply. This function is rather complicated and others have explained the differences versus parLapply (A_Skelton73, 2013; ...

Read more »

Creating your Jekyll-Bootstrap powered blog for R blogging

November 9, 2013

As you might have noticed, I recently decided to move Fellgernon Bit from Tumblr to GitHub. There are a couple of reasons why I made this change. I wanted a more professional-looking blog. There are not many R blogs on Tumblr, and well, long text posts are not really meant for Tumblr. Better code highlighting. I had enabled R code highlighting...

Read more »

ggplot Tutorial

June 21, 2013

I liked the following ggplot2 tutorial, which is featured in Gabriela de Queiroz’s blog, unbiasedestimator. The tutorial is neatly presented and I’m sure that it will be very helpful to anyone just getting started with ggplot2 before they jump into ggplot2: Elegant Graphics for Data Analysis by Hadley Wickham or R Graphics Cookbook by...

Read more »

useR2013 data analysis contest: data exploration

June 12, 2013

The useR2013 conference is organizing a data analysis contest; check the rules here. They have a package called useR2013DAC with two data sets: one from La Liga and the other from Formula 1. Once you download and install the package (available here), you can quickly explore the data using the following R commands: ## Load the...

Read more »

Reading an R file from GitHub


Let’s say that I want to read this R file from GitHub into R. The first thing you have to do is locate the raw file. You can do so by clicking on the Raw button in GitHub. In this case it’s https://raw.github.com/lcolladotor/ballgownR-devel/master/ballgownR/R/infoGene.R. One would think that using source() would work, but it doesn’t, as shown below: source("https://raw.github.com/lcolladotor/ballgownR-devel/master/ballgownR/R/infoGene.R") ##...

Read more »

Using plyr and doMC for quick and easy apply-family functions

April 26, 2013

A few weeks back I spent a short amount of time actually reading about what plyr (Wickham, 2011) does, and I was surprised. The whole idea behind plyr is very simple: extend the apply() family to make things easy. plyr has...

Read more »

Predicting who will win an NFL match at half time

March 23, 2013

It was great to have a little break, Spring break, although the weather didn’t feel like spring at all! During the early part of the break I worked on my final project for Jeff Leek’s data analysis class, which we call 140.753 here. Continuing my previous posts on the topic, this time I’ll share the results of my...

Read more »

And so begins English Composition I

March 21, 2013

The English Composition I: Achieving Expertise course (Comer, 2013) that I have been looking forward to started this week. I am not sure yet how long I will last, but I hope to enjoy it as much as I can. Plus, it should help me with my...

Read more »

FBit: GitHub repo for posts with R code for this blog

March 11, 2013

This is a test post since I want to improve upon Jeffrey Horner’s strategy for posting R code on Tumblr. The only minor improvement I wanted to try out is hosting the images directly on the web, because right now the images won’t show in RSS readers. I’m not doing anything new at all, just using the...

Read more »

Analyzing SimplyStatistics visits info

March 9, 2013

Recently we had to analyze data on the number of visits per day to SimplyStatistics.org. There were two goals: (1) estimate the fraction of visitors retained after a spike in the number of visitors, and (2) identify any factors that influence the fraction estimated in (1). For me it was a fun project in part because I like SimplyStatistics but also...

Read more »