## Changes in optimization performance of gcc over time

September 16, 2012

The SPEC benchmarks came out a year after the first release of gcc (in fact gcc was and still is one of the programs included in the benchmark). Compiling the SPEC programs using the gcc option -O2 (sometimes -O3) has always been the way to measure gcc performance, but after 25 years does this way

## The R-Podcast Episode 10: Adventures in Data Munging Part 2

September 16, 2012

I’m happy to present episode 10 of the R-Podcast! Season 1 of the R-Podcast concludes with part 2 of my series on data munging, in which I discuss issues surrounding importing data sets contained in HTML tables. I share how I used the XML and RCurl packages to validate and import data from hockey-reference.com for

## What’s the smallest amount you can’t make with 5 coins?

September 16, 2012

My amazing, awesome wife often comes up with little puzzles for our amazing children, and this one seemed destined to be solved in R. So, using up to 5 coins (1p, 2p, 5p, 10p, 20p and 50p), first she asked our kids whether they could make every val...
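
As a rough sketch of how this puzzle could be attacked in R (not necessarily the approach the post takes), one can build up the set of totals reachable with at most 5 coins and then look for the first gap. The variable names are illustrative assumptions.

```r
# Coin denominations in pence and the maximum number of coins allowed
coins <- c(1, 2, 5, 10, 20, 50)
max_coins <- 5

# Start with the empty selection (total 0) and add one more coin per pass.
# Keeping the old totals means "up to" max_coins coins, not "exactly".
reachable <- 0
for (i in seq_len(max_coins)) {
  reachable <- unique(c(reachable, as.vector(outer(reachable, coins, "+"))))
}

# The smallest positive amount that is not in the reachable set
candidates <- seq_len(max(reachable) + 1)
smallest_missing <- min(setdiff(candidates, reachable))
print(smallest_missing)
```

Because the set of totals is deduplicated on every pass, it stays small, so brute force is more than fast enough here.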

## New version of devtools: 0.8

September 16, 2012

We’re pleased to announce a new version of devtools, the package that makes R package development easy. The main features in this version are: A complete rewrite of the code loading system which simulates namespace loading much more accurately – this means using load_all is much closer to installing and loading the package. It also

## Confidence Regions for Regression Coefficients

September 16, 2012

Let’s consider the usual linear regression model, with the full set of assumptions: $$y = X\beta + \varepsilon; \quad \varepsilon \sim N(0, \sigma^2 I_n) \qquad (1)$$ where X is a non-random (n × k) matrix with full column rank. Recall that, under our usual set of assumptions...

## Football model

September 16, 2012

After reading Dutch football data (Eredivisie 2011-2012) and building a display of predictions, it is time to look at a few simple models to predict goals. To reiterate the data setup, each game played consists of two rows in the data frame. ...

## World Cup 2006 First Goal R Analysis

September 16, 2012

Quite a while ago my amazing wife asked me if it was possible to find the time of the first goal for the 2006 FIFA World Cup matches.  I was using R at the time and thought it was possible.  Here are the scripts I wrote to scrape the info fro...

## California High School Graduation and Dropout Rates

September 16, 2012

Abstract: The California Department of Education recently (June 2012) had a news release on the increase in high school (grades 9-12) graduation rates and decrease in dropout rates. The data used by the Department was from two cohorts (4-year periods) o...

## project-euler–problem 65

September 16, 2012

The square root of 2 can be written as an infinite continued fraction. $$\sqrt{2} = 1+\frac{1}{2+\frac{1}{2+\frac{1}{2+\frac{1}{2+\cdots}}}}$$ The infinite continued fraction can be written √2 = [1;(2)], where (2) indicates that 2 repeats ad infinitum. In a similar way, √23 = [4;(1,3,1,8)].
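
To make the continued fraction concrete, here is a minimal R sketch that computes the first few convergents of √2 from its partial quotients (1 followed by repeating 2s) using the standard recurrence p_k = a_k p_{k-1} + p_{k-2}, q_k = a_k q_{k-1} + q_{k-2}. The helper name `convergents` is an assumption made for illustration; it is not taken from the post.

```r
# Convergents p/q of a continued fraction with partial quotients a[1], a[2], ...
convergents <- function(a) {
  p_prev <- 1; q_prev <- 0     # seeds p_{-1}, q_{-1}
  p_prev2 <- 0; q_prev2 <- 1   # seeds p_{-2}, q_{-2}
  p <- numeric(length(a))
  q <- numeric(length(a))
  for (k in seq_along(a)) {
    p[k] <- a[k] * p_prev + p_prev2
    q[k] <- a[k] * q_prev + q_prev2
    p_prev2 <- p_prev; q_prev2 <- q_prev
    p_prev <- p[k]; q_prev <- q[k]
  }
  data.frame(p = p, q = q, value = p / q)
}

# First ten convergents of sqrt(2): 1, 3/2, 7/5, 17/12, ...
cf <- convergents(c(1, rep(2, 9)))
print(cf)
```

Doubles are exact only up to 2^53, so for long expansions with very large numerators an arbitrary-precision integer type would be needed instead.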

September 15, 2012

So tonight I wanted to download all my Facebook pictures. For some reason the zip file was corrupted each of the 3 times I downloaded, so I remembered that some time ago I was playing around with an R project named Facebook Data-Mining. The project is conveniently located at github and you can access it here. Looking

## Preferential attachment for network

September 15, 2012

I am currently taking the Networked Life course on Coursera.org offered by Professor Michael Kearns from the University of Pennsylvania. I have taken several courses, including machine learning and natural language processing, since the platf...

## N-Way ANOVA

September 15, 2012

N-Way ANOVA example: Two-way analysis of variance is where the rubber hits the road, so to speak. It extends the concepts of ANOVA from one factor to two factors. With two factors, there can be an interaction between them that should be tested. As one might expect

## An implementation of the Newton-Raphson algorithm in C/C++ and R

September 14, 2012

Today, we write a small piece of C/C++ code that implements the well-known Newton-Raphson algorithm (see Mathworld). We also provide the R code. Exercise: Find the unique root of the function  using the Newton-Raphson method. Notice that we choose a function … Continue reading →
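
For readers who want a feel for the method before clicking through, here is a minimal R sketch of Newton-Raphson iteration, x_{n+1} = x_n - f(x_n) / f'(x_n). The excerpt does not show the function used in the post, so the cubic below, the starting point and the helper name `newton_raphson` are all assumptions chosen for illustration.

```r
# Generic Newton-Raphson solver: f is the function, df its derivative
newton_raphson <- function(f, df, x0, tol = 1e-10, max_iter = 100) {
  x <- x0
  for (i in seq_len(max_iter)) {
    step <- f(x) / df(x)   # Newton step
    x <- x - step
    if (abs(step) < tol) return(x)
  }
  stop("Newton-Raphson did not converge")
}

# Hypothetical example function (NOT the one from the post)
f  <- function(x) x^3 - 2 * x - 5
df <- function(x) 3 * x^2 - 2

root <- newton_raphson(f, df, x0 = 2)
print(root)
```

The derivative is supplied by hand here. A numerical approximation could be substituted when no closed form is available, at some cost in accuracy.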

## Slightly-more-than-basic sentiment analysis

September 14, 2012

I became interested in sentiment analysis a few months ago as a matter of pure practicality. The company I work for does a lot of customer-satisfaction surveys. Respondents rate various aspects of our products, but they also have the opportunity to answer a bunch of open-ended questions in their own voices. That kind of information

## Getting into R, RCommander, JGR and Deducer

September 14, 2012

I've been meaning to post something about R for a while, but never got started, and now have a pile of things I'd like to post, so it's time to get started. I first started using R during my Master's dissertation, where I had to do some stats. I've ...

## Visualize complex data with subplots

September 14, 2012

Today's guest post comes from Garrett Grolemund, a software developer at RStudio — ed. I think of graphs as a type of visual summary for data. Yet I rarely see graphs used this way within visualizations. Consider tile plots. They group data into 2d bins and then summarize each group with a number. This approach is a go-to tool...

## Simulation metamodeling with constraints

September 14, 2012

Last week I posted about using simulation metamodeling to verify the results of an analytical solution of a model. After posting it, I realized that the solution presented there can be improved by using knowledge of simulation model struc...

## Mapping Bike Accidents in R

September 14, 2012

At last weekend’s Hack Ta Ville event here in Montreal, I joined up with some talented urban planners and web devs to realize Vélobstacles. The idea of the project is to crowd source information on cycling conditions around the city. As with any crowd sourcing project, we were faced with the problem of seeding the

## Great Circles, Black Holes, and Community Events Part 3 of 3

September 14, 2012

The second community event is the Soldier Hollow Junior Olympics (SoHo), again found in the Heber Valley area. Building upon the previous posts (part 1 and part 2), this one will show an event that has more people coming from greater distances. Take the bar charts for the number of participants and the cities they are...

## How-to: Construct petridish plots in R

September 14, 2012

Script for petridish layout in R: `library(igraph)`; `# create empty graph`; `g <- graph.empty(directed=FALSE)`; `node_id <- c(1:1000)`; `# …` Continue reading →

## Mid-September flotsam

September 14, 2012

This is one of those times of the year: struggling to keep the head above the water, roughly one month before the last lecture of the semester. On top of that, trying to squeeze trips, meetings and presentations in between while dealing … Continue reading →

## New book: “Modeling Psychophysical Data in R”

September 14, 2012

Ken Knoblauch wrote to inform me that Springer has just released a book he coauthored with Larry Maloney on statistical methods in psychophysics. The book is called “Modeling Psychophysical Data in R” and covers both classical psychophysical analyses (Signal Detection Theory) and more recent methods (e.g. Mixed Models). Ken was one of the first in

September 14, 2012

China just announced the Diaoyu Islands baselines yesterday (US EST). Take a look at their locations (click here for the Google map).

## OO in R

September 13, 2012

"Is there a package for obfuscating code in #rstats?", someone asked. "The S4 object system?!" came the snarky reply. If you're smiling right now, you know that it wouldn't be funny if it weren't at least a little bit true. Options: S3, S4 or R5? There can be little doubt that object oriented...

## Improved net stacked distribution graphs via ggplot2 trickery

September 13, 2012

Net stacked distribution graphs are a nice way of comparing data on a Likert scale. They strip out the neutral responses and center the remaining responses around the middle of the graph so you can quickly compare agreement and disagreement on different issues. Here we'll build on Jason Becker's work on doing this in ggplot2 -- it requires...

## A function to find the “Penultimax”

September 13, 2012

Penulti-what?  Let me explain: Today I had to iteratively go through each row of a donor history dataset and compare a donor’s maximum yearly donation total to the second highest yearly donation total.  In even more concrete terms, for each … Continue reading →
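
The row-wise comparison described above might look something like the following R sketch. The function body, the toy `donations` data frame and its column names are all assumptions for illustration; the post's own implementation may differ.

```r
# Second-highest value of a vector, ignoring NAs.
# Returns NA when fewer than two non-missing values exist.
penultimax <- function(x) {
  x <- sort(x[!is.na(x)], decreasing = TRUE)
  if (length(x) < 2) return(NA_real_)
  x[2]
}

# Hypothetical donor history: one row per donor, one column per year
donations <- data.frame(
  y2009 = c(100, 50, NA),
  y2010 = c(250, 50, 20),
  y2011 = c(75, 10, NA)
)
years <- c("y2009", "y2010", "y2011")

# Row-wise maximum and second-highest yearly totals
donations$max_gift   <- apply(donations[, years], 1, max, na.rm = TRUE)
donations$penultimax <- apply(donations[, years], 1, penultimax)
print(donations)
```

Note that ties count as separate values in this sketch, so a donor who gave the same top amount twice has a penultimax equal to the maximum.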