Articles by Christopher Bare

Lee Edlefsen on Big Data in R

December 3, 2014 | 0 Comments

Lee Edlefsen, Chief Scientist at Revolution Analytics, spoke about Big Data in R at the FHCRC a week or two back. He introduced the PEMA or parallel external memory algorithm. “Parallel external memory algorithms (PEMA's) allow solution of both ... [Read more...]

Linear Models

February 7, 2014 | 0 Comments

[Read more...]

Online class on Statistical Learning

January 24, 2014 | 0 Comments

Trevor Hastie and Robert Tibshirani are teaching an online class on Statistical Learning starting this week. The first week is introduction and overview, so it's not too late to join up. They've also published a new book, An Introduction to Statistical Learning, as a more accessible companion to their widely ... [Read more...]

Generate UUIDs in R

July 11, 2013 | 0 Comments

Here's a snippet of R to generate a Version 4 UUID. Dunno why there wouldn't be an official function for that in the standard libraries, but if there is, I couldn't find it. ## Version 4 UUIDs have the form: ## xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx ## where x is any hexadecimal digit and ## y is one of 8, 9, ... [Read more...]

The Dream 8 Challenges

June 25, 2013 | 0 Comments

The 8th iteration of the DREAM Challenges is underway. DREAM is something like the Kaggle of computational biology with an open science bent. Participating teams apply machine learning and statistical modeling methods to biological problems, competing to achieve the best predictive accuracy. This year's three challenges focus on reverse engineering ... [Read more...]

Shiny talk by Joe Cheng

May 29, 2013 | 0 Comments

Shiny is a framework for creating web applications with R. Joe Cheng of RStudio, Inc. presented on Shiny last evening in Zillow's offices 30 stories up in the former WaMu Center. Luckily, the talk was interesting enough to compete with the view ... [Read more...]

Data analysis class

February 7, 2013 | 0 Comments

I've been writing software to help others do data analysis for a number of years and at the same time trying to work up my nerve to try my own analysis. Why let other people have all the fun? So, when I saw that Jeffrey Leek, biostatistician at Johns Hopkins ... [Read more...]

R in the Cloud

December 6, 2012 | 0 Comments

I've been having some great fun parallelizing R code on Amazon's cloud. Now that things are chugging away nicely, it's time to document my foibles so I can remember not to fall into the same pits of despair again. The goal was to perform lots of trials of a randomized ... [Read more...]

Computing kook density in R

September 24, 2012 | 0 Comments

Do you ever see strange lights in the sky? Do you wonder what really goes on in Area 51? Would you like to use your R hacking skills to get to the bottom of the whole UFO conspiracy? Of course, you would! UFO data from infochimps is the focus of a ... [Read more...]

OO in R

September 13, 2012 | 0 Comments

"Is there a package for obfuscating code in #rstats?", someone asked. "The S4 object system?!" came the snarky reply. If you're smiling right now, you know that it wouldn't be funny if it weren't at least a little bit true. Options: S3, S4 or R5? There can be little doubt ... [Read more...]

Linear regression by gradient descent

July 26, 2012 | 0 Comments

In Andrew Ng's Machine Learning class, the first section demonstrates gradient descent by using it on a familiar problem, that of fitting a linear function to data. Let's start off by generating some bogus data with known characteristics. Let's make y just a noisy version of x. Let's also add 3 ... [Read more...]

Long-vector kludge in R

July 25, 2012 | 0 Comments

Just recently, I found out that R is limited to 32-bit integers, even on 64-bit hardware. Bummer, huh? As a consequence, the maximum size of a vector is 2^31-1. To be fair, dealing with numeric types across machine architectures is hard. A fixed repr... [Read more...]

Sage Bionetworks Synapse

April 27, 2012 | 0 Comments

Michael Kellen, Director of Technology at Sage Bionetworks, is trying to build a GitHub for science. It's called Synapse and Kellen described it in a talk at the Sage Bionetworks Commons Congress 2012, this past weekend: 'Synapse' Pilot for Building an... [Read more...]

International Open Data Hackathon

December 5, 2011 | 0 Comments

This past Saturday, I hung out at the Seattle branch of the International Open Data Hackathon. The event was hosted at the Pioneer Square office of Socrata, a small company that helps governments provide public open data. A pair of data analysts from Tableau were showing off a visualization for ... [Read more...]

Hipster programming languages

September 26, 2011 | 0 Comments

If you look at the programming languages that are popular these days, a few patterns emerge. I'm not talking about languages that have the most hits on the job sites. I'm talking about what the cool kids are coding in - the folks that hang out on hacke... [Read more...]

String functions in R

August 25, 2011 | 0 Comments

Here's a quick cheat-sheet on string manipulation functions in R, mostly cribbed from Quick-R's list of String Functions with a few additional links. substr(x, start=n1, stop=n2) grep(pattern,x, value=FALSE, ignore.case=FALSE, fixed=FALSE) gsub(pattern, replacement, x, ignore.case=FALSE, fixed=FALSE) gregexpr(pattern, ... [Read more...]

MySQL and R

August 15, 2011 | 0 Comments

Using MySQL with R is pretty easy, with RMySQL. Here are a few notes to keep me straight on a few things I always get snagged on. Typically, most folks are going to want to analyze data that's already in a MySQL database. Being a little bass-ackwards, I often want ... [Read more...]