Posts Tagged ‘ analytics ’

Visualising Activity Around a Twitter Hashtag or Search Term Using R

February 6, 2012
By
Visualising Activity Around a Twitter Hashtag or Search Term Using R

I think one of valid criticisms around a lot of the visualisations I post here and on my various #f1datajunkie blogs is that I often don’t post any explanatory context around the visualisations. This is partly a result of the way I use my blog posts in a selfish way to document the evolution...

Read more »

Presidents in Twitter

January 5, 2012
By
Presidents in Twitter

I saw the release of a new version of twitteR package a few weeks back and thought I should be testing the code I wrote some time ago but also do something interesting at the same time. Thus I came up with the idea of checking out how Presidents are do...

Read more »

Visualization of Prosper.com’s Loan Data Part I of II – Compare and Contrast with Lending Club

December 6, 2011
By
Visualization of Prosper.com’s Loan Data Part I of II – Compare and Contrast with Lending Club

Due to the positive feedback received on this post I thought I would re-create the analysis on another peer-to-peer lending dataset, courtesy of Prosper.com. You can access the Prosper Marketplace data via an API or by simply downloading XML files that are updated nightly http://www.prosper.com/tools/. If you are going to follow the route I...

Read more »

Mining Lending Club’s Goldmine of Loan Data Part I of II – Visualizations by State.

October 14, 2011
By
Mining Lending Club’s Goldmine of Loan Data Part I of II – Visualizations by State.

I have a few friends that keep bragging about their 14% annual returns by investing their money with Lending Club, a peer-to-peer lending service that cuts out the complexities and difficulties of getting approved for a loan through a bank. To give you an idea of the sheer amount of volume Lending Club has...

Read more »

Programmers Should Know R

August 6, 2011
By
Programmers Should Know R

Programmers should definitely know how to use R. I don’t mean they should switch from their current language to R, but they should think of R as a handy tool during development.Again and again I find myself working with Java code like the following. td.linenos { background-color: #f0f0f0; padding-right: 10px; } span.lineno { background-color:...

Read more »

hacking .gov shortened links

July 30, 2011
By
hacking .gov shortened links

This past Friday, the web portal to the US Federal government, USA.gov, organized hackathons across the US for programmers and data scientists to work with and analyze the data from their link-shortening service. It turns out that if you shorten a web link with bit.ly, the shortened link looks like 1.usa.gov/V6NpL (that one goes...

Read more »

Notes on Engineering Data Analysis (with R and ggplot2)

July 8, 2011
By
Notes on Engineering Data Analysis (with R and ggplot2)

Hadley Wickham gave a Google Tech Talk a couple weeks back titled Engineering Data Analysis (with R and ggplot2). These are my notes. The data analysis cycle is to iteratively transform, visualize and model. Leading into the cycle is data access an...

Read more »

Drawing heatmaps in R

June 24, 2011
By
Drawing heatmaps in R

A while back, while reading chapter 4 of Using R for Introductory Statistics, I fooled around with the mtcars dataset giving mechanical and performance properties of cars from the early 70's. Let's plot this data as a hierarchically clustered heatmap...

Read more »

making meat shares more efficient with R and Symphony

May 9, 2011
By
making meat shares more efficient with R and Symphony

In my previous post, I motivated a web application that would allow small-scale sustainable meat producers to sell directly to consumers using a meat share approach, using constrained optimization techniques to maximize utility for everyone involved. In this post, I’ll walk through some R code that I wrote to demonstrate the technique on a...

Read more »

intuitive visualizations of categorization for non-technical audiences

April 25, 2011
By
intuitive visualizations of categorization for non-technical audiences

For a project I’m working on at work, I’m building a predictive model that categorizes something (I can’t tell you what) into two bins. There is a default bin that 95% of the things belong to and a bin that the business cares a lot about, containing 5% of the things. Some readers may...

Read more »