Blog Archives

Scheduling R Tasks with Crontabs to Conserve Memory

September 3, 2013

One of R’s biggest pitfalls is that it eats up memory without letting it go. This can be a huge problem if you are running really big jobs, have a lot of tasks to run, or share your local computer or R server with multiple users. When I run huge jobs on my Mac, I

Read more »
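The idea in the title can be sketched briefly: a job that runs in its own Rscript process hands all of its memory back to the operating system when that process exits, which is what a cron entry gives you for each run. The snippet below is only a minimal illustration under that assumption, not necessarily the approach taken in the full post. The helper name `run_isolated` and the script path in the cron line are hypothetical.

```r
# Hypothetical helper: evaluate R code in a separate Rscript process.
# Whatever memory the child process used is released when it exits.
run_isolated <- function(code) {
  rscript <- file.path(R.home("bin"), "Rscript")
  system2(rscript, args = c("-e", shQuote(code)), stdout = TRUE)
}

# The child computes a result and prints it; the parent captures the output.
out <- run_isolated("cat(sum(1:10))")
print(out)

# A crontab entry doing the same for a whole script, nightly at 02:00
# (the path is a placeholder, not taken from the post):
# 0 2 * * * Rscript /path/to/job.R
```

Running each task as its own short-lived process trades a little startup time for a guarantee that nothing lingers in memory between tasks.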

Heatmapping Washington, DC Rental Price Changes using OpenStreetMaps

August 4, 2013

[Map: Percentage change of median price per square foot from July 2012 to July 2013]
[Map: Percentage change of median price from July 2012 to July 2013]
Last November I made a choropleth of median rental prices in the San Francisco Bay Area using data from my company, Kwelia. I have wanted to figure out how to

Read more »

Getting started with twitteR in R

June 13, 2013

I have been asked by a few people lately to help walk them through using the Twitter API in R, and I’ve always just directed them to the blog post I wrote last year during the US presidential debates, not knowing that Twitter had changed a few things. Having my interest piqued through a potential project at

Read more »

Tapping the FourSquare Trending Venues API with R

March 4, 2013

I came up with the following function to tap into the FourSquare trending venues API:

    library(RCurl)
    library(RJSONIO)

    # x: "lat,long" string, y: OAuth token, z: value for the API's v parameter
    foursquare <- function(x, y, z) {
      w <- paste("https://api.foursquare.com/v2/venues/trending?ll=", x,
                 "&radius=2000&oauth_token=", y, "&v=", z, sep = "")
      u <- getURL(w)
      test <- fromJSON(u)
      locationname <- ""
      lat <- ""
      long <- ""
      zip <- ""
      herenowcount <- ""
      likes <- ""
      # Collect one value per trending venue
      for (n in seq_along(test$response$venues)) {
        locationname[n] <- test$response$venues[[n]]$name
        lat[n] <- test$response$venues[[n]]$location$lat
        long[n] <- test$response$venues[[n]]$location$lng
        zip[n] <- test$response$venues[[n]]$location$postalCode
        herenowcount[n] <- test$response$venues[[n]]$hereNow$count
        likes[n] <- test$response$venues[[n]]$likes$count
      }
      xb <- as.data.frame(cbind(locationname, lat, long, zip, herenowcount, likes))
      # Record when the data was pulled
      xb$pulled <- date()
      return(xb)
    }

Read more »

UPDATE Multiple PostgreSQL Table Records in Parallel

February 27, 2013

Unfortunately, the RPostgreSQL package (and, I’m pretty sure, the packages for other SQL databases as well) doesn’t have a provision to UPDATE multiple records (say, a whole data.frame) at once, nor does it allow placeholders. That makes an UPDATE a one-row-at-a-time ordeal, so I built a workaround hack to do the job in parallel. The big problem

Read more »
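For comparison, here is a different technique from the parallel workaround described in the teaser: folding a whole data.frame into a single UPDATE by joining the target table against an inline VALUES list. This is only a sketch under stated assumptions. The helper `build_bulk_update`, the table `listings`, and its columns are hypothetical, and both columns are assumed to be numeric so that no quoting or escaping is needed.

```r
# Hypothetical helper: build one UPDATE statement that changes many rows by
# joining the target table against an inline VALUES list.
# Assumes numeric key and value columns, so no quoting is attempted.
build_bulk_update <- function(table, key, column, df) {
  stopifnot(is.numeric(df[[key]]), is.numeric(df[[column]]))
  rows <- paste0("(", df[[key]], ", ", df[[column]], ")")
  values <- paste(rows, collapse = ", ")
  sprintf(
    "UPDATE %s AS t SET %s = v.%s FROM (VALUES %s) AS v(%s, %s) WHERE t.%s = v.%s",
    table, column, column, values, key, column, key, key
  )
}

# Example data (made up for illustration)
prices <- data.frame(id = c(1, 2, 3), price = c(10.5, 20, 31.25))
sql <- build_bulk_update("listings", "id", "price", prices)
cat(sql, "\n")

# With an open RPostgreSQL connection `con`, the statement could then be sent
# in one round trip, for example:
# dbSendQuery(con, sql)   # not run here
```

For character columns, the values would have to be escaped properly before being pasted into the statement, which this sketch deliberately leaves out.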

Opening Large CSV Files in R

December 26, 2012

Before heading home for the holidays, I had a large data set (1.6 GB with over 1.25 million rows) with columns of text and integers pulled out of the company (Kwelia) database and put into a .csv file, since I was going to be offline a lot over the break. I tried opening the csv file

Read more »
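One common way to handle a file of this size, which may or may not be what the full post ends up doing, is to read it in chunks so that only a slice is in memory at a time. The sketch below uses base R only. The helper `read_csv_in_chunks` and the demo columns are hypothetical, and it assumes an unquoted header line and no fields containing embedded newlines.

```r
# Hypothetical helper: read a CSV in fixed-size chunks and hand each chunk
# to a callback, so the whole file never has to fit in memory at once.
read_csv_in_chunks <- function(path, chunk_rows, handle_chunk) {
  con <- file(path, open = "r")
  on.exit(close(con))
  # First line holds the column names (assumed unquoted)
  header <- strsplit(readLines(con, n = 1), ",", fixed = TRUE)[[1]]
  repeat {
    lines <- readLines(con, n = chunk_rows)
    if (length(lines) == 0) break  # end of file
    chunk <- read.csv(text = lines, header = FALSE, col.names = header,
                      stringsAsFactors = FALSE)
    handle_chunk(chunk)
  }
  invisible(NULL)
}

# Small demo file standing in for the large one
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:10, rent = seq(1000, 1900, by = 100)),
          tmp, row.names = FALSE, quote = FALSE)

# Accumulate summary statistics chunk by chunk
total_rows <- 0
total_rent <- 0
read_csv_in_chunks(tmp, chunk_rows = 4, function(chunk) {
  total_rows <<- total_rows + nrow(chunk)
  total_rent <<- total_rent + sum(chunk$rent)
})
```

The callback keeps only running totals, so memory use depends on `chunk_rows` rather than on the size of the file.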

Mapping Current Average Price Per Sqft for Rentals by Zip in San Fran

November 25, 2012

My company, Kwelia, is sitting on mountains of data, so I decided to try my hand at mapping. I have played around with JGR, but it’s just too buggy, at least on my Mac, so I went looking for alternatives and found a good write-up here. I decided on mapping prices per sqft

Read more »

Building a Simple Web App using R

November 13, 2012

I’ve been interested in building a web app using R for a while, but never put any time into it until I was informed of the Shiny package. It looked too easy, so I absolutely had to try it out. First you need to install the package from the command line:

    options(repos=c(RStudio="http://rstudio.org/_packages", getOption("repos")))
    install.packages("shiny")

Read more »

Top Facebook Posts During the US Presidential Debate

October 22, 2012

The following data was collected during the Presidential Debate on the 22nd of October by tapping into the Facebook social graph API using R. The top three posted links during the debate for each candidate are:

Obama:
#1 http://bit.ly/QCODJg
#2 http://bit.ly/RXstnm
#3 http://bit.ly/P8MmJ1

Romney:
#1 http://bit.ly/zDdsKf
#2 http://bit.ly/SjFbKx

Read more »

Twitter Analysis of the US Presidential Debate

October 17, 2012

The following are word clouds of tweets for each candidate from the October 16, 2012 debate; the bigger a word appears, the more often it was used in tweets: And the net-negative posts for each candidate: Please note that the bigger the word is in the word cloud the

Read more »