129 search results for "web scraping"

R and foreign characters

January 25, 2013

Working with Russian characters can be mind-numbingly frustrating. This is true for R, as for other applications, so below I've written out my top five tricks for making Russian inputs work in R; I believe they should be transferable to most other languages....

SPARQL with R in less than 5 minutes

January 23, 2013

In this article we’ll get up and running on the Semantic Web in less than 5 minutes using SPARQL with R. We’ll begin with a brief introduction to the Semantic Web, then cover some simple steps for downloading and analyzing government data via a SPARQL query with the SPARQL R package. What is the Semantic...

Multiple Classification and Authorship of the Hebrew Bible

January 1, 2013

Sitting in my synagogue this past Saturday, I started thinking about the authorship analysis that I did using function word counts from texts authored by Shakespeare, Austen, etc. I started to wonder if I could do something similar with the …

Chocolate and nobel prize – a true story?

December 22, 2012

Few of us can resist chocolate, but the real question is: should we even try to resist it? The image is CC by Tasumi1968. As a dark chocolate addict I was relieved to see Messerli's ecological study on chocolate consumption and the...

Animated map of 2012 US election campaigning, with R and ffmpeg

October 28, 2012

Idea: see if I can mimic the idea behind Ben Schmidt’s lovely video of ocean shipping routes, and apply it to another dataset. But which? “Hmm… what’s another …

Tips on accessing data from various sources with R

October 3, 2012

Jeffrey Breen (the man behind the Twitter airline sentiment analysis example) recently posted a collection of slides with some great tips for accessing data from R. "Tapping the Data Deluge" includes information on: using the XLConnect package to read data from Excel spreadsheets; using the foreign package to read SPSS, SAS, Stata and dBase data files; using SQL queries...

R Helper Functions

September 25, 2012

If you do a lot of R programming, you probably have a list of R helper functions set aside in a script that you include on R startup or at the top of your code. In some cases helper functions add capabilities that aren’t otherwise available. In other cases, they replicate functionality that is available...

The R-Podcast Episode 10: Adventures in Data Munging Part 2

September 16, 2012

I’m happy to present episode 10 of the R-Podcast! Season 1 of the R-Podcast concludes with part 2 of my series on data munging, in which I discuss issues surrounding importing data sets contained in HTML tables. I share how I used the XML and RCurl packages to validate and import data from hockey-reference.com for

UseR 2012 highlights

June 20, 2012

The eighth annual R user conference, UseR! 2012, has come and gone — and what an event it was! I've been to five useR! conferences so far, and each one improves upon the last. This year's conference at Vanderbilt was the best so far: an outstanding location (my first visit to Nashville, a great city), excellent facilities (the lecture...

Visualizing the CRAN: Graphing Package Dependencies

May 17, 2012

I had been meaning to start toying with the igraph package for a while. So a few weeks ago (lay off, I'm busy), I decided to grab a bunch of CRAN data about package dependencies. The easiest way that I could think of to get this information was to just grab the HTML files for all the package descriptions and...
