82 search results for "Web Scraping"

roll calls, ideal points, 112th Congress

June 29, 2011

Now that classes are over, I took a little time to update the scripts that refresh my analysis of Congressional roll calls in close to real time. Links appear at the top of the blog. As of about 15 minutes ago, we’re up to 77 non-unanimous roll calls in the 112th Senate.

Automating R Scripts on Amazon EC2

June 9, 2011

Overview: how to set up R on an EC2 instance running Ubuntu 11.04 (Natty Narwhal); how to set up an Apache Tomcat 6.0 web server and configure it with basic authentication so that we can view our output from R on a password …

Friday fun projects

May 14, 2011

What’s a “Friday fun project”? It’s a small computing project, perfect for a Friday afternoon, which serves the dual purpose of (1) keeping your programming/data analysis skills sharp and (2) providing a mental break from the grind of your day job. Ideally, the skills learned on the project are useful and transferable to your work

Further Adventures in Visualisation with ggplot2

April 25, 2011

So I previously took a look at some data of player performance from a computer game. In this post, I’m going to do some further visualisations using ggplot2. The data consists of different types of player character, different roles for those characters, and their overall damage output (the unit here is damage per second, or

Friday Function: setInternet2

April 15, 2011

Corporate IT networks are a pain for programmers. Ideally, when programming, you want the freedom to download, install and run any software that you want. Unfortunately, in the interests of security, many programmers find themselves a little restricted at the office. (I’m sure that many network admins will protest that the situation works both ways

Find NHL Players with 30 Goals and 100 PIM using R

April 2, 2011

Last week Jack Edwards pointed out that Milan Lucic was the first Bruins player in a few years to join the 30 Goal / 100 Penalty Minute club. It got me thinking about the other players who have accomplished …

NBA Analysis: Coming Soon!

March 21, 2011

I decided to spend a few hours this weekend writing the R code to scrape the individual statistics of NBA players (2010-11 only). I originally planned to write up a few NBA-related analyses, but a friend was visiting from out …

R Screen Scraping: 105 Counties of Election Data

February 18, 2011

by Earl F. Glynn, Kansas Watchdog. The goal of this article is to show how to visit 105 web pages programmatically and “scrape” data from them to form a statewide summary of election data in Kansas. An earlier article gave details of ...

Simple R Screen Scraping Example

February 18, 2011

by Earl F. Glynn, Kansas Watchdog. The goal of this exercise is to show how to “screen scrape” data from a web page using R. Additional articles will extend this example to scrape data from 105 Kansas county pages to form a statewide...

Clustering NHL Skaters

February 6, 2011

I have been sitting on this post for some time now and wanted to get it out there. The goal is simply to show how easy it is to pull live data from the web into R, massage it, and perform some analytics on it. I am not sure how useful this analysis really is