Posts Tagged ‘Python’

UCLA Statistics: Analyzing Thesis/Dissertation Lengths

September 29, 2010

As I am working on my dissertation and piecing together a mess of notes, code and output, I am wondering to myself “how long is this thing supposed to be?” I am definitely not into this to win the prize for longest dissertation. I just want to say my piece, make my point and move on. I’ve heard that...


Julien on R shortcomings

September 8, 2010

Julien Cornebise posted a rather detailed set of comments (from Jasper!) that I thought was interesting and thought-provoking enough (!) to promote to a guest post. Here it is, then, to keep the debate rolling (with my only censoring being the removal of smileys!). (Please keep in mind that I do not endorse everything


Using XML package vs. BeautifulSoup

August 31, 2010

A while back I posted something about scraping a webpage using the BeautifulSoup module in Python.  One of the comments to that post was by Larry — a blogger over at IEORTools — suggesting that I take a look at the XML library in R.  Given that one of the points of this blog is


Apologies and Style Guides

August 13, 2010

I have to say that it’s pretty exciting to watch your blog go from a few hits over its lifetime to getting almost 200 in a single day.  I am currently negotiating with Google over the purchase of this blog.  Or maybe not.  Again, thanks be to @revodavid for posting to the Revolution Analytics Blog.


Why R doesn’t suck

June 19, 2010

I first encountered the R programming language a few years ago when I needed to make some plots. Although I’ve used it occasionally since, I always considered it a sort of “Perl for statisticians”, a useful Swiss Army knife with …


Manhattan plots for SNP marker effects using ggplot2

June 17, 2010

At AIPL, we’ve been posting Manhattan plots of the marker effects for each breed-trait combination with each official release of our genomic predictions. For example, consider the plot of lifetime net merit for Holsteins from the April 2010 run: These …


Hitting the Big Data Ceiling in R

May 16, 2010

As a true R fan, I like to believe that R can do anything, no matter how big, how small or how complicated: there is some way to do it in R. I decided to approach my large, sparse matrix problem with this attitude. But here I sit a broken man.

There is no “native” big data support built into...


Connecting R and Python

May 7, 2010

There are a few ways to do this, but the only one that worked for me was to use Rserve and pyRserve's rconnect. In R, do this:

    install.packages("Rserve")
    library(Rserve)
    Rserve(debug = FALSE, port = 6311, args = NULL)

Then you can connect in Python very easily. Here is a test in Python:

    import pyRserve

    rcmd = pyRserve.rconnect(host='localhost', port=6311)
    print(rcmd('rnorm(100)'))
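For readers on a more recent pyRserve, the same round trip might look roughly like the sketch below. It assumes a release that exposes `pyRserve.connect()` and a connection object with `eval()` and `close()`. The `sample_from_r` helper and its `connect` parameter are hypothetical names introduced here, so the function can be tried with a stand-in connection when no Rserve instance is running.

```python
def sample_from_r(n, connect=None, host="localhost", port=6311):
    """Ask an Rserve instance for n standard-normal draws via rnorm()."""
    if connect is None:
        # Assumption: pyRserve is installed and provides connect();
        # the post itself uses the older rconnect() name.
        import pyRserve
        connect = pyRserve.connect
    conn = connect(host=host, port=port)
    try:
        # eval() sends an R expression as a string and returns the result.
        return conn.eval("rnorm(%d)" % n)
    finally:
        # Always release the socket, even if the R call fails.
        conn.close()
```

With Rserve listening on port 6311 (started from R as in the post), `print(sample_from_r(100))` would print the draws returned by R. Passing a different `connect` callable is only a convenience for testing.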


Social Media Analytics Research Toolkit (SMART@znmeb) Is Moving Into Private Beta

March 31, 2010

Download "Getting Started with the Social Media Analytics Research Toolkit" (PDF, 1.25 MB)
Download the Social Media Analytics Research Toolkit

My Social Media Analytics Research Toolkit is about to move into private beta. What's in the release?...


Be Careful Searching Python Dictionaries!

February 27, 2010

For my talk on High Performance Computing in R (which I had to reschedule due to a nasty stomach bug), I used Wikipedia linking data, an adjacency list of articles and the articles to which they link. This data was linked from DataWrangling and was originally created by Henry Haselgrove. The dataset is small on disk, but I needed...
