EC2 AMI for scientific computing in Python and R

April 11, 2011 | Drew Conway

Like many people who crunch numbers frequently, I have increasingly been integrating Amazon’s cloud computing services into my daily workflow. In particular, I have been using their Elastic Compute Cloud (EC2) on a regular basis. The service is an excellent way to offload computationally intensive work from your laptop ... [Read more...]

Parallel computation [revised]

March 14, 2011 | xi'an

We have now completed our revision of the parallel computation paper and hope to send it to JCGS within a few days. As seen on the arXiv version, and given the very positive reviews we received, the changes are minor, mostly focusing on the explanation of the principle and on ...
[Read more...]

A quick look at #march11 / #saudi tweets

March 12, 2011 | mjbommar

Well, so much for that #march11 #Saudi day of rage. Whether it was really the "tempest in a teacup" that Prince Al-Waleed suggested on CNBC (video below, transcript here) or not, the oil complex and Saudi markets seem to have shrugged … Continue reading → [Read more...]

Software tools for data analysis – an overview

February 19, 2011 | Szilard

by Szilard Pafka Discussions on various software tools (C, C++, Perl, Python, Unix shell, R, Matlab, SAS, SPSS, Excel, databases, Hadoop etc.) used in data analysis. Szilard Pafka (founder and co-organizer of the Los Angeles R users group) presents an … Continue reading →
[Read more...]

R Bloggers: The Site I Wish Existed in 2007

February 19, 2011 | mjbommar

  My first experience with R was in 2007 as a sophomore in undergrad.  As part of a larger project on pricing day-ahead electricity futures, I wanted to cluster locational marginal price (LMP) data from the ISO-NE.  Something like k-means is easy … Continue reading → [Read more...]

Pre-processing text: R/tm vs. python/NLTK

February 16, 2011 | mjbommar

  Let’s say that you want to take a set of documents and apply a computational linguistic technique.  If your method is based on the bag-of-words model, you probably need to pre-process these documents first by segmenting, tokenizing, stripping, stopwording, and … Continue reading → [Read more...]
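
The pre-processing steps named in the excerpt above (tokenizing, stripping, stopwording) can be sketched in a few lines. This is a minimal illustration in plain Python, not the R/tm or NLTK code the post compares; the `STOPWORDS` set and the function names `preprocess` and `bag_of_words` are assumptions made up for this example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "is", "in"}


def preprocess(document):
    """Lowercase the text, tokenize on runs of letters, and drop stopwords."""
    # Keeping only letter runs also strips punctuation and digits.
    tokens = re.findall(r"[a-z]+", document.lower())
    return [token for token in tokens if token not in STOPWORDS]


def bag_of_words(document):
    """Count the remaining tokens, ignoring word order."""
    return Counter(preprocess(document))


if __name__ == "__main__":
    print(bag_of_words("The cat and the hat: a tale of a cat."))
```

A library such as NLTK would replace the regular expression and the hand-written stopword set with its own tokenizers and corpora; the overall shape of the pipeline stays the same.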

Parsing and plotting time series data

January 15, 2011 | csgillespie

This morning I came across a post which discusses the differences between scala, ruby and python when trying to analyse time series data. Essentially, there is a text file consisting of times in the format HH:MM and we want to get an idea of its distribution. Tom discusses how ...
[Read more...]
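
The task described in the excerpt above, reading times in HH:MM format and getting a rough picture of their distribution, might look like the following in Python. This is a hedged sketch, not the code from the post or from the article it responds to; the function names `hourly_counts` and `text_histogram` are invented for illustration, and the input is assumed to be one time per line.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(lines):
    """Count how many HH:MM timestamps fall in each hour of the day."""
    counts = Counter()
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        counts[datetime.strptime(text, "%H:%M").hour] += 1
    return counts


def text_histogram(counts):
    """Render the counts as one row of '#' marks per observed hour."""
    return "\n".join(f"{hour:02d} {'#' * counts[hour]}" for hour in sorted(counts))


if __name__ == "__main__":
    sample = ["09:15", "09:45", "13:05"]
    print(text_histogram(hourly_counts(sample)))
```

To run it against a file, pass an open file object to `hourly_counts`, since iterating over a file yields its lines.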

Julien on R shortcomings

September 8, 2010 | xi'an

Julien Cornebise posted a rather detailed set of comments (from Jasper!) that I thought was interesting and thought-provoking enough (!) to promote to a guest post. Here it is, then, to keep the debate rolling (with my only censoring being the removal of smileys!). (Please keep in mind that I do ... [Read more...]

Apologies and Style Guides

August 13, 2010 | Ryan

I have to say that it’s pretty exciting to watch your blog go from a few hits over its lifetime to getting almost 200 in a single day.  I am currently negotiating with Google over the purchase of this blog.  Or maybe not.  Again, thanks be to @revodavid for posting ... [Read more...]

Why R doesn’t suck

June 19, 2010 | Paul Butler

I first encountered the R programming language a few years ago when I needed to make some plots. Although I’ve used it occasionally since, I always considered it a sort of “Perl for statisticians” — a useful swiss-army knife with … Continue reading → [Read more...]

Connecting R and Python

May 7, 2010 | Matt Asher

There are a few ways to do this, but the only one that worked for me was to use Rserve and rconnect. In R, do this: install.packages("Rserve"); library(Rserve); Rserve(debug = FALSE, port = 6311, args = NULL). Then you can connect in Python very easily. Here is a test in ... [Read more...]

Be Careful Searching Python Dictionaries!

February 27, 2010 | Ryan

For my talk on High Performance Computing in R (which I had to reschedule due to a nasty stomach bug), I used Wikipedia linking data, an adjacency list of articles and the articles to which they link. This data was linked from DataWrangling and was originally created by Henry Haselgrove. ... [Read more...]

Python in Sweave document

February 9, 2010 | Matti Pastell

Lately I have been using a lot of Python for signal processing and I quite like SciPy. However, I have been missing something like Sweave, which is a great literate programming environment for R. Today I managed to look a bit more into it and found this hack on how to ... [Read more...]

Eight R Video Tutorials on VCASMO

February 4, 2010 | Ed Borasky

Download "Getting Started with the Social Media Analytics Research Toolkit" (pdf, 1.25 megabytes) Download the Social Media Analytics Research Toolkit Thanks to Drew Conway (@drewconway), a PhD student at New York University, there are now eight excell... [Read more...]

Top Five Open Source Projects of 2009

November 5, 2009 | Ed Borasky

Every year, I single out what I think are the Top Five open source projects. This year, there's only one hold-over from previous years, and it's likely that I'm just going to give it a Lifetime Achievement Award and pick five others next year. 5. NetBe... [Read more...]

R String processing

July 2, 2009 | Chris

Here's a little vignette of data munging using the regular expression facilities of R (aka the R-project for statistical computing). Let's say I have a vector of strings that looks like this: coords [1] "chromosome+:157470-158370" "chromosome+:1583...
[Read more...]
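
The post itself works in R, but the same kind of regular-expression munging can be sketched in Python for comparison. The example below uses only the one complete string visible in the excerpt; the pattern, the field names (`seq`, `strand`, `start`, `end`), and the function `parse_coord` are assumptions about how such a coordinate string might be split, not the author's code.

```python
import re

# Assumed layout: <sequence name><strand>:<start>-<end>
COORD_PATTERN = re.compile(
    r"^(?P<seq>\w+)(?P<strand>[+-]):(?P<start>\d+)-(?P<end>\d+)$"
)


def parse_coord(text):
    """Split a coordinate string into its named parts."""
    match = COORD_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognised coordinate string: {text!r}")
    return {
        "seq": match.group("seq"),
        "strand": match.group("strand"),
        "start": int(match.group("start")),
        "end": int(match.group("end")),
    }


if __name__ == "__main__":
    print(parse_coord("chromosome+:157470-158370"))
```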