Posts Tagged ‘Python’

The foundations of Statistics: a simulation-based approach

July 11, 2011

“We have seen that a perfect correlation is perfectly linear, so an imperfect correlation will be ‘imperfectly linear’.” (page 128) This book has been written by two linguists, Shravan Vasishth and Michael Broe, in order to teach statistics “in areas that are traditionally not mathematically demanding” at a deeper level than traditional textbooks “without using

Read more »

EC2 AMI for scientific computing in Python and R

April 11, 2011

Like many people who crunch numbers frequently, I have increasingly been integrating Amazon’s cloud computing services into my daily workflow. In particular, I have been using their Elastic Compute Cloud (EC2) on a regular basis. The service is an excellent way to offload computationally intensive work from your laptop for literally pennies on the

Read more »

Parallel computation [revised]

March 14, 2011

We have now completed our revision of the parallel computation paper and hope to send it to JCGS within a few days. As seen on the arXiv version, and given the very positive reviews we received, the changes are minor, mostly focusing on the explanation of the principle and on the argument that it comes

Read more »

A quick look at #march11 / #saudi tweets

March 12, 2011

Well, so much for that #march11 #Saudi day of rage. Whether it was really the "tempest in a teacup" that Prince Al-Waleed suggested on CNBC (video below, transcript here) or not, the oil complex and Saudi markets seem to have shrugged …

Read more »

Software tools for data analysis – an overview

February 19, 2011

By Szilard Pafka. Discussions on various software tools (C, C++, Perl, Python, Unix shell, R, Matlab, SAS, SPSS, Excel, databases, Hadoop, etc.) used in data analysis. Szilard Pafka (founder and co-organizer of the Los Angeles R users group) presents an …

Read more »

R Bloggers: The Site I Wish Existed in 2007

February 19, 2011

My first experience with R was in 2007 as a sophomore in undergrad. As part of a larger project on pricing day-ahead electricity futures, I wanted to cluster locational marginal price (LMP) data from the ISO-NE. Something like k-means is easy …

Read more »

Pre-processing text: R/tm vs. python/NLTK

February 16, 2011

Let’s say that you want to take a set of documents and apply a computational linguistic technique. If your method is based on the bag-of-words model, you probably need to pre-process these documents first by segmenting, tokenizing, stripping, stopwording, and …

Read more »
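
For readers skimming this listing, the pre-processing steps named in the excerpt above (tokenizing, stripping, stopwording) can be sketched in plain Python. This is only a minimal illustration using the standard library, not the R/tm or NLTK code from the post itself; the `STOPWORDS` set and the `preprocess` and `bag_of_words` helpers are hypothetical names made up for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only;
# a real project would use a fuller list from a text-mining library.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "is", "in"}


def preprocess(text):
    """Lowercase the text, tokenize on runs of letters, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def bag_of_words(text):
    """Return a token -> count mapping for a single document."""
    return Counter(preprocess(text))


doc = "The cat and the hat: a tale of a cat."
bag = bag_of_words(doc)
print(bag.most_common())
```

Tokenizing on letter runs also strips punctuation and digits, which is a simplification; whether that is appropriate depends on the documents at hand.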

Parsing and plotting time series data

January 15, 2011

This morning I came across a post which discusses the differences between Scala, Ruby and Python when trying to analyse time series data. Essentially, there is a text file consisting of times in the format HH:MM and we want to get an idea of its distribution. Tom discusses how this would be a bit clunky

Read more »
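
The task described in the excerpt above (times in HH:MM format, one per line, summarized as a distribution) can be sketched roughly as follows. This is an illustrative guess at the shape of the problem, not the code from the post: the `hour_counts` helper and the `sample` data are hypothetical, and a list of strings stands in for the text file.

```python
from collections import Counter
from datetime import datetime


def hour_counts(lines):
    """Count how many HH:MM timestamps fall in each hour of the day."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        counts[datetime.strptime(line, "%H:%M").hour] += 1
    return counts


# Made-up sample standing in for the lines of the text file.
sample = ["09:15", "09:47", "13:05", "23:59", ""]
counts = hour_counts(sample)

# Crude text histogram: one '#' per observation in that hour.
for hour in sorted(counts):
    print(f"{hour:02d} {'#' * counts[hour]}")
```

Binning by hour is just one choice; a finer bin width or a proper plotting library would give a better picture of the distribution.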

R User Groups 2010-10-25 21:14:50

October 25, 2010

Videos from the October meeting “Text Mining with R” of the Los Angeles R users group: Rob Zinkov, “Text Mining with R”; Ryan Rosario, “Accessing R from Python using RPy2”.

Read more »

Julien on R shortcomings

September 8, 2010

Julien Cornebise posted a rather detailed set of comments (from Jasper!) that I thought was interesting and thought-provoking enough (!) to promote to a guest post. Here it is, then, to keep the debate rolling (with my only censoring being the removal of smileys!). (Please keep in mind that I do not endorse everything

Read more »