Blog Archives

Applying an Operation to a List of Variables

October 14, 2013

Just a quick note on a short hack that I cobbled together this morning. I have an analysis where I need to perform the same set of operations on a list of variables. In order to do this in a compact and robust way, I wanted to write a loop that would run through the

Read more »

Top 250 Movies at IMDb

October 2, 2013

Some years ago I allowed myself to accept a challenge to read the Top 100 Novels of All Time (complete list here). This list was put together by Richard Lacayo and Lev Grossman at Time Magazine. To start with, I could tick off a number of books that I had already read. That left me with around 75

Read more »

Citations for using Stan?

September 23, 2013

Bob writes: If you have papers that have used Stan, we’d love to hear about it. We finally got some submissions, so we’re going to start a list on the web site for 2.0 in earnest. You can either mail them to the list, to me directly, or just update the issue (at least until

Read more »

Clustering Lightning Discharges to Identify Storms

September 13, 2013

A short talk that I gave at the LIGHTS 2013 Conference (Johannesburg, 12 September 2013). The slides are relatively devoid of text because I like the audience to hear the content rather than read it. The central message of the presentation is that clustering lightning discharges into storms is not a trivial task, but still

Read more »

Clustering the Words of William Shakespeare

September 10, 2013

In my previous post I used the tm package to do some simple text mining on the Complete Works of William Shakespeare. Today I am taking some of those results and using them to generate word clusters.

Preparing the Data

I will start with the Term Document Matrix (TDM) consisting of 71 words commonly used by

Read more »

Text Mining the Complete Works of William Shakespeare

September 5, 2013

I am starting a new project that will require some serious text mining. So, in the interests of bringing myself up to speed on the tm package, I thought I would apply it to the Complete Works of William Shakespeare and just see what falls out. The first order of business was getting my hands

Read more »

Presenting Conformance Statistics

August 27, 2013

A client came to me with some conformance data. She was having a hard time making sense of it in a spreadsheet. I had a look at a couple of ways of presenting it that would bring out the important points.

The Data

The data came as a spreadsheet with multiple sheets. Each of the

Read more »

The Wonders of foreach

August 25, 2013

Writing code from scratch to do parallel computations can be rather tricky. However, the packages providing parallel facilities in R make it remarkably easy. One such package is foreach. I am going to document my trail of discovery with foreach, which began some time ago, but has really come to fruition over the last few

Read more »

Fitting a Model by Maximum Likelihood

August 18, 2013

Maximum-Likelihood Estimation (MLE) is a statistical technique for estimating model parameters. It sets out to answer the question: what model parameters are most likely to characterise a given set of data? First you need to select a model for the data, and the model must have one or more (unknown) parameters. As the name

Read more »

Finding Correlations in Data with Uncertainty: Classical Solution

August 13, 2013

This post follows up on my previous one, as a result of an excellent suggestion from Andrej Spiess. The data are indeed very heteroscedastic! Andrej suggested that an alternative way to attack this problem would be to use weighted correlation, with the weights being the inverse of the measurement variance. Let’s look at the synthetic data first. This is

Read more »