# Monthly Archives: September 2013

## Generate and Retrieve Many Objects with Sequential Names

September 8, 2013

While coding ensemble methods in data mining with R, e.g. bagging, we often need to generate many data and model objects with sequential names. Below is a quick example of how to use the assign() function to generate many prediction objects on the fly and then retrieve these predictions with mget() to do the model averaging.
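
The excerpt stops before the post's own example, so here is a minimal sketch of the idea rather than the author's code. It assumes a bagged linear model on the built-in `mtcars` data; the model formula, the `pred_` prefix, and the number of models are all hypothetical choices made for illustration.

```r
# Hypothetical illustration: bagging a linear model on the built-in mtcars data.
set.seed(2013)
n_models <- 10

for (i in seq_len(n_models)) {
  # Draw a bootstrap sample of row indices (same size as the data, with replacement)
  idx <- sample(nrow(mtcars), replace = TRUE)
  fit <- lm(mpg ~ wt + hp, data = mtcars[idx, ])
  # Create pred_1, pred_2, ... in the current environment
  assign(paste0("pred_", i), predict(fit, newdata = mtcars))
}

# Retrieve all prediction vectors by name as a named list
pred_list <- mget(paste0("pred_", seq_len(n_models)))

# Model averaging: bind the predictions column-wise and average across models
avg_pred <- rowMeans(do.call(cbind, pred_list))
```

Storing the predictions in a list from the start would usually be simpler; the sequentially named objects are used here only because that is the technique the post describes.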

## Mixed models; Random Coefficients, part 1

September 8, 2013

Continuing with my exploration of mixed models, I am now at the first part of random coefficients: example 59.5 for proc mixed (page 5034 of the SAS/STAT 12.3 Manual). This means I skipped examples 59.3 (plotting the likelihood) and 59.4 (known G and R)...

## Rforecastio – Simple R Package To Access forecast.io Weather Data

September 8, 2013

It doesn’t get much better for me than when I can combine R and weather data in new ways. I’ve got something brewing with my Nest thermostat and needed to get some current wx readings plus forecast data. I could have chosen a number of different sources or APIs but I wanted to play with

## Maximum Likelihood Estimation and the Origin of Life

September 8, 2013

Maximum Likelihood Estimation (MLE) is a powerful tool in econometrics which allows for the consistent and asymptotically efficient estimation of parameters given a correct identification (in terms of distribution) of the random variable. It i...

## Visualizing optimization process

September 8, 2013

One approach to graph drawing is the application of so-called force-directed algorithms. In its simplest form, the idea is to lay out the nodes on a plane so that all edges in the graph have approximately equal length. This problem has very intuitive ...

## Linear regression from a contingency table

September 7, 2013

This morning, Benoit sent me an email about an exercise on linear regression that he found in an econometrics textbook. Consider the following dataset. Here, variable X denotes the income, and Y the expenses. The goal was to fit a linear regression (actually, in the email, it was mentioned that we should try to fit a heteroscedastic model, but let...

## Vectors, Looping, and Performance

September 7, 2013

Vectors are at the heart of R and represent a true convenience. Moreover, vectors are essential for good performance, especially when you are working with lots of data. We’ll explore these concepts in this posting. As a motivational example let’s generate a sequence of data from -3 to 3. We’ll also use each point as
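
The excerpt is cut off before it says what each point is used for, so the following is only a sketch of the loop-versus-vector contrast. The step size of 0.1 and the squaring operation are assumptions chosen for illustration and do not come from the post.

```r
# Generate a sequence from -3 to 3 (the 0.1 step is an assumed value)
x <- seq(-3, 3, by = 0.1)

# Loop version: preallocate the result, then fill it element by element
y_loop <- numeric(length(x))
for (i in seq_along(x)) {
  y_loop[i] <- x[i]^2  # squaring is a stand-in operation
}

# Vectorized version: a single expression operates on the whole vector
y_vec <- x^2

# Both approaches produce the same values
stopifnot(isTRUE(all.equal(y_loop, y_vec)))
```

On large vectors, wrapping each version in `system.time()` is a simple way to compare how long the two approaches take.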

## A bit of benchmarking with string distances

September 7, 2013

After my last post about the stringdist package, Zachary Mayer pointed out to me that the Levenshtein and Jaro-Winkler distances implemented in the RecordLinkage package are about two to three times faster. His benchmark compares randomly generated character strings …

## First post, and its a doozy!

September 7, 2013

Well, not really a doozy.  Just something nice and slow to get me going. So, seeing as I intend to post stuff about R along with the other things, I thought it best to understand how all those great R bloggers embed the highlighted R code into their WordPress blogs.  As it turns out, I

## Fearsome Engines, Part 1

September 7, 2013

Back in June I discovered pqR, Radford Neal’s fork of R designed to improve performance. Then in July, I heard about Tibco’s TERR, a C++ rewrite of the R engine suitable for the enterprise. At this point it dawned on me that R might end up like SQL, with many different implementations of a common