RcppArmadillo 0.2.17

March 23, 2011

Another release (1.1.90) of Conrad Sanderson's wonderful Armadillo templated C++ library for linear algebra appeared yesterday. Consequently, a new release 0.2.17 of RcppArmadillo, our Rcpp-based integration into R, is now on CRAN mirrors. The...

Read more »

The Register profiles Revolution Analytics

March 23, 2011

Tech news site The Register has just published an in-depth profile of Revolution Analytics. It was great meeting the author Dan Olds at Revolution HQ a couple of weeks ago, and sharing with him why we think the R language is the way forward for data science: modern, applied, large-scale statistical analysis. He captures that sentiment perfectly in the...

Read more »

Graphical Display of R Package Dependencies

March 23, 2011

In some work that I am currently involved in, we have to decide which GUI engine we should use. As an obvious starter, we decided to have a look at what other people are using in their packages. While CRAN helpfully displays all the R packages that are available, it doesn’t (I don’t think) give

Read more »

Downloading S&P 500 Data to R

March 23, 2011

The cornerstone of your analysis and quantitative trading algorithms is data. There are lots of different ways to get it into R (depending on what your investment instruments are). Today I am going to download data from finance.yahoo which are stock ...

Read more »

Getting into shape for the sport of data science: Screencast of talk by Jeremy Howard at Melbourne R Users

March 23, 2011

Jeremy Howard gave a talk at the Melbourne R User Group on 16th March 2011. Jeremy provided tips on how to successfully compete in data mining competitions. He showed how he combines R with other tools to build predictive models. …

Read more »

Applied R: Manual for the quantitative social scientist

March 23, 2011

Applied R for the quantitative social scientist is a manual on R written specifically as an introduction for the quantitative social scientist. In my opinion, R-Project is a magnificent statistical program, ready to be accepted and implemented in the social sciences. The flexibility of this program and the way data are handled give the user a sense of closeness...

Read more »

sab-R-metrics Sidetrack: Bubble Plots

March 22, 2011

While I had mentioned in my last post that I will cover logistic regression in my next post, I decided that a quick interlude in working with bubble plots would be fun. Bubble plots have become pretty popular recently, especially with all of the Visualization Challenges I've seen around the internet (by the way, I...

Read more »

R again in Google Summer of Code

March 22, 2011

I'm a big fan of the Google Summer of Code.  It brings great projects together with a learning opportunity for students.  Once again the R Project was selected to be part of the Google Summer of Code in 2011.  Some other notable mathemat...

Read more »

Where the heck has JD been?

March 22, 2011

It’s been pointed out to me that I haven’t had any blog posts in a while. It’s true. I’m fairly slack. But in the last few months I’ve changed jobs (same firm, new role), written an R abstraction on top of Hadoop, been to China, and managed to stay married. While that sounds pretty awesome,

Read more »

Code: extended model support for mtable

March 22, 2011

I finally got around to organizing and packaging my complete set of extended model support for mtable in Martin Elff’s memisc library. Here is a list of the models supported: coxph, survreg – Cox proportional hazards models and parametric survival …

Read more »

More on R-Studio

March 22, 2011

Here's a link to keyboard shortcuts for the RStudio IDE. RStudio has replaced Emacs, Aquamacs, Tinn-R, Bluefish, and even Komodo Edit as my preferred R IDE/editor. http://gettinggeneticsdone.blogspot.com/2011/03/rstudio-keyboard-shortcut-reference-pdf.h...

Read more »

Analysis: R growth continues in popularity of data analysis software

March 22, 2011

Bob Muenchen, author of R for SAS and SPSS Users and co-author of R for Stata Users, has updated his in-depth analysis of the popularity of data analysis software. Determining "popularity" for software is a tricky task, but this analysis looks at several different metrics: mailing list traffic, blogs, search volumes, job listings and other such indirect methods, as...

Read more »

One bicycle for two

March 22, 2011

Robin showed me a mathematical puzzle today that reminded me of a story my grand-father used to tell. When he was young, he and his cousin were working in the same place and on Sundays they used to visit my great-grand-mother in another village. However, they only had one bicycle between them, so they would

Read more »

Free and Easy Currency Monitor in R

March 22, 2011

Certainly not the best way to keep up with currencies, but the increasingly important job of monitoring currencies can be free and easy in R using Federal Reserve FRED data.  Here is a template that can be adjusted to your favorite currencies with...

Read more »

A Short Side-by-side Comparison of the R and NumPy Array Types

March 22, 2011

Feature                         NumPy   R
contiguous (virtual) memory     ✔       ✔
'view' memory model             ✔       ✘
subset-assignment               ✔       ✔
vectorized operations           ✔       ✔
memory-mapping                  ✔       ✘*
broadcasting rules              ✔       ✔
index arrays                    ✔       ✔

This comparison is current as of R 2.13.0, NumPy version 1.4.1, and other web resources to date. Because this post was motivated by a

Read more »

Visualizing Missing Data

March 22, 2011

There are several graphics available for visualizing missing data. The following graphic was inspired by many sources. However, I wanted a version using ggplot2. What is visualized here is the percent missing for each variable in the PISA data across countries. The code will be available as part of the multilevelPSA package I am currently

Read more »

Comparison of UAH and GISS Time Series with Common Baseline

March 22, 2011

In this post I set both the UAH and GISS global temperature anomaly series to a common baseline period (1981-2010) and compare them. Even though the UAH series is satellite based and the GISS series is station based, the series exhibit striking …

Read more »
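
Setting two anomaly series to a common baseline, as the teaser above describes, usually means subtracting each series' own mean over the baseline period. The sketch below illustrates only that arithmetic. The function name `rebaseline`, the year-keyed dictionaries, and all of the numbers are made up for illustration (they are not UAH or GISS data), and a three-year window stands in for the 1981-2010 period named in the post.

```python
def rebaseline(anomalies, start, end):
    """Shift a {year: anomaly} series so its mean over [start, end] is zero."""
    base = [value for year, value in anomalies.items() if start <= year <= end]
    offset = sum(base) / len(base)
    return {year: value - offset for year, value in anomalies.items()}


# Hypothetical values for illustration only; not real temperature data.
series_a = {2008: 0.10, 2009: 0.20, 2010: 0.30, 2011: 0.25}
series_b = {2008: 0.50, 2009: 0.62, 2010: 0.68, 2011: 0.66}

# Both series now measure departures from the same reference period,
# so their values can be compared year by year.
a = rebaseline(series_a, 2008, 2010)
b = rebaseline(series_b, 2008, 2010)
for year in sorted(a):
    print(year, round(a[year], 3), round(b[year], 3), round(a[year] - b[year], 3))
```

Once both series share a baseline, any remaining gap between them reflects a difference in behaviour rather than a difference in reference period.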

data.table: an R package everyone should use

March 22, 2011

I’m not sure how I missed this package, but I am sure glad I’ve found it. The data.table package for R provides something of a reconceptualization of the standard data.frame object, though it remains (mostly) compatible with data.frame. The advantage …

Read more »

Correlation network

March 22, 2011

I came up with an idea to draw a correlation network to get a grasp of the relationships between a list of stocks. An alternative way to show a correlation matrix would be a heat map, which can have limitations with big matrices (>100). Unfortunately, the ggplot2 package doesn’t have an easy way to draw networks, so I was left

Read more »
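
One common way to turn a correlation matrix into a network is to treat each stock as a node and keep an edge only where the absolute correlation clears a threshold. The sketch below shows that step under stated assumptions: the tickers, the return values, the 0.5 threshold, and the helper names `pearson` and `correlation_edges` are all hypothetical, and the teaser does not say how the author builds or draws the graph.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_edges(returns, threshold=0.5):
    """List (ticker_a, ticker_b, r) for every pair with |r| >= threshold."""
    edges = []
    for a, b in combinations(sorted(returns), 2):
        r = pearson(returns[a], returns[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


# Hypothetical daily returns; BBB is exactly twice AAA, CCC is unrelated.
returns = {
    "AAA": [0.01, -0.02, 0.03, 0.00, -0.01],
    "BBB": [0.02, -0.04, 0.06, 0.00, -0.02],
    "CCC": [0.01, 0.01, -0.01, 0.02, -0.03],
}

edges = correlation_edges(returns)
for a, b, r in edges:
    print(a, b, round(r, 3))
```

The resulting edge list is the input a graph-drawing tool would need; the threshold controls how dense the drawn network becomes.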

Example 8.31: Choropleth maps

March 22, 2011

In our book, we show a simple example of a map (section 6.4.2) where we read the boundary files as data sets and use SAS and R to plot them. But both SAS and R have complex functionality for using pre-compiled map data. To demonstrate them, we'll sho...

Read more »

Machine Learning Ex2 – Linear Regression

March 22, 2011

Thanks to this post, I found OpenClassroom. In addition, thanks to Andrew Ng and his lectures, I took my first course in machine learning. These videos are quite easy to follow. Exercise 2 requires implementing the gradient descent algorithm to model data with linear regression.

Read more »
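
For readers who have not seen it, batch gradient descent for a one-variable linear regression can be sketched in a few lines. This is a minimal illustration, not the exercise's data or the post's solution: the function name `gradient_descent`, the learning rate, the iteration count, and the toy data are all assumptions chosen so the example converges.

```python
def gradient_descent(xs, ys, alpha=0.05, iterations=5000):
    """Fit y ~ theta0 + theta1 * x by batch gradient descent on squared error."""
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        # Residuals of the current fit
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of J = (1 / 2m) * sum(errors ** 2)
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Update both parameters from the same set of residuals
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


# Hypothetical data lying exactly on y = 1 + 2x
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1)
```

The learning rate `alpha` matters: too large and the updates diverge, too small and many more iterations are needed, which is why it is usually tuned per data set.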

JCGS 20th anniversary

March 22, 2011

For its 20th anniversary, JCGS offers free access to papers, including Andrew’s discussion paper Why tables are really much better than graphs. (Another serious ending for an April fool joke!) Incidentally (or rather coincidentally), I received today the great news that our Using parallel computation to improve Independent Metropolis-Hastings based estimation paper is accepted by

Read more »

Day #9 Using R in Knime nodes

March 22, 2011

First you need to create a workflow in Knime. This is what I used. I loaded in the Iris data, renamed the tables for further use in my scripts and showed a view, or first did an R snippet to show a view afterwards. Once this is done, make sure your R-B...

Read more »

Using R for Introductory Statistics 6, Simulations

March 21, 2011

R can easily generate random samples from a whole library of probability distributions. We might want to do this to gain insight into the distribution's shape and properties. A tricky aspect of statistics is that results like the central limit theorem come with caveats, such as "...for sufficiently large n...". Getting a feel for how...

Read more »
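
The "sufficiently large n" caveat mentioned in the teaser above is easy to explore by simulation. The sketch below draws repeated samples from a skewed exponential distribution and looks at how the sample means behave as n grows. The helper name `sample_means`, the choice of distribution, the seed, and the sample sizes are assumptions made for illustration; they are not taken from the post.

```python
import random
import statistics


def sample_means(n, reps, rng):
    """Return `reps` means, each computed from `n` exponential(rate=1) draws."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


rng = random.Random(42)  # fixed seed so repeated runs give the same numbers
for n in (2, 10, 100):
    means = sample_means(n, 2000, rng)
    # The central limit theorem suggests the means centre on 1 (the
    # distribution's mean) with a spread close to 1 / sqrt(n).
    print(
        n,
        round(statistics.fmean(means), 3),
        round(statistics.stdev(means), 3),
        round(n ** -0.5, 3),
    )
```

Plotting a histogram of `means` for each n would show the skew of the exponential fading as n increases, which is the kind of intuition the simulation approach is meant to build.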

Statistics forum

March 21, 2011

The ASA is launching a new blog called the Statistics Forum, managed by Andrew Gelman and to which I will periodically contribute items that may induce some amount of discussion within the community, like the first entry by Michael Lavine on testing. (Meaning I will double-post on the Og and on the Statistics Forum, if

Read more »

A 3D Version of R’s curve() Function

March 21, 2011

I like exploring the behavior of functions of a single variable using the curve() function in R. One thing that seems to be missing from R’s base functions is a tool for exploring functions of two variables. I asked for examples of such a function on Twitter today and didn’t get any answers, so I

Read more »