Web Development with R – an HD video tutorial of Jeroen Ooms's talk

February 3, 2010
By

Here is an HD version of a video tutorial on web development with R, a lecture given by Jeroen Ooms (the guy who made a web application for R’s ggplot2). The talk was given at the Bay Area UseR Group meeting on R-Powered Web Apps. You can also view the slides for his talk and (great) examples for stockplot, lme4, and ggplot2. Thanks...

Read more »

Predicting the Locations of ‘Emergency’ Ushahidi Reports in Port-au-Prince, and Implications for Crowdsourcing

February 2, 2010
By

Recently, Patrick Meier, PhD candidate at Tufts University and member of the Ushahidi Advisory Board, provided me with a dataset containing the first 72 hours of reports registered with Ushahidi in Port-au-Prince after the January 12th earthquake. First, a huge thank you to Patrick for providing me with this data and the opportunity to analyze

Read more »

In case you missed it: January roundup

February 2, 2010
By

In case you missed them, here are some articles from last month of particular interest to R users. This post linked to slides and video from a 30-minute "Introduction to R" talk I gave on January 28, with links to many useful R resources. This post brought news that R's creators Robert Gentleman and Ross Ihaka have jointly won...

Read more »

Survey: Share your thoughts about predictive models with Aberdeen Group

February 2, 2010
By

Analyst firm Aberdeen Group is conducting research into the use of predictive models in business with a 10-minute survey. It's focused mainly on businesses that are using (or plan to use) predictive models to forecast aspects of their business and the systems they have in place (or plan to put in place) to do so. If you're using predictive...

Read more »

The Power to … What did you say?

February 2, 2010
By

It was just about a year ago (January 6th, 2009, to be exact) that a New York Times article on R fueled the dispute over which statistical analysis tool is “the best”. One of the highlights of the article was a quote from SAS’ Anne H. Milley: “I think it addresses a niche market for high-end

Read more »

Ensemble Prediction

February 2, 2010
By

Weather is unpredictable. Small differences in initial conditions can develop into big differences in the pattern of circulation, in the timing and location of cyclones, rainfall etc. This is true no matter how good the initial observing system is. The approach taken by organisations such as ECMWF or NCEP is to re-run numerical forecast models

Read more »

Practical Implementation of Neural Network based Time Series (Stock) Prediction – PART 3

February 1, 2010
By

Ok, now that we have seen how well the perfect sine wave signal was learned, let's turn it up a notch and see how well the complex sine wave was learned. [Fig 1. Summary of Actual Vs. Predicted out of sample complex sine waveform] Uh oh. What happened, the...

Read more »

InfoWorld: SAS and SPSS rise to R opportunity

February 1, 2010
By

At InfoWorld's "Open Source" blog, Savio Rodrigues found R co-inventor Robert Gentleman's appointment to the REvolution Computing board "a great impetus for me to look at R again". He notes that both SAS and SPSS have recognized the opportunity presented by R: I suspect that SPSS and SAS made their individual decisions based on three factors. First, they likely...

Read more »

R Tutorial Series: Regression With Categorical Variables

February 1, 2010
By

Categorical predictors can be incorporated into regression analysis, provided that they are properly prepared and interpreted. This tutorial will explore how categorical variables can be handled in R. Tutorial Files: Before we begin, you may want to download the sample data (.csv) used in this tutorial. Be sure to right-click and save the file to your R working directory....

Read more »

Some Python Nooks and Crannies

January 31, 2010
By

I spent this weekend reading Learning Python (Second Edition for Python 2.3!) by Mark Lutz. Python is my favorite programming language, but my experience with it has been mostly anecdotal; I come up with my own solutions and functions and I Google whatever I do not know. I decided to spend a couple of days with this incredibly out-of-date...

Read more »

Rcpp 0.7.4

January 31, 2010
By

Yesterday, and about nine days after release 0.7.3 of Rcpp (a set of R / C++ interface classes), Romain and I released version 0.7.4. It has been uploaded to CRAN and Debian, and mirrors should already have new versions. As before, my local page is als...

Read more »

With With

January 31, 2010
By

No, that is not a typo in the title. In my programming I came across a solution that I thought was pretty cool. I have a function that basically takes two objects and passes the elements of the objects to another function as arguments. This is a pret...

Read more »

Congruential generators all are RANDUs!

January 30, 2010
By

In case you did not read all the slides of Régis Lebrun’s talk on pseudo-random generators I posted yesterday, one result of Marsaglia’s (from a 1968 PNAS paper) exposed my ignorance during Régis’ Big’MC seminar on Thursday. Marsaglia indeed showed that all multiplicative congruential generators produce points that lie on a series of hyperplanes whose number gets ridiculously

Read more »

Practical Implementation of Neural Network based time series (stock) prediction – PART 2

January 30, 2010
By

As a brief follow-up to the series, I want to take a moment to describe a bit about Weka, which is the machine learning tool that we will be using to implement the neural network. It is a fantastic open source Java-based tool that was developed at the...

Read more »

Mining Tuition Data for US Colleges and Universities, and a Tangent

January 30, 2010
By

I wrote this script for the UCLA Statistical Consulting Center. I don’t know all of the specifics, but one of our faculty members has this idea that we can help our paper, The Daily Bruin, with their graphics or something to that effect. I don’t quite understand because our paper has never really been big on graphics for data,...

Read more »

Practical Implementation of Neural Network based time series (stock) prediction – PART 1

January 29, 2010
By

This introduction is meant to help readers understand the basic concepts and practical implementation of neural nets applied to a financial time series. I will not go too deep into detail about the mathematics behind the neural net at the moment. ...

Read more »

Big’MC seminar

January 29, 2010
By

Two very interesting talks at the Big’MC seminar on Thursday:
– Phylogenetic models and MCMC methods for the reconstruction of language history, by Robin Ryder
– Uniform and non-uniform random generators, by Régis Lebrun
Both are on topics close to my interests: evolution of languages (I’ll be a philologist in another life!) and uniform random generators.

Read more »

R creators win prestigious Statistical Computing and Graphics Award

January 29, 2010
By

The American Statistical Association recently created a new, bi-annual award to recognize an individual or team for innovation in computing, software, or graphics that has had a great impact on statistical practice or research. The committee has just announced the winner (or in this case, joint winners) of the first award: Robert Gentleman and Ross Ihaka, for their work...

Read more »

Crayola crayon colors, 1949-present

January 29, 2010
By

Here's an example I featured in my list of 7 Awesome Things about R (awesome thing #3: graphics and data visualization). The Learning R blog features a reproduction of a graphic that recently appeared on Flowing Data. It shows the colors in a box of Crayola crayons: before 1949 there were only 8, but over the years additional colors...

Read more »

Looking for a Bayésien PhD

January 28, 2010
By

I just got this email (originally in French) looking for a Bayesian ready to work on algorithms: At the company Vekia, we are looking for a PhD in Bayesian statistics for a position in Lille, to be filled as soon as possible. Vekia is a retail software publisher founded in 2007 by two researchers (Pierre-Arnaud

Read more »

Introduction to R webinar today, slides available

January 28, 2010
By

Just a quick reminder that I'll be hosting an introductory webinar about R today, The R Project: Data Analysis and Statistical Graphics for the Enterprise. It's at 9AM Pacific, so you might still have time to register for the live session at the link below. Otherwise, if you didn't catch the live session, you can pick up the slides...

Read more »

Advanced Graphics in R

January 27, 2010
By

Each quarter the UCLA Statistical Consulting Center hosts minicourses twice per week in R and LaTeX. Tonight was my turn to present. I presented Advanced Graphics in R. This was the same presentation I gave at the LA R Users’ Group in August with a fellow consultant. She and I had trouble coming together to make one presentation, so we...

Read more »

From the “blogosphere”? Hardly.

January 27, 2010
By

I generally skip over “From the Blogosphere”, a (mostly) weekly summary of one or two blog posts in Nature’s “Authors” section (here is the latest). Why? Well, I’ve always suspected that the title is rather misleading. Now, I have the hard numbers to prove it. My feed reader contains an archive of 128 articles, dating back

Read more »

Re-mapping Massachusetts Special election results

January 27, 2010
By

I had previously posted maps showing the difference in major party vote share between the 2008 Presidential election and the 2010 special Senate election in Massachusetts. Colleagues and readers of the Revolutions blog had some very insightful criticisms of these maps, in particular that the color scale was over-stating the swing in voter sentiment. I’ve

Read more »

How to combine Google maps and data in R

January 27, 2010
By

Every good artist needs a canvas, and when it comes to displaying geographic data, placing those data in context -- on a map -- makes all the difference. A new package for R from Markus Loecher, RgoogleMaps, allows you to download a street or satellite map from Google simply by specifying the bounding latitude/longitude coordinates. (You need to sign...

Read more »

Bayesian courses in København

January 26, 2010
By

I received this announcement about two upcoming courses given in København by Andrew Lawson:
1) “An Introduction to Bayesian Disease Mapping”, a two-day course, April 12-13, 2010, University of Southern Denmark. This course is designed to provide an introduction to the area of Bayesian disease mapping in applications to Public Health and Epidemiology.
2) “Advanced Bayesian Disease Mapping” A

Read more »

What programmers should know about Statistics

January 26, 2010
By

Reader KW pointed me to this rant, or essay, from Ruby on Rails enfant terrible Zed Shaw on what computer programmers don't know about statistical analysis, but should. (Spoiler alert: a lot, apparently.) Perhaps surprisingly, building complex software systems often involves a lot of simulation, experimentation, and measurement for which statistical methods would be an asset. But according to Shaw,...

Read more »