## Visualization Series: Using Scatterplots and Models to Understand the Diamond Market (so You Don’t Get Ripped Off)

January 19, 2014
My last post railed against the bad visualizations that people often use to plot quantitative data by groups, and pitted pie charts, bar charts and dot plots against each other for two visualization tasks. Dot plots came out on top. …

## Porn capital of the porn nation

January 10, 2014
The other day I was having a quick look at the newspapers and I stumbled on this article. Apparently, Pornhub (a website whose mission should be pretty clear) have analysed the data on their customers and found out that the town of Ware (Hertfords...

## The realized GARCH model

January 2, 2014
The last model added to the rugarch package dealt with the modelling of intraday volatility using a multiplicative component GARCH model. The newest addition is the realized GARCH model of Hansen, Huang and Shek (2012) (henceforth HHS2012) which relates the realized volatility measure to the latent volatility using a flexible representation with asymmetric dynamics. This

## Plotly Beta: Collaborative Plotting with R

December 16, 2013
(Guest post by Matt Sundquist on a lovely new service which is proactively supporting an API for R) The Plotly R graphing library allows you to create and share interactive, publication-quality plots in your browser. Plotly is also built for …

## 24 Days of R: Day 10

December 10, 2013
How often is someone nominated for an Academy Award? Who has been nominated most often? Is there a difference between leading and supporting roles? Important questions. To answer them, I'm making use of a list of Academy Award nominees and winners. I've obtained the data from aggdata.com, which has a few sets of free data.

## Conditional densities, on one single graph

December 5, 2013
With Stéphane Tufféry we’ve been working on credit scoring and we’ve been using the popular German credit dataset,

    > myVariableNames <- c("checking_status","duration","credit_history",
    + "purpose","credit_amount","savings","employment","installment_rate",
    + "personal_status","other_parties","residence_since","property_magnitude",
    + "age","other_payment_plans","housing","existing_credits","job",
    + "num_dependents","telephone","foreign_worker","class")
    > credit = read.table(
    + "http://archive.ics.uci.edu/ml/machine-learning-databases/statlog/german/german.data",
    + header=FALSE,col.names=myVariableNames)
    > credit$class <- credit$class-1

We wanted to get some nice code to produce a graph like the one below. Yesterday, Stéphane...

## Importance sampling schemes for evidence approximation in mixture models

November 26, 2013
Jeong Eun (Kate) Lee and I completed this paper, “Importance sampling schemes for evidence approximation in mixture models”, now posted on arXiv. (With the customary one-day lag for posting, making me bemoan the days of yore when arXiv would give a definitive arXiv number at the time of submission.) Kate came twice to Paris in the past

## Bootstrapping for Propensity Score Analysis

November 26, 2013
I am happy to announce that version 1.0 of the PSAboot package has been released to CRAN. This package implements bootstrapping for propensity score analysis. This deviates from typical implementations such as boot in that it allows for separate sampling specifications for treatment and control units. For example, in the case where the ratio of treatment-to-control units is...