Monthly Archives: February 2012

GSoC Project #2 for 2012

February 23, 2012

In my prior post, I discussed the origins of the first GSoC project I posted this year. The second GSoC project I’ve proposed is around the writing and code of Attilio Meucci, an adjunct professor at Baruch College – CUNY and an excellent speaker (I saw him at the University of Chicago when he spoke

Read more »

Large-scale Inference

February 23, 2012

Large-scale Inference by Brad Efron is the first IMS Monograph in this new series, coordinated by David Cox and published by Cambridge University Press. Since I read this book immediately after Cox and Donnelly’s Principles of Applied Statistics, I was thinking of drawing a parallel between the two books. However, while neither of them can

Read more »

Pocketbook costs of software

February 23, 2012

I have always been provided SAS as part of my job, so I never really realized how much it cost. I’ve bought Stata before, and of course R. I recently found out how much a reasonable bundle of SAS modules along with base SAS costs per year per seat, at least under the GSA.

Read more »

PCA for NIR Spectra_part 002: "Score planes"

February 23, 2012

The idea of this post is to compare the score plots for the first 3 principal components obtained with the “svd” algorithm with the score plots from other chemometric software (Win ISI in this case). Previously I had exported the yarn spectra t...

Read more »

Prediction: the Lasso vs. just using the top 10 predictors

February 23, 2012

One incredibly popular tool for the analysis of high-dimensional data is the lasso. The lasso is commonly used in cases when you have many more predictors than independent samples (the n ≪ p problem). It is also often used in the context of predictio...

Read more »

Visualization in regression analysis

February 23, 2012

Visualization is key to success in regression analysis. This is one of the (many) reasons I am suspicious when I read an article with a quantitative (econometric) analysis without any graph. Consider for instance the following dataset, obtai...

Read more »

Example 9.21: The birthday "problem" re-examined

February 23, 2012

The so-called birthday paradox or birthday problem is simply the counterintuitive discovery that the probability of (at least) two people in a group sharing a birthday goes up surprisingly fast as the group size increases. If the group is only 23 peo...

Read more »

Gini index and Lorenz curve with R

February 23, 2012

You can do anything pretty easily with R, for instance, calculate concentration indexes such as the Gini index or display the Lorenz curve (dedicated to my students). Although I did not explain it during my lectures, calculating a Gini index or displaying the Lorenz curve can be done very easily with R. All you have

Read more »

Maps with R (III)

February 23, 2012

In my previous posts (1 and 2) I wrote about maps with complex legends but without any kind of interactivity. …

Read more »

another X’idated question

February 23, 2012

An X’idated reader of Monte Carlo Statistical Methods had trouble with our Example 3.13, the very one our academic book reviewer disliked so much as to “diverse a 2 star”. The issue is with computing the integral when f is the Student’s t(5) distribution density. In our book, we compare a few importance sampling solutions,

Read more »