# Monthly Archives: February 2013

## Convenience Sample, SRS, and Stratified Random Sample Compared

February 4, 2013

In class today we discussed several types of survey sampling, then split into groups for a small investigation. We were given a page of 100 rectangles with varying areas and took three samples of size 10. Our first was a convenience sample. We...
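The comparison described in this teaser can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rectangle areas are made up, "take the first ten" stands in for a convenience sample, and the median split with equal allocation is one possible stratification, not necessarily the one used in class.

```python
import random
import statistics

rng = random.Random(2013)  # fixed seed so the sketch is repeatable

# Hypothetical stand-in for the page of 100 rectangles: made-up side lengths.
areas = [rng.randint(1, 10) * rng.randint(1, 10) for _ in range(100)]
true_mean = statistics.mean(areas)

def convenience_sample(population, n):
    """Stand-in for a convenience sample: just take the first n items."""
    return population[:n]

def simple_random_sample(population, n, rng):
    """Every subset of size n is equally likely (sampling without replacement)."""
    return rng.sample(population, n)

def stratified_estimate(population, n, rng):
    """Estimate the mean from two strata split at the median.

    The median split and drawing n // 2 per stratum are assumptions of this
    sketch. Each stratum mean is weighted by the stratum's population share.
    """
    cutoff = statistics.median(population)
    strata = [
        [a for a in population if a <= cutoff],
        [a for a in population if a > cutoff],
    ]
    estimate = 0.0
    for stratum in strata:
        if not stratum:
            continue  # nothing to draw from an empty stratum
        sample = rng.sample(stratum, min(n // 2, len(stratum)))
        estimate += statistics.mean(sample) * len(stratum) / len(population)
    return estimate

estimates = {
    "convenience": statistics.mean(convenience_sample(areas, 10)),
    "srs": statistics.mean(simple_random_sample(areas, 10, rng)),
    "stratified": stratified_estimate(areas, 10, rng),
}
for name, value in estimates.items():
    print(name, value, "vs population mean", true_mean)
```

Running the three estimators many times with different seeds, rather than once, gives a better picture of how much each one varies around the population mean.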

## Help needed with sample selection biases

February 4, 2013

We are searching for a graduate student to assist us on a very short assignment about sample selection biases and Heckman Probit models. We do not need help estimating the models; we need help reviewing the scenarios in which such models are, or are not, theoretically appropriate. For instance, we are particularly interested in determining if Heck...

## Generating Labels for Supervised Text Classification using CAT and R

February 4, 2013

The explosion in the availability of text has opened new opportunities to exploit text as data for research. As Justin Grimmer and Brandon Stewart discuss in the above paper, there are a number of approaches to reducing human text to …

## Landmine detection revisited; the inverse unicorn problem

February 4, 2013

A couple of weeks ago I wrote about an interesting idea to clear landmines using the power of the wind. A reader asked me to comment more on the value of using these wind-powered “Kafons” to do an initial assay of a suspected minefield, an idea I mentioned at the end of my video on the

## An infelicity with Value at Risk

February 4, 2013

More risk does not necessarily mean bigger Value at Risk. Previously, “The incoherence of risk coherence” suggested that the failure of Value at Risk (VaR) to be coherent is of little practical importance. Here we look at an attribute that is not a part of the definition of coherence yet is a desirable quality. Thought …
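One simple way to see that more risk need not mean a bigger VaR is a pair of made-up discrete loss distributions. The sketch below is an illustration only and not necessarily the case the post goes on to examine. The two positions, their probabilities, and the helper names `value_at_risk` and `expected_loss` are all hypothetical.

```python
def value_at_risk(outcomes, level):
    """Smallest loss x such that P(loss <= x) >= level.

    `outcomes` is a list of (loss, probability) pairs describing a
    discrete loss distribution.
    """
    ordered = sorted(outcomes)
    cumulative = 0.0
    for loss, prob in ordered:
        cumulative += prob
        if cumulative >= level:
            return loss
    return ordered[-1][0]  # guard against rounding in the probabilities

def expected_loss(outcomes):
    """Probability-weighted average loss."""
    return sum(loss * prob for loss, prob in outcomes)

# Two made-up positions: each loses nothing 96% of the time.
mild = [(0.0, 0.96), (1.0, 0.04)]      # small loss in the bad 4% of cases
severe = [(0.0, 0.96), (100.0, 0.04)]  # much larger loss in the bad 4%

for name, position in (("mild", mild), ("severe", severe)):
    print(name, value_at_risk(position, 0.95), expected_loss(position))
```

Both positions have the same 95% VaR because the bad outcome sits beyond the 95% quantile, even though the second position loses far more when things go wrong.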

## analyze the survey of income and program participation (sipp) with r

February 4, 2013

if the census bureau's budget was gutted and only one complex sample survey survived, pray it's the survey of income and program participation (sipp).  it's giant.  it's rich with variables.  it's monthly.  it follows households over three, four, now five year panels.  the congressional budget office uses it for their health insurance simulation.  analysts read that sipp has...

## Proposed techniques for communicating the amount of information contained in a statistical result

February 4, 2013

A couple of weeks ago, I posted about how much we can expect to learn about the state of the world on the basis of a statistical significance test. One way of framing this question is: if we’re trying to come to scientific conclusions on the basis of statistical results, how much can we update

## Data Visualization for Education

February 3, 2013

Recently I was invited to give a talk to two cohorts of Strategic Data Project fellows. I was asked to speak about using data visualization to help inform policy makers' decisions. At the same time, the group had a lot of variation in their int...

## A Grid Search for The Optimal Setting in Feed-Forward Neural Networks

February 3, 2013

The feed-forward neural network is a very powerful classification model in the machine learning context. Since the goodness-of-fit of a neural network is largely determined by the model complexity, it is very tempting for a modeler to over-parameterize the neural network by using too many hidden layers and/or hidden units. As pointed out by Brian

## Japanese Government Bonds (JGB) Total Return Series

February 3, 2013

In a follow-up to Yen and JGBs Short-Term vs Long Term and a series of posts on Japan, I thought the Bloomberg article "Japan Pension Fund’s Bonds Too Many If Abe Succeeds, Mitani Says" was particularly interesting. It is difficult to find a to...