Monthly Archives: February 2013

2011 Census Open Atlas Project

February 5, 2013

This month has seen the release of the 2011 Census data for England and Wales at Output Area level. This offers the possibility to map various attributes about people and places for very small geographic areas. Output Areas represent the most detailed geography for which Census data are released and are the building blocks for many popular products...

Read more »

Tables from R into Word

February 5, 2013

A good-looking table matters! This tutorial shows how to create a neat table in Word by combining knitr and R Markdown. I'll be using my own function, htmlTable, from the Gmisc package. Background: Because most journals that I submit to want...

Read more »

Proposed techniques for communicating the amount of information contained in a statistical result

February 5, 2013

A couple of weeks ago, I posted about how much we can expect to learn about the state of the world on the basis of a statistical significance test. One way of framing this question is: if we’re trying to come to scientific conclusions on the basis of statistical results, how much can we update

Read more »

Next Kölner R User Meeting: 6 February 2013

February 5, 2013

Quick reminder: The next Cologne R user group meeting is scheduled for tomorrow, 6 February 2013. All details and the agenda are available on the KölnRUG Meetup site. Please sign up if you would like to come along. Notes from the last Cologne R user group meeting are available here. Thanks also to...

Read more »

Tracking Number of Historical Clusters in DOW 30 and S&P 500

February 4, 2013

In the Tracking Number of Historical Clusters post, I looked at how three different methods identified clusters across a universe of 10 major assets. Today, I want to share the impact of clustering on a larger universe. Below, I examine the historical time series of the number of clusters in the DOW 30 and

Read more »

Visualizing networks in R: arc diagrams and hive plots

February 4, 2013

Arc diagrams are an alternative way of representing two-dimensional graphs. Rather than scattering the nodes across the page and connecting them with straight edges, you can arrange the nodes along a one-dimensional axis and replace the straight edges with arcs between the nodes. While an arc diagram might not give as good a sense of the connections between the nodes...

Read more »

Convenience Sample, SRS, and Stratified Random Sample Compared

February 4, 2013

In class today we were discussing several types of survey sampling and we split into groups and did a little investigation. We were given a page of 100 rectangles with varying areas and took 3 samples of size 10. Our first was a convenience sample. We...

Read more »

Help needed with sample selection biases

February 4, 2013

We are searching for a graduate student to assist us on a very short assignment about sample selection biases and Heckman Probit models. The help is not needed for estimating the models, but instead for reviewing the scenarios where the use of such models is theoretically appropriate or otherwise. For instance, we are particularly interested in determining if Heck...

Read more »

Generating Labels for Supervised Text Classification using CAT and R

February 4, 2013

The explosion in the availability of text has opened new opportunities to exploit text as data for research. As Justin Grimmer and Brandon Stewart discuss in the above paper, there are a number of approaches to reducing human text to …

Read more »