Blog Archives

Conditioning and Grouping with Lattice Graphics

February 17, 2014

Conditioning and grouping are two important concepts in graphing that allow us to rapidly refine our understanding of data under consideration. Conditioning, in particular, allows us to view relationships across “panels” with common scales. Each panel contains a plot whose data is “conditional” upon records drawn from the category that supports that particular panel (an

Read more »
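
As a rough illustration of the conditioning and grouping idea described in the excerpt above, here is a minimal sketch using the built-in `mtcars` data. The choice of dataset and variables (`cyl` for conditioning, `am` for grouping) is an assumption made for illustration and is not taken from the original post.

```r
library(lattice)  # recommended package that ships with standard R installs

# Conditioning: one panel per cylinder count, all panels on common scales.
# Grouping: within each panel, points are distinguished by transmission type.
p <- xyplot(mpg ~ wt | factor(cyl),
            data = mtcars,
            groups = am,
            auto.key = TRUE,
            xlab = "Weight (1000 lbs)",
            ylab = "Miles per gallon")

print(p)  # lattice objects are drawn when printed
```

The `| factor(cyl)` part of the formula is what produces the panels; dropping it and keeping only `groups = am` would give a single plot with two point styles.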

RStudio starts to codefold markdown

September 16, 2013

RStudio is a great tool for working with R and R scripts. And Markdown is a great way to write even complex, reproducible documents in plain text. So they make a great combination. BUT: previously, when writing Markdown in RStudio, you had to write “----” after your headings to get it to codefold Markdown headings,

Read more »

Vectors, Looping, and Performance

September 7, 2013

Vectors are at the heart of R and represent a true convenience. Moreover, vectors are essential for good performance, especially when you are working with lots of data. We’ll explore these concepts in this post. As a motivational example, let’s generate a sequence of data from -3 to 3. We’ll also use each point as

Read more »
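
To make the vectors-versus-loops point in the excerpt above concrete, here is a small sketch. The step size of 0.1 and the squaring operation are assumptions chosen for illustration; the excerpt does not say what the original post computes at each point.

```r
# Sequence from -3 to 3; the step of 0.1 is an assumption for illustration.
x <- seq(-3, 3, by = 0.1)

# Loop version: preallocate the result, then fill it element by element.
y_loop <- numeric(length(x))
for (i in seq_along(x)) {
  y_loop[i] <- x[i]^2
}

# Vectorized version: one expression applied to the whole vector.
y_vec <- x^2

# Both approaches produce the same values.
stopifnot(isTRUE(all.equal(y_loop, y_vec)))
```

On a vector this small the difference in speed is negligible; wrapping each version in `system.time()` with a much longer `x` is one way to compare them.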

Omni test for statistical significance

May 9, 2013

In survey research, our datasets nearly always comprise variables with mixed measurement levels – in particular, nominal, ordinal and continuous, or in R-speak, unordered factors, ordered factors and numeric variables. Sometimes it is useful to be able to do blanket tests of one set of variables (possibly of mixed level) against another without having to

Read more »

Building a custom database of country time-series data using Quandl

May 8, 2013

Encouraged by this post, I had another look at Quandl for collecting datasets from different agencies. Right now I need to get data for four countries on a couple of dozen indicators. This graphic is just a quick example, with only two indicators, of what I am aiming to be able to do. The process

Read more »

Changing figure options mid-chunk (in a loop) using the pander package.

April 9, 2013

I have already written about changing figure options mid-chunk in reproducible research. This can be important, e.g. if you are looping through a dataset to produce a graphic for each variable but the figure width or height needs to depend on properties of the variables, e.g. if you are producing histograms and want the figures to

Read more »

GeoCoding, R, and The Rolling Stones – Part 2

March 20, 2013

Welcome to Part 2 of the GeoCoding, R, and the Rolling Stones blog. Let’s apply some of the things we learned in Part 1 to a practical, real-world example. Mapping the Stones – A Real Example The Rolling Stones have toured for many years. You can go to Wikipedia and see information on the

Read more »

GeoCoding, R, and The Rolling Stones – Part 1

March 20, 2013

In this article I discuss a general approach for Geocoding a location from within R, processing XML reports, and using R packages to create interactive maps. There are various ways to accomplish this, though using Google’s GeoCoding service is a good place to start. We’ll also talk a bit about the XML package that is

Read more »

knitr: Changing chunk options like fig.height programmatically, mid-chunk

February 22, 2013

knitr is a great tool for doing reproducible research. You can produce all kinds of output inside a single knitr chunk, e.g. you can write a loop to produce lots of figures or tables. The only catch is if you want your figures to have differing captions, heights, etc. (and usually you do). The standard

Read more »

Apply Yourself !

February 13, 2013

Hello. Welcome to my debut post! Check the About link to see what this blog intends to accomplish. In this article I discuss a general approach for dealing with the problem of splitting a data frame based on a grouping variable and then doing some more operations per group. A secondary goal is to

Read more »
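
The split-then-operate pattern mentioned in the excerpt above might look roughly like the following in base R. The use of `mtcars`, with `cyl` as the grouping variable and mean `mpg` as the per-group operation, is an assumption for illustration and not necessarily what the original post does.

```r
# Split the data frame into a list of data frames, one per group.
groups <- split(mtcars, mtcars$cyl)

# Apply an operation to each piece; here, the mean of one column.
mean_mpg <- sapply(groups, function(df) mean(df$mpg))

print(mean_mpg)  # named numeric vector, one entry per cylinder count
```

For a single summary like this, `tapply(mtcars$mpg, mtcars$cyl, mean)` gives the same numbers in one call; the explicit `split()` step is more useful when each group needs several operations.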