## Change fonts in ggplot2, and create xkcd style graphs

February 17, 2013
Installing and changing fonts in your plots is now easy with the extrafont package. There is an excellent tutorial on the extrafont GitHub site; still, I will briefly demonstrate how it worked for me. First, install the package and load it. You can now install the desired system fonts (at the moment only TrueType fonts): The

## Temporal network model – Barabási-Albert model with the library igraph

February 17, 2013
I found a golden website: the blog of Esteban Moro. He uses R to work on networks. In particular, he has written some really nice code to make great videos of networks. This post is purely a copy of his code; I just changed a few arguments to change the colors and to build my own network. To...
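
As a rough illustration of the growth rule behind a Barabási-Albert network, here is a minimal base R sketch of preferential attachment. It is not the code from the post and it does not use igraph; `ba_edges` is a hypothetical helper, and the starting core, `n`, and `m` are assumptions chosen for the example.

```r
# Grow a graph by preferential attachment (base R only).
# ba_edges is a hypothetical helper for illustration, not an igraph function.
ba_edges <- function(n, m = 1, seed = 42) {
  stopifnot(m >= 1, n >= m + 2)
  set.seed(seed)
  # Start from a fully connected core of m + 1 nodes.
  core <- t(combn(m + 1, 2))
  from <- core[, 1]
  to <- core[, 2]
  for (new_node in (m + 2):n) {
    # Degree of every existing node: number of edge endpoints it owns.
    degree <- tabulate(c(from, to), nbins = new_node - 1)
    # Pick m distinct targets with probability proportional to degree.
    targets <- sample.int(new_node - 1, size = m, prob = degree)
    from <- c(from, rep(new_node, m))
    to <- c(to, targets)
  }
  data.frame(from = from, to = to)
}

edges <- ba_edges(n = 200, m = 2)
degree <- tabulate(c(edges$from, edges$to), nbins = 200)
```

Plotting a histogram of `degree` should show the heavy right tail typical of this model: a few early nodes collect many links while most nodes keep only `m`.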

## Run production, one team at a time

February 17, 2013
In a previous post, I used R to process data from the Lahman database to calculate index values that compare a team's run production to the league average for that year.  For the purpose of that exercise, I started the sequence at 1947, but for what follows I re-ran the code with the time period...
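
The index described above can be sketched roughly as follows. This is not the author's code: the column names `yearID`, `teamID`, `R` (runs), and `G` (games) follow the Lahman `Teams` table, but the three teams and their run totals are made-up toy values, and the exact index definition (team runs per game relative to league runs per game, times 100) is an assumption.

```r
# Toy stand-in for the Lahman Teams table; the values are invented.
teams <- data.frame(
  yearID = c(1947, 1947, 1947, 1948, 1948, 1948),
  teamID = c("AAA", "BBB", "CCC", "AAA", "BBB", "CCC"),
  R = c(700, 650, 600, 720, 640, 680),
  G = c(154, 154, 154, 154, 154, 154)
)

# Team runs per game.
teams$rpg <- teams$R / teams$G

# League runs per game for each season, repeated on every row of that season.
league_rpg <- ave(teams$R, teams$yearID, FUN = sum) /
  ave(teams$G, teams$yearID, FUN = sum)

# Index: 100 means the team scored exactly at the league rate that year.
teams$index <- 100 * teams$rpg / league_rpg
```

Subsetting `teams` by `teamID` then gives one index series per team, which is the "one team at a time" view.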

## Automatic spatial interpolation with R: the automap package

February 17, 2013
In the case of continuously collected data, e.g. observations from a monitoring network, spatial interpolation cannot be done manually. Instead, the interpolation should be done automatically. To achieve this goal, I developed the automap package. automap builds on…

## A look at strucchange and segmented

February 17, 2013
After last week's post it was commented that strucchange and segmented would be more suitable for my purpose. I had a look at both. Strucchange can find a jump in a time series, which was what I was looking for. In contrast segmented is more suitable f...

## Contribute to The R Journal with LyX/knitr

February 17, 2013
(This paragraph is pure rant; feel free to skip it) I have been looking forward to the one-column LaTeX style of The R Journal, and it has arrived eventually. Last time I mentioned "it does not make sense to sell the cooked shrimps"; actually there is ...

## Gist for previous posts

February 17, 2013
The more I use it, the more I understand the benefits and value of GitHub as a code-sharing resource. The gist found here is the R code for my posts on run scoring trends by league (found here, here, and here). I will continue to use GitHub for t...

## Interactive stage-structured population model

February 16, 2013
This is an example of interfacing R and shiny to allow users to explore a biological model often encountered in an introductory ecology class. We are interested in the growth of a population that is composed of multiple, discrete stages or age classes. Patrick H. Leslie provides an in-depth derivation of the model in his 1945 paper “On the...
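
The core of such a model is a matrix projection: the stage vector at time t + 1 is the Leslie matrix times the stage vector at time t. Here is a minimal sketch in base R, without the shiny interface. The fecundities, survival rates, and starting population are made-up values, and `project` is a hypothetical helper.

```r
# Leslie matrix for three age classes (illustrative values only).
# First row: fecundities; sub-diagonal: survival to the next class.
L <- matrix(c(0,   1.5, 2.0,
              0.5, 0,   0,
              0,   0.8, 0), nrow = 3, byrow = TRUE)

n0 <- c(100, 50, 20)  # assumed starting numbers in each class

# Project the population forward; column t + 1 holds the state after t steps.
project <- function(L, n0, steps) {
  out <- matrix(NA_real_, nrow = length(n0), ncol = steps + 1)
  out[, 1] <- n0
  for (t in seq_len(steps)) {
    out[, t + 1] <- L %*% out[, t]
  }
  out
}

pop <- project(L, n0, steps = 20)

# Long-run growth rate: the dominant eigenvalue of L.
lambda <- Re(eigen(L)$values[1])
```

A `lambda` above 1 means the population grows in the long run, and below 1 means it declines.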

## Finding outliers in numerical data

One of the topics emphasized in Exploring Data in Engineering, the Sciences and Medicine is the damage outliers can do to traditional data characterizations. Consequently, one of the procedures to be included in the ExploringData package is FindOutliers, described in this post. Given a vector of numeric values, this procedure supports four different methods for identifying possible outliers. Before...
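
The excerpt does not say which four methods FindOutliers supports, so the sketch below is not that procedure. It shows three common outlier rules in base R (the three-sigma rule, the Hampel identifier based on the median and MAD, and the boxplot rule) behind a hypothetical helper called `flag_outliers`. The threshold `t = 3` is an assumption.

```r
# flag_outliers is a hypothetical helper, not the ExploringData API.
# It returns a logical vector marking possible outliers in x.
flag_outliers <- function(x, method = c("sigma", "hampel", "boxplot"), t = 3) {
  method <- match.arg(method)
  x_ok <- x[!is.na(x)]
  switch(method,
    # Three-sigma rule: distance from the mean in standard deviations.
    sigma = abs(x - mean(x_ok)) > t * sd(x_ok),
    # Hampel identifier: distance from the median in scaled MAD units.
    hampel = abs(x - median(x_ok)) > t * mad(x_ok),
    # Boxplot rule: beyond 1.5 interquartile ranges from the quartiles.
    boxplot = {
      q <- quantile(x_ok, c(0.25, 0.75), names = FALSE)
      fence <- 1.5 * (q[2] - q[1])
      x < q[1] - fence | x > q[2] + fence
    }
  )
}

x <- c(1, 2, 3, 4, 5, 100)
sigma_flags <- flag_outliers(x, "sigma")
hampel_flags <- flag_outliers(x, "hampel")
boxplot_flags <- flag_outliers(x, "boxplot")
```

On this small vector the three-sigma rule flags nothing, because the value 100 inflates the mean and standard deviation it is compared against, while the two robust rules both flag it. That is one example of the damage outliers do to traditional characterizations.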

## Some of Excel’s Finance Functions in R

February 16, 2013
Last year I took a free online class on finance by Gautam Kaul. I recommend it, although there are other classes I cannot compare it to. The instructor took great effort in motivating the concepts, structuring the material, and enabling critical thinking / intuition. I believe this is an advantage of video...
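
For a flavor of what such functions look like, here is a minimal sketch of three time-value-of-money formulas in base R. The function names are hypothetical and are not taken from the post. One assumption to note: these return positive amounts, whereas Excel's FV, PV, and PMT use a cash-flow sign convention and return negative values for outflows.

```r
# Future value of a single amount after nper periods of compounding.
future_value <- function(rate, nper, pv) pv * (1 + rate)^nper

# Present value of a single amount received nper periods from now.
present_value <- function(rate, nper, fv) fv / (1 + rate)^nper

# Level end-of-period payment that repays a loan of pv over nper periods.
payment <- function(rate, nper, pv) pv * rate / (1 - (1 + rate)^(-nper))

fv_example <- future_value(rate = 0.10, nper = 2, pv = 100)
pmt_example <- payment(rate = 0.05 / 12, nper = 360, pv = 200000)
```

Here `fv_example` compounds 100 at 10% for two periods, and `pmt_example` is the monthly payment on a 30-year loan of 200,000 at a 5% nominal annual rate.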

## digest 0.6.3

February 16, 2013
digest version 0.6.3 is now on CRAN, and I'll upload the Debian package in a minute. This is a minor bug release regarding just the recently-added sha512 support. Turns out the wrong initial buffer size was used on the R side. Hannes fixed that with...

## Google Statistician uses R and other programming tools

February 16, 2013
A great interview on the Simply Statistics blog with Google's Nick Chamandy, PhD in Statistics. He explains that he mainly uses R, among other tools, in his work at Google. Also of note is the active data science community within Google ...

February 16, 2013
So my playing around with Haskell goes on. You can follow the progress of the little bootstrap exercise on github. Now it’s gotten to the point where it actually does a bootstrap interval for the mean of a sample. Consider the following R script: 10.31 2.5% 97.5% 9.72475 10.85200 So, that was a simple
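
The R script itself did not survive in this excerpt, so the following is only a sketch of what a percentile bootstrap interval for the mean can look like in base R. The sample, the seed, and the number of resamples are all assumptions, and the numbers it produces will differ from those quoted above.

```r
set.seed(1)

# An assumed sample; the data used in the post is not shown.
x <- rnorm(50, mean = 10, sd = 2)

# Resample with replacement and record the mean of each resample.
boot_means <- replicate(2000, mean(sample(x, replace = TRUE)))

# Percentile interval: the 2.5% and 97.5% quantiles of the bootstrap means.
ci <- quantile(boot_means, c(0.025, 0.975))

mean(x)
ci
```

The printed interval is labelled `2.5%` and `97.5%`, the same layout as the output fragment in the excerpt.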

## Market Filter Back Test Shiny web application

February 15, 2013
Today, I want to share the Market Filter Back Test application (code at GitHub). This is the fourth application in the series of examples (I plan to share 5 examples) that will demonstrate the amazing Shiny framework and Systematic Investor Toolbox to analyze stocks, make back-tests, and create summary reports. The motivation for this series

## Video: Data Mining with R

February 15, 2013
Yesterday's Introduction to R for Data Mining webinar was a record setter, with more than 2000 registrants and more than 700 attending the live session presented by Joe Rickert. If you missed it, I've embedded the video replay below, and Joe's slides (with links to many useful resources) are also available. During the webinar, Joe demoed several examples of...

## Incorporating Preference Construction into the Choice Modeling Process

February 15, 2013
Statistical modeling often begins with the response generation process because data analysis is a combination of mathematics and substantive theory. It is a theory of how things work that determines how we ought to collect and analyze...

## Clustering Loss Development Factors

February 15, 2013
Anytime I get a new hammer, I waste no time in trying to find something to bash with it. Prior to last year, I wouldn’t have known what a cluster was, other than the first half of a slang term used to describe a poor decision-making process. Now I’ve seen it in action a

## Zurich, Feb 2013 – Basic R Course

February 15, 2013

## New Data Scientist role at Lloyd’s

February 15, 2013
Lloyd's of London is looking for a Data Scientist as part of the Analysis team. See Lloyd's career web site for more details.

## FillIn: a function for filling in missing data in one data frame with info from another

February 15, 2013
Sometimes I want to use R to fill in values that are missing in one data frame with values from another. For example, I have data from the World Bank on government deficits. However, there are some country-years with missing data. I gathered data from ...
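
The idea can be sketched in base R as below. This is not the FillIn function from the post: `fill_from` is a hypothetical helper with an invented signature, the toy deficit figures are made up, and it assumes the key columns identify at most one row in the second data frame.

```r
# fill_from is a hypothetical helper: fill NAs in d1[[var]] with values
# from d2, matching rows on the key columns.
fill_from <- function(d1, d2, var, keys) {
  lookup <- d2[, c(keys, var)]
  names(lookup)[names(lookup) == var] <- ".fill"
  d1$.row <- seq_len(nrow(d1))  # remember the original row order
  merged <- merge(d1, lookup, by = keys, all.x = TRUE, sort = FALSE)
  merged <- merged[order(merged$.row), ]
  missing <- is.na(merged[[var]])
  merged[[var]][missing] <- merged$.fill[missing]
  merged$.fill <- NULL
  merged$.row <- NULL
  rownames(merged) <- NULL
  merged
}

# Toy country-year data with gaps (values are invented).
d1 <- data.frame(country = c("A", "A", "B", "B"),
                 year = c(2000, 2001, 2000, 2001),
                 deficit = c(1.5, NA, NA, 3.0))
# A second source that happens to cover the missing country-years.
d2 <- data.frame(country = c("A", "B"),
                 year = c(2001, 2000),
                 deficit = c(2.0, 2.5))

filled <- fill_from(d1, d2, var = "deficit", keys = c("country", "year"))
```

Values already present in `d1` are left alone; only the missing entries are taken from `d2`.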

## Sorting rows and columns in a matrix (with some music, and some magic)

February 14, 2013
This morning, I was working on some paper on inequality measures, and for computational reasons, I had to sort elements in a matrix. To make it simple, I had a rectangular matrix, like the one below, > set.seed(1) > u=sample(1:(nc*nl)) > (M1=matrix(u,nl,nc)) 7 5 11 23 6 17 9 18 1 21...
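
Since the excerpt is cut off, here is one plausible reading of the task as a base R sketch: sort the elements within each row, or within each column, of a matrix. The dimensions `nl` and `nc` below are assumptions (the excerpt does not show them), so the matrix will not match the one printed above.

```r
set.seed(1)
nl <- 5  # number of rows (assumed)
nc <- 4  # number of columns (assumed)
M <- matrix(sample(1:(nc * nl)), nl, nc)

# Sort within each column: apply() over columns returns sorted columns directly.
cols_sorted <- apply(M, 2, sort)

# Sort within each row: apply() over rows returns the sorted rows as columns,
# so transpose to get back an nl x nc matrix.
rows_sorted <- t(apply(M, 1, sort))

# Sort columns first, then rows.
both_sorted <- t(apply(cols_sorted, 1, sort))
```

A well-known property worth checking on `both_sorted`: after sorting the columns and then the rows, the columns are still sorted.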

## January Seasonality Shiny web application

February 14, 2013
Today, I want to share the January Seasonality application (code at GitHub). This example is based on the An Example of Seasonality Analysis post. This is the third application in the series of examples (I plan to share 5 examples) that will demonstrate the amazing Shiny framework and Systematic Investor Toolbox to analyze stocks, make

## Make a Valentine’s Heart with R

February 14, 2013
If you haven't sent your loved one a Valentine's Day greeting yet, it's not too late! Thanks to Guillermo Santos who pointed out an R script from Berkeley's Concepts in Computing with Data course, I created the following Valentine's Day card for my husband: If you want to make one for your loved one, you can use the R...

## GPS Basemaps in R Using get_map

February 14, 2013
There are many different maps you can use for a background map for your gps or other latitude/longitude data (i.e. any time you're using geom_path, geom_segment, or geom_point). Helpfully, there's just one function that will allow you to query Google Maps, OpenStreetMap, Stamen maps, or CloudMade maps: get_map in the ggmap package. You could also use either get_googlemap, get_openstreetmap, get_stamenmap, or get_cloudmademap, but...

## Major update to the R-package geomorph

February 14, 2013
Hi Folks, We have just completed a major update to the R-package geomorph: software for geometric morphometric analyses in R. Included are several new functions to carry out additional GM analyses, as well as enhancements of existing functio...

## Veterinary Epidemiologic Research: Linear Regression

February 14, 2013
This post will describe linear regression as presented in the book Veterinary Epidemiologic Research, working through the examples provided with R. Regression analysis is used for modeling the relationship between a single variable Y (the outcome, or dependent variable) measured on a continuous or near-continuous scale and one or more predictors (independent or explanatory variables), X. If
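
As a minimal sketch of fitting such a model in R, the example below uses simulated data rather than the book's datasets. The true intercept (2), slope (0.5), and noise level are assumptions chosen so the fit can be checked against known values.

```r
set.seed(123)

# Simulated predictor X and continuous outcome Y = 2 + 0.5 * X + noise.
x <- runif(100, min = 0, max = 10)
y <- 2 + 0.5 * x + rnorm(100, sd = 1)
dat <- data.frame(x = x, y = y)

# Ordinary least squares fit of Y on X.
fit <- lm(y ~ x, data = dat)

coef(fit)     # estimated intercept and slope
summary(fit)  # standard errors, t-tests, R-squared
```

With more than one predictor, the formula simply grows, e.g. `y ~ x1 + x2` for two hypothetical predictors.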

## Happy Valentine’s Day @mrshrbrmstr!

February 14, 2013
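```r
# Parametric heart curve, traced for t from 0 to 2*pi
dat <- data.frame(t = seq(0, 2 * pi, by = 0.1))
xhrt <- function(t) 16 * sin(t)^3
yhrt <- function(t) 13 * cos(t) - 5 * cos(2 * t) - 2 * cos(3 * t) - cos(4 * t)
dat$y <- yhrt(dat$t)
dat$x <- xhrt(dat$t)
# Open an empty plot first; polygon() can only draw onto an existing plot
with(dat, plot(x, y, type = "n"))
with(dat, polygon(x, y, col = "hotpink"))
```

i heaRt you! (R code inspired by/lifted from: DWin on StackOverflow)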

## Population simulation leads to Valentine’s Day a[R]t

February 14, 2013
Working on a quick-and-dirty simulation of people wandering around until they find neighbors, then settling down. After playing with the coloring a bit I arrived at the above image, which I quite like. Code below: # Code by Matt Asher for statisticsblog.com # Feel free to modify and redistribute, but please keep this notice