Better Neighborhoods with R: Exploring and Analyzing SeeClickFix Data (part 1). The National Day of Civic Hacking took place …

When managing big data with R, many people like to use the sqldf package for its friendly interface, or choose the data.table package for its lightning speed. However, very few pay attention to a small detail that can significantly boost the efficiency of these packages: adding an index to the data.frame or data.table. In my

Following up on my previous post regarding attrition rates at Comrades Marathon 2013, here are the statistics I have gathered for medal allocations. There is some interesting history behind the Comrades Marathon medals. For reference, the medals are allocated as follows: Gold medals to the first ten finishers in the men’s race and the ladies’ race;

Introduction. Recently, I began a series on exploratory data analysis; so far, I have written about computing descriptive statistics and creating box plots in R for a univariate data set with missing values. Today, I will continue this series by analyzing the same data set with kernel density estimation, a useful non-parametric technique for visualizing

Measures of position such as quartiles, deciles, and percentiles are available through the quantile function. Its arguments include: x, the data points; probs, the locations to measure; na.rm, if FALSE, NA (Not Available) data points are not ignored; na...
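As a rough illustration of the arguments described above, here is a minimal sketch using base R's quantile function. The vector x is hypothetical data made up for this example, not taken from the original post.

```r
# Hypothetical data: nine values plus one missing observation.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, NA)

# Quartiles: probs gives the locations to measure, on a 0-1 scale.
# na.rm = TRUE drops the NA before computing; with na.rm = FALSE
# (the default), quantile() stops with an error when x contains NA.
q <- quantile(x, probs = c(0.25, 0.50, 0.75), na.rm = TRUE)
print(q)

# Deciles use the same call with a finer grid of probabilities.
deciles <- quantile(x, probs = seq(0.1, 0.9, by = 0.1), na.rm = TRUE)
print(deciles)
```

Percentiles follow the same pattern, for example with probs = seq(0.01, 0.99, by = 0.01).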

In my post on 06/05/2013 (http://statcompute.wordpress.com/2013/06/05/estimating-composite-models-for-count-outcomes-with-fmm-procedure), I showed how to estimate finite mixture models, e.g. zero-inflated Poisson and 2-class finite mixture Poisson models, with the FMM and NLMIXED procedures in SAS. Today, I am going to demonstrate how to achieve the same results with the flexmix package in R. R Code R Output for 2-Class Finite Mixture

Sometimes I just want to quickly make a simple D3 JavaScript directed network graph with data in R. Because D3 network graphs can be manipulated in the browser (i.e. nodes can be moved around and highlighted), they're really nice for data exploration. They're also really nice in HTML presentations. So I put together a...

The mean in R is computed with the function mean. Consider the scores of 20 MSU-IIT students on a 100-item Stat 101 exam: 70, 78, 66, 65, 50, 53, 48, 88, 95, 80, 85, 84, 81, 63, 68, 73, 75, 84, 49, and 77. Compute and interpret the mean and medi...
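A short sketch of how the scores listed above could be entered and summarized in base R. The variable names are chosen here for illustration, and the interpretation is left to the original post.

```r
# The 20 exam scores quoted in the excerpt.
scores <- c(70, 78, 66, 65, 50, 53, 48, 88, 95, 80,
            85, 84, 81, 63, 68, 73, 75, 84, 49, 77)

# Arithmetic mean: the sum of the scores divided by the number of scores.
mean_score <- mean(scores)
print(mean_score)    # 71.6

# Median: with 20 values, the average of the 10th and 11th sorted scores.
median_score <- median(scores)
print(median_score)  # 74
```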