## Exploratory Data Analysis: Kernel Density Estimation in R on Ozone Pollution Data in New York and Ozonopolis

Introduction: Recently, I began a series on exploratory data analysis; so far, I have written about computing descriptive statistics and creating box plots in R for a univariate data set with missing values. Today, I will continue this series by analyzing the same data set with kernel density estimation, a useful non-parametric technique for visualizing

## Quartiles, Deciles, and Percentiles

June 9, 2013
Measures of position such as quartiles, deciles, and percentiles are available through the quantile function. This function has a usage, where: x - the data points; probs - the location to measure; na.rm - if FALSE, NA (Not Available) data points are not ignored; na...
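As a rough illustration of the arguments described above, the following sketch calls base R's `quantile` on a small vector. The data are made up for this example and do not come from the original post; the default interpolation method (`type = 7`) is assumed.

```r
# Hypothetical sample data (not from the post); includes one missing value
x <- c(2, 4, 4, 5, 7, 9, 10, NA)

# Quartiles: the 25th, 50th and 75th percentiles, dropping the NA first
quartiles <- quantile(x, probs = c(0.25, 0.50, 0.75), na.rm = TRUE)

# Deciles: the 10th, 20th, ..., 90th percentiles
deciles <- quantile(x, probs = seq(0.1, 0.9, by = 0.1), na.rm = TRUE)

# A single percentile, here the 90th
p90 <- quantile(x, probs = 0.90, na.rm = TRUE)

print(quartiles)
print(deciles)
print(p90)
```

With `na.rm = FALSE` (the default), the same calls would stop with an error because `x` contains an `NA`.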

## Estimating Finite Mixture Models with Flexmix Package

June 9, 2013
In my post on 06/05/2013 (http://statcompute.wordpress.com/2013/06/05/estimating-composite-models-for-count-outcomes-with-fmm-procedure), I showed how to estimate finite mixture models, e.g. zero-inflated Poisson and 2-class finite mixture Poisson models, with the FMM and NLMIXED procedures in SAS. Today, I am going to demonstrate how to achieve the same results with the flexmix package in R. R Code R Output for 2-Class Finite Mixture

## Quick and Simple D3 Network Graphs from R

June 8, 2013
Sometimes I just want to quickly make a simple D3 JavaScript directed network graph with data in R. Because D3 network graphs can be manipulated in the browser (i.e. nodes can be moved around and highlighted), they're really nice for data exploration. They're also really nice in HTML presentations. So I put together a...

## Mean and Median

June 8, 2013
The mean in R is computed using the function mean. Consider the scores of 20 MSU-IIT students in a Stat 101 exam with a hundred items: 70, 78, 66, 65, 50, 53, 48, 88, 95, 80, 85, 84, 81, 63, 68, 73, 75, 84, 49, and 77. Compute and interpret the mean and medi...
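A minimal sketch of the computation for the 20 scores listed above, using base R's `mean` and `median`. The variable name `scores` is chosen here for illustration.

```r
# The 20 exam scores quoted in the excerpt
scores <- c(70, 78, 66, 65, 50, 53, 48, 88, 95, 80,
            85, 84, 81, 63, 68, 73, 75, 84, 49, 77)

# Arithmetic mean: sum of the scores divided by their count
score_mean <- mean(scores)      # 71.6

# Median: with 20 values, the average of the 10th and 11th sorted scores
score_median <- median(scores)  # 74

print(score_mean)
print(score_median)
```

The mean (71.6) sits slightly below the median (74), which suggests that the few low scores around 50 pull the average down.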

## Using Metadata to find Paul Revere

June 8, 2013
London, 1772. I have been asked by my superiors to give a brief demonstration of the surprising effectiveness of even the simplest techniques of the new-fangled Social Networke Analysis in the pursuit of those who would seek to undermine the liberty enjoyed by His Majesty’s subjects. This is in connection with the discussion of the role of “metadata” in

## Bulk search for domain names using R

June 8, 2013
There are several domain name servers that allow for bulk searching of domain names: http://www.godaddy.com/bulk-domain-search.aspx and http://www.namestation.com/bulk-domain-search. However, they do not provide any wildcard support and instead exp...

## Matrix Operations

June 8, 2013
Matrix manipulations in R are very useful in linear algebra. Below is a list of common yet important functions for operations on matrices: Transpose - t; Multiplication - %*%; Determinant - det; Inverse - solve, or ginv of MASS library; Eigenvalues and ...
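The functions named above can be tried on small matrices, as in the sketch below. The matrices `A` and `B` are invented for this example, and the `ginv` call is guarded because it assumes the MASS package is installed.

```r
# Hypothetical 2 x 2 matrices (filled column by column)
A <- matrix(c(2, 1, 1, 3), nrow = 2)
B <- matrix(c(1, 0, 2, 1), nrow = 2)

At   <- t(A)             # transpose
AB   <- A %*% B          # matrix multiplication (not element-wise)
dA   <- det(A)           # determinant: 2 * 3 - 1 * 1 = 5
Ainv <- solve(A)         # inverse, defined because det(A) is non-zero
ev   <- eigen(A)$values  # eigenvalues, largest first

# Generalized (Moore-Penrose) inverse; equals solve(A) for an invertible matrix
if (requireNamespace("MASS", quietly = TRUE)) {
  Apinv <- MASS::ginv(A)
  print(Apinv)
}

print(AB)
print(dA)
print(Ainv)
print(ev)
```

Note that `*` on two matrices multiplies element by element, so `%*%` is the operator to use for the usual matrix product.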

## R and MongoDB

June 7, 2013
MongoDB is a document-based NoSQL database. Unlike relational databases, which store data in tables with rigid schemas, MongoDB stores data in documents with dynamic schemas. In the demonstration below, I am going to show how to extract data from MongoDB with R. Before starting the R session, we need to install the MongoDB