# Posts Tagged ‘Exploratory Data Analysis’

## "My interpretation of [Leland Wilkinson’s] grammar [of statistical graphics]: —Data is the most…"

August 25, 2011
“My interpretation of [Leland Wilkinson’s] grammar [of statistical graphics]: —Data is the most important thing, and the thing that you bring to the table. —Geometric objects … what you actually see on the plot: points, lines, polygons, etc. ...

## Plotting Time Series data using ggplot2

September 30, 2010
There are various ways to plot time series data in R. The ggplot2 package has scales that can handle dates reasonably easily. As an example, consider a data set on the number of views of the YouTube channel ramstatvid. A short snippet of the data is shown ...

## Charting the performance of cricket all-rounders – IT Botham

August 16, 2010
Cricket is a sport that generates a large volume of performance data, along with debate about the relative qualities of players over their careers and in relation to their contemporaries. The cricinfo website has an extensive database of statistics for professional cricketers that can be searched to access the information in various formats. As an ...

## Displaying data using level plots

May 3, 2010
A level plot is a type of graph used to display a surface in two rather than three dimensions: the surface is viewed from above, as if we were looking straight down on it. It is an alternative to a contour plot. Geographic data is an example of where this type of graph ...

## Summarising data using box and whisker plots

April 25, 2010
A box and whisker plot is a type of graphical display that can be used to summarise a set of data based on its five-number summary. The summary statistics used to create a box and whisker plot are the median of the data, the lower and upper quartiles (25% and 75%) ...
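The excerpt is cut off before it lists every statistic, but a five-number summary conventionally consists of the minimum, lower quartile, median, upper quartile and maximum. The following is a minimal sketch of that calculation, not the post's own code: the post works in R, whereas this uses Python's standard library, and the function name `five_number_summary` is hypothetical. Quartile conventions differ between tools, so the quartiles here (the "inclusive" method) may not match those from R exactly.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum).

    Assumption: quartiles use the "inclusive" interpolation method from
    Python's statistics module; other software may use a different rule.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two observations")
    # quantiles(..., n=4) returns the three cut points that split
    # the data into four equal-probability groups.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(summary)
```

In a box and whisker plot, the box spans the two quartiles, a line inside it marks the median, and the whiskers extend towards the extremes.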

## R and Tolerance Intervals

April 19, 2010
Confidence intervals and prediction intervals are used by statisticians on a regular basis. Another useful interval is the tolerance interval, which describes the range of values for a distribution with confidence limits calculated to a particular percentile of the distribution. The R package tolerance can be used to create a variety of tolerance intervals of ...

## Summarising data using scatter plots

April 18, 2010
A scatter plot is a graph used to investigate the relationship between two variables in a data set. The x and y axes are used for the values of the two variables, and each symbol on the graph represents one pair of values in the data set. This type of graph is ...

## Summarising data using histograms

April 11, 2010
The histogram is a standard type of graphic used to summarise univariate data: the range of values in the data set is divided into regions, and a bar (usually vertical) is plotted in each region with height proportional to the frequency of observations in that region. In some cases the proportion of ...
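The binning step described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions rather than the post's own (R) code: the function name `histogram_counts` is hypothetical, the regions are assumed to be equal-width, and the maximum value is placed in the last region.

```python
def histogram_counts(values, n_bins):
    """Count observations falling in each of n_bins equal-width regions."""
    if not values or n_bins < 1:
        raise ValueError("need at least one value and one bin")
    lo, hi = min(values), max(values)
    counts = [0] * n_bins
    if hi == lo:
        # All observations are identical: put them in the first region.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / n_bins
    for v in values:
        # Clamp so that the maximum value lands in the last region.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


counts = histogram_counts([1, 2, 2, 3, 5, 8, 9, 10], 3)
for i, c in enumerate(counts):
    # A crude text rendering: bar length is proportional to frequency.
    print(f"bin {i}: {'#' * c}")
```

Dividing each count by the total number of observations gives proportions rather than frequencies, which changes the scale of the bars but not their relative heights.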

## Social Media Analytics Research Toolkit Is Moving Into Private Beta

March 31, 2010
My Social Media Analytics Research Toolkit is about to move into private beta. What's in the release?...

## Summarising data using dot plots

March 26, 2010
A dot plot is a type of display that compares counts, frequencies, totals or other summary measures for a series of categories. The dot plot can be arranged with the categories on either the vertical or the horizontal axis of the display, to allow comparison between the different categories as well as comparison within categories where ...