Posts Tagged ‘tutorial’

Displaying data using level plots

May 3, 2010

A level plot is a type of graph used to display a surface in two rather than three dimensions: the surface is viewed from above, as if we were looking straight down. It is an alternative to a contour plot. Geographic data is an example of where this type of graph


Summarising data using box and whisker plots

April 25, 2010

A box and whisker plot is a type of graphical display that can be used to summarise a set of data based on the five number summary of this data. The summary statistics used to create a box and whisker plot are the median of the data, the lower and upper quartiles (25% and 75%)
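
To make the five number summary concrete, here is a minimal sketch in Python (not the code from the post itself). The function name `five_number_summary` is made up for this illustration, and the quartiles use the "inclusive" interpolation rule from Python's `statistics` module, which is an assumption: plotting packages differ in how they compute quartiles, so the hinges drawn in a given box plot may not match exactly.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum).

    Hypothetical helper for illustration. Quartiles are computed with the
    'inclusive' method, one of several common conventions.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two data points")
    # quantiles(..., n=4) returns the three cut points that split the
    # data into four groups: lower quartile, median, upper quartile.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(summary)
```

The box spans the lower to upper quartile, the line inside it marks the median, and the whiskers extend towards the minimum and maximum.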


Repeated measures ANOVA with R (tutorials)

April 13, 2010

Repeated measures ANOVA is a common task for the data analyst. There are (at least) two ways of performing “repeated measures ANOVA” using R, but neither is really trivial, and each has its own complications/pitfalls (explanations/solutions to which I was usually able to find by searching the R-help mailing list). So for future reference, I am starting this page...


Correlation scatter-plot matrix for ordered-categorical data

April 7, 2010

When analyzing a questionnaire, one often wants to view the correlation between two or more Likert questionnaire items (for example: two ordered categorical vectors ranging from 1 to 5). When dealing with several such Likert variables, a clear presentation of all the pairwise relations between our variables can be achieved by inspecting the (Spearman) correlation matrix (easily achieved in R...


Validating credit card numbers in SAS

March 16, 2010

Major credit card issuing networks (including Visa, MasterCard, Discover, and American Express) allow simple credit card number validation using the Luhn Algorithm (also called the “modulus 10” or “mod 10” algorithm). The following code demonstrates an implementation in SAS. The code also validates the credit card number by length and by checking against a short
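
For readers who want to see the checksum itself, here is a minimal sketch of the Luhn check in Python. This is not the SAS code from the post, and the function name `luhn_valid` is made up for this illustration. It covers only the mod 10 checksum; the length and prefix checks mentioned above are left out.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn (mod 10) check.

    Hypothetical helper for illustration. Spaces and hyphens are ignored;
    any other non-digit character makes the number invalid.
    """
    cleaned = number.replace(" ", "").replace("-", "")
    if not (cleaned.isascii() and cleaned.isdigit()):
        return False

    total = 0
    # Walk the digits from right to left, doubling every second one.
    for position, char in enumerate(reversed(cleaned)):
        digit = int(char)
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                # Same as adding the two digits of the doubled value.
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))  # a widely used test number
print(luhn_valid("4111 1111 1111 1112"))  # last digit changed
```

Passing the check only means the number is well formed; it says nothing about whether the card exists or is active.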


Weighting model fit with ctree in party

March 15, 2010

Conditional inference trees (ctree) in package party allow weighting, which is useful when one classification outcome is more important than another. Useful examples are not difficult to imagine: in a marketing direct mailing, a false positive (non-res...


A nice link: “Some hints for the R beginner”

March 7, 2010

Patrick Burns just posted the following message to the mailing list: There is now a document called “Some hints for the R beginner” whose purpose is to get people up and running with R as quickly as possible. Direct access to it is: http://www.burns-stat.com/pages/Tutor/hints_R_begin.html JRR Tolkien wrote a story (sans hobbits) called ‘Leaf by Niggle’ that has always resonated with me. I...


Responding to the Flowingdata GDP Graph Challenge

February 25, 2010

Nathan Yau of Flowingdata put up a challenge earlier today to improve upon a graph showing government spending as a percentage of GDP, published in the Economist. The underlying data wasn’t available, so I put my graph-to-numbers glasses on and pulled out some data. Here it is in case you want to have a


The R type system

February 21, 2010

R is a weird beast. Through its ancestor, the S language, it claims a proud heritage reaching back to Bell Labs in the 1970s, when S was created as an interactive wrapper around a set of statistical and numerical subroutines. As a programming language,...


Linux Server Profiling: Using Open Source Tools For Bottleneck Analysis

February 9, 2010

This tutorial covers profiling of Linux servers using open-source tools such as "iostat", "oprofile" and "blktrace". Both processor-bound and I/O-bound cases are covered, and the emphasis is on tools that provide visual displays of relevant metrics. Li...
