In the third and last of the ggplot series, this post will go over interesting ways to visualize the distribution of your data.
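As a rough illustration of the kind of plot this covers, here is a minimal sketch that overlays a kernel density estimate on a histogram. The built-in `faithful` dataset and the choice of 30 bins are assumptions made for the example, not taken from the post.

```r
library(ggplot2)

# 'faithful' ships with R; it stands in for "your data" here (an assumption).
p <- ggplot(faithful, aes(x = eruptions)) +
  # Scale the histogram to density so it shares a y-axis with the density curve.
  geom_histogram(aes(y = after_stat(density)), bins = 30,
                 fill = "grey80", colour = "white") +
  geom_density(colour = "steelblue") +
  labs(x = "Eruption duration (minutes)", y = "Density")

print(p)
```

Putting both layers on the density scale makes the smoothed curve directly comparable to the bar heights.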

Data analysis deals with different kinds of data. For instance, supermarket sales data can include:
- a transactional table, with customer ID, item ID, and date of purchase
- an item table, with the item ID and its price
- …
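To make the two-table layout concrete, the sketch below builds a tiny transactional table and an item table and joins them on the item ID. All column names and values are made up for illustration.

```r
# Hypothetical transactional table: one row per purchased item.
transactions <- data.frame(
  customer_id = c(1, 1, 2, 3),
  item_id     = c("A", "B", "A", "C"),
  date        = as.Date(c("2014-01-02", "2014-01-02", "2014-01-03", "2014-01-05"))
)

# Hypothetical item table: one row per item, with its price.
items <- data.frame(
  item_id = c("A", "B", "C"),
  price   = c(2.5, 1.0, 4.0)
)

# Attach the price to every transaction via the shared item_id key.
sales <- merge(transactions, items, by = "item_id")

# Total spend per customer.
spend <- aggregate(price ~ customer_id, data = sales, FUN = sum)
print(spend)
```

Keeping prices in a separate item table avoids repeating them in every transaction row; the join brings them back only when an analysis needs them.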

Self-Organising Maps (SOMs) are an unsupervised data visualisation technique that can be used to visualise high-dimensional data sets in lower (typically 2) dimensional representations. In this post, we examine the use of R to create a SOM for customer segmentation. The figures shown here use the 2011 Irish Census information for the greater Dublin
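A minimal sketch of fitting a SOM in R follows. It assumes the `kohonen` package and uses the built-in `iris` measurements as a stand-in for census or customer data; the 5 x 5 hexagonal grid and the 200 training iterations are arbitrary choices for the example.

```r
library(kohonen)

set.seed(42)

# Stand-in data (an assumption): four numeric columns, scaled so that
# no single variable dominates the distance calculation.
X <- scale(as.matrix(iris[, 1:4]))

# A 5 x 5 hexagonal map, i.e. 25 nodes arranged on a 2-D grid.
grid <- somgrid(xdim = 5, ydim = 5, topo = "hexagonal")

# Train the map; rlen is the number of passes over the data.
model <- som(X, grid = grid, rlen = 200)

# Each observation is assigned to its best-matching node.
node_of_obs <- model$unit.classif

# How many observations landed on each node.
plot(model, type = "counts")
```

Nodes that sit close together on the grid hold similar observations, so clustering the node codebook vectors afterwards is one way to turn the map into segments.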

by Joseph Rickert

When I was in graduate school in the mid '70s, Mathematics departments were still under the spell of abstraction for its own sake. At that time, Algebraic Topology, which uses concepts from Abstract Algebra to study topological spaces, was a major gateway to the realm of abstraction. On my first visit, it was not at all...

As I have described before, Linear Discriminant Analysis (LDA) can be seen from two different angles. The first classifies a given sample of predictors to the class with the highest posterior probability, which minimizes the total probability of misclassification. To compute the posterior, it uses Bayes’ rule and assumes that the predictors within each class follow a Gaussian distribution with class-specific mean
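The Bayes-rule view can be sketched directly in base R. The example below assumes the standard LDA setup of class-specific means with a covariance matrix shared across classes, and uses the built-in `iris` data purely for illustration; the variable names are hypothetical.

```r
X <- as.matrix(iris[, 1:4])
y <- iris$Species
classes <- levels(y)
n <- nrow(X)
K <- length(classes)

# Prior probability of each class, estimated from class frequencies.
priors <- table(y) / n

# Class-specific means, stored as a p x K matrix (one column per class).
means <- sapply(classes, function(k) colMeans(X[y == k, , drop = FALSE]))

# Pooled within-class covariance (shared by all classes).
S <- Reduce(`+`, lapply(classes, function(k) {
  Xk <- X[y == k, , drop = FALSE]
  crossprod(sweep(Xk, 2, colMeans(Xk)))
})) / (n - K)
S_inv <- solve(S)

# Log posterior up to a constant:
# delta_k(x) = x' S^-1 mu_k - 0.5 * mu_k' S^-1 mu_k + log(prior_k)
scores <- sapply(classes, function(k) {
  mu <- means[, k]
  drop(X %*% S_inv %*% mu) - 0.5 * drop(t(mu) %*% S_inv %*% mu) + log(priors[[k]])
})

# Normalise to posterior probabilities (subtract the row max for stability).
post <- exp(scores - apply(scores, 1, max))
post <- post / rowSums(post)

# Assign each sample to the class with the highest posterior.
pred <- factor(classes[max.col(post, ties.method = "first")], levels = classes)
accuracy <- mean(pred == y)
```

Because the covariance is shared, the quadratic term in the Gaussian log-density cancels between classes, which is why the scores are linear in the predictors.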

Continuing on the theme of solar angles, the code given below produces an analemma diagram similar to that of Lynch (2012, figure 2).

library(oce)
loc <- list(lon=-0.0015, lat=51.4778) # Greenwich Observatory
t <- seq.POSIXt(as.POSIXct("2014-01-01 12:00:00", tz="UTC"), as.POSIXct("2015-01-01...