(This article was first published on **Statisfaction » R**, and kindly contributed to R-bloggers)

For a quick recap, Pierre and I supervised a team project at Ensae last year, on a statistical critique of the abstract painting *1024 Colours* by painter Gerhard Richter. The four students, Clémence Bonniot, Anne Degrave, Guillaume Roussellet and Astrid Tricaud, did an outstanding job. Here is a selection of graphs and results they produced.

1. As a preliminary descriptive study, the following **scatter plots** complement the triangle plot. The R function used here, from the package of the same name, displays the pixels with their coordinates in the RGB cube. It shows that, as a joint distribution, the triplets are somewhat concentrated along the *black-white* diagonal of the cube.

The same occurs when the points are projected on the sides of the cube. Here is a comparison with uniform simulations.
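A comparison of this kind can be sketched in base R as follows. The painting's pixel data are not reproduced here, so `painting` is a hypothetical stand-in simulated to sit near the black-white diagonal, and `diag_dist` is an illustrative helper, not something from the students' work. It measures how far each triplet lies from that diagonal.

```r
set.seed(1)
n <- 1024

# Hypothetical stand-in for the painting's pixels (NOT the real data):
# a shared brightness term pulls each triplet toward the black-white diagonal.
brightness <- runif(n)
painting <- brightness + matrix(rnorm(3 * n, sd = 0.1), n, 3)
painting[painting < 0] <- 0
painting[painting > 1] <- 1
colnames(painting) <- c("R", "G", "B")

# Uniform simulations in the RGB cube, for comparison.
uniform <- matrix(runif(3 * n), n, 3, dimnames = list(NULL, c("R", "G", "B")))

# Euclidean distance of each triplet to the diagonal R = G = B
# (the projection of a point on the diagonal is its mean, repeated three times).
diag_dist <- function(x) sqrt(rowSums((x - rowMeans(x))^2))
dist_painting <- diag_dist(painting)
dist_uniform <- diag_dist(uniform)

# Projections on the three sides of the cube, each point drawn in its own color.
pairs(painting, col = rgb(painting), pch = 16, main = "Simulated 'painting' pixels")
pairs(uniform, col = rgb(uniform), pch = 16, main = "Uniform simulations")
```

Comparing `summary(dist_painting)` with `summary(dist_uniform)` gives a numerical counterpart to the visual impression: the smaller the distances, the tighter the cloud around the diagonal.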

2. It is interesting to see what happens in other color representations. **HSL** and **HSV** are two cylindrical models, succinctly described by this Wikimedia picture:

The points parameterized in these models were projected on the sides as well; here, the sides of the cylinder are the circular top (or bottom), the lateral side, and the section of the cylinder by a half-plane through its axis. It shows that some colors in the *hue* coordinate (rainbow-like colors) are missing, for instance green or purple.

For the HSL model,

and the HSV model.

The histograms complete this first analysis. For HSL,

and HSV.
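The change of coordinates behind such plots can be sketched in R. `rgb2hsv()` from the base package grDevices handles HSV; base R has no HSL counterpart that I am aware of, so the lightness and HSL saturation below are derived by hand from the standard formulas. The pixel values are simulated and the variable names are illustrative assumptions.

```r
set.seed(2)
n <- 1024

# Hypothetical pixel colors in [0, 1], one column per pixel (NOT the real data).
rgb_pixels <- matrix(runif(3 * n), nrow = 3,
                     dimnames = list(c("r", "g", "b"), NULL))

# rgb2hsv() accepts a 3-row matrix and returns rows named "h", "s" and "v".
hsv_pixels <- rgb2hsv(rgb_pixels, maxColorValue = 1)

# HSL shares its hue with HSV; lightness and saturation are computed by hand.
mx <- apply(rgb_pixels, 2, max)
mn <- apply(rgb_pixels, 2, min)
chroma <- mx - mn
lightness <- (mx + mn) / 2
sat_hsl <- ifelse(chroma == 0, 0, chroma / (1 - abs(2 * lightness - 1)))
hsl_pixels <- rbind(h = hsv_pixels["h", ], s = sat_hsl, l = lightness)

# Histogram of the hue: empty bins would point to missing rainbow colors.
hist(hsv_pixels["h", ], breaks = 36, xlab = "hue",
     main = "Hue of simulated pixels")
```

With uniform simulations the hue histogram is roughly flat; on real data, gaps around particular hues would correspond to the missing colors mentioned above.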

3. The students built a few ad hoc **tests for uniformity**, either following our perspective or on their own. They used a Kolmogorov-Smirnov test, a test, and some entropy-based tests.
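As a rough illustration of what such tests look like in R (a sketch on simulated data, not the students' actual procedure), the block below applies `ks.test()` to a single color coordinate and computes a simple binned Shannon entropy. The `binned_entropy` helper, the number of bins and the Beta alternative are all assumptions made for the example.

```r
set.seed(3)
n <- 1024

uniform_channel <- runif(n)        # a coordinate that really is uniform on [0, 1]
peaked_channel  <- rbeta(n, 5, 5)  # hypothetical non-uniform alternative

# One-sample Kolmogorov-Smirnov tests against the uniform distribution on [0, 1].
ks_uniform <- ks.test(uniform_channel, "punif")
ks_peaked  <- ks.test(peaked_channel, "punif")

# Shannon entropy of the binned values; its maximum, log(bins), is reached
# when all bins are equally filled, that is, under uniformity.
binned_entropy <- function(x, bins = 16) {
  breaks <- seq(0, 1, length.out = bins + 1)
  counts <- as.vector(table(cut(x, breaks = breaks, include.lowest = TRUE)))
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

entropy_uniform <- binned_entropy(uniform_channel)
entropy_peaked  <- binned_entropy(peaked_channel)
max_entropy     <- log(16)
```

A small `ks_peaked$p.value` and an entropy well below `max_entropy` both signal a departure from uniformity. Turning the entropy into a formal test would require its distribution under the null, for instance by simulation.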

4. We eventually turned to testing for **spatial autocorrelation**. In other words, is the color of one cell related to the color of its neighbors (in which case you can predict the neighbors’ colors given one cell), or is it “non-informative”? A graphical way to check this is to plot the average level of a color coordinate among the neighbors of a pixel against the pixel’s own coordinate, and then to fit a simple regression on this cloud: if the slope of the regression line is nonzero, there is some correlation, with the sign of the slope. We tried various combinations of coordinates and different radii for the definition of the neighborhood, and found no significant correlation. A (so-called Moran) test quantified this assessment. Here is, for example, the plot of the average level of red of the eight closest neighbors of each pixel against its own level of red.
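The neighbor-average regression and a Moran statistic can be sketched in base R as follows. This is a minimal illustration under stated assumptions: the 32 x 32 grid and the simulated red levels are hypothetical, `neighbor_sum` is an illustrative helper, the weights are binary over the (up to) eight adjacent cells, and only the Moran's I statistic is computed, not the associated test.

```r
set.seed(4)
side <- 32                                # hypothetical 32 x 32 grid (1024 cells)
red <- matrix(runif(side^2), side, side)  # simulated, spatially independent levels

# Sum of the values of the (up to eight) adjacent cells, using zero padding
# so that border cells simply have fewer neighbors.
neighbor_sum <- function(m) {
  rows <- 2:(nrow(m) + 1)
  cols <- 2:(ncol(m) + 1)
  padded <- matrix(0, nrow(m) + 2, ncol(m) + 2)
  padded[rows, cols] <- m
  total <- matrix(0, nrow(m), ncol(m))
  for (di in -1:1) for (dj in -1:1) {
    if (di == 0 && dj == 0) next
    total <- total + padded[rows + di, cols + dj]
  }
  total
}

n_neighbors <- neighbor_sum(matrix(1, side, side))  # 3, 5 or 8 per cell
neighbor_mean <- neighbor_sum(red) / n_neighbors

# Regression of the neighbors' average on the cell's own level.
own <- as.vector(red)
nb <- as.vector(neighbor_mean)
fit <- lm(nb ~ own)
slope <- unname(coef(fit)["own"])

plot(own, nb, xlab = "red level of the cell",
     ylab = "average red level of its neighbors")
abline(fit, col = "red")

# Moran's I with binary weights: (N / W) * sum_ij w_ij z_i z_j / sum_i z_i^2.
z <- red - mean(red)
moran_I <- (length(red) / sum(n_neighbors)) * sum(z * neighbor_sum(z)) / sum(z^2)
```

On independent simulations both `slope` and `moran_I` stay close to zero, whereas a smooth gradient across the grid would push Moran's I toward one. For an actual test with p-values, a dedicated spatial statistics package would be the more natural choice.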

*Thanks again to the students for their enthusiasm!*
