Tips & Tricks 7: Plotting PCA with TPS grids

March 19, 2015
By

(This article was first published on geomorph, and kindly contributed to R-bloggers)

Geomorph users,

Our function plotTangentSpace() performs a Principal Components Analysis (PCA) of shape variation and plots two dimensions of tangent space for a set of Procrustes-aligned specimens and also returns the shape changes associated with the two plotted principal axes. We purposefully restricted the options for this function because plotting in R has almost endless possibilities. That is why we added the ‘verbose=TRUE‘ option, so that the pc scores and pc shapes could be plotted on their own. And users are of course expected to do this for their publications and reports.

In this week’s exercise, we explore a few options to do advance of plotting PCAs and TPS grids with geomorph and R base functions. 

Exercise 7 – Plotting Principal Components Analysis Graphs with TPS grids.


Publications of geometric morphometric data usually have at least one figure showing the results of a PCA and accompanying shape changes for the plotted axes. For example:

To make this graph:

Let’s use the plethodon data example
library(geomorph)
data(plethodon) 
Y.gpa<-gpagen(plethodon$land)    #GPA-alignment

We’ll make a factor out of both the species and site classifiers, creating a factor with four levels:
gp <- as.factor(paste(plethodon$species, plethodon$site))
 [1] Jord Symp  Jord Symp  Jord Symp  Jord Symp  Jord Symp  Jord Symp  Jord Symp  Jord Symp  Jord Symp 
[10] Jord Symp  Teyah Symp Teyah Symp Teyah Symp Teyah Symp Teyah Symp Teyah Symp Teyah Symp Teyah Symp
[19] Teyah Symp Teyah Symp Jord Allo  Jord Allo  Jord Allo  Jord Allo  Jord Allo  Jord Allo  Jord Allo 
[28] Jord Allo  Jord Allo  Jord Allo  Teyah Allo Teyah Allo Teyah Allo Teyah Allo Teyah Allo Teyah Allo
[37] Teyah Allo Teyah Allo Teyah Allo Teyah Allo
Levels: Jord Allo Jord Symp Teyah Allo Teyah Symp

Let’s create a colour vector to colour the groups, e.g. using rainbow()
col.gp <- rainbow(length(levels(gp))) # generates a set of different colour over the rainbow spectrum
[1] “#FF0000FF” “#80FF00FF” “#00FFFFFF” “#8000FFFF”
You can also assign the names by hand using HEX codes or names of colours. 

This vector needs to have dimension labels
names(col.gp) <- levels(gp)


Using match()we can generates a vector of length(n) assigning a colour to each specimen
col.gp <- col.gp[match(gp, names(col.gp))]
  Jord Symp   Jord Symp   Jord Symp   Jord Symp   Jord Symp   Jord Symp   Jord Symp   Jord Symp 
“#80FF00FF” “#80FF00FF” “#80FF00FF” “#80FF00FF” “#80FF00FF” “#80FF00FF” “#80FF00FF” “#80FF00FF” 
  Jord Symp   Jord Symp  Teyah Symp  Teyah Symp  Teyah Symp  Teyah Symp  Teyah Symp  Teyah Symp 
“#80FF00FF” “#80FF00FF” “#8000FFFF” “#8000FFFF” “#8000FFFF” “#8000FFFF” “#8000FFFF” “#8000FFFF” 
 Teyah Symp  Teyah Symp  Teyah Symp  Teyah Symp   Jord Allo   Jord Allo   Jord Allo   Jord Allo 
“#8000FFFF” “#8000FFFF” “#8000FFFF” “#8000FFFF” “#FF0000FF” “#FF0000FF” “#FF0000FF” “#FF0000FF” 
  Jord Allo   Jord Allo   Jord Allo   Jord Allo   Jord Allo   Jord Allo  Teyah Allo  Teyah Allo 
“#FF0000FF” “#FF0000FF” “#FF0000FF” “#FF0000FF” “#FF0000FF” “#FF0000FF” “#00FFFFFF” “#00FFFFFF” 
 Teyah Allo  Teyah Allo  Teyah Allo  Teyah Allo  Teyah Allo  Teyah Allo  Teyah Allo  Teyah Allo 

“#00FFFFFF” “#00FFFFFF” “#00FFFFFF” “#00FFFFFF” “#00FFFFFF” “#00FFFFFF” “#00FFFFFF” “#00FFFFFF” 

run the plotTangentSpace function and save the results to PCA object
PCA <- plotTangentSpace(Y.gpa$coords, verbose = T)

PCA is a list containing the pc.summary
PCA$pc.summary$importance
                              PC1        PC2        PC3        PC4        PC5        PC6        PC7
Standard deviation     0.04307484 0.03958025 0.02034959 0.01509451 0.01313772 0.01293281 0.01244592
Proportion of Variance 0.36743000 0.31023000 0.08201000 0.04512000 0.03418000 0.03312000 0.03068000

Cumulative Proportion  0.36743000 0.67767000 0.75967000 0.80479000 0.83897000 0.87209000 0.90277000

which can be used to make the new axis labels
xlab <- paste(“Principal Component 1 “, “(“, round(PCA$pc.summary$importance[2,1]*100, 1), “%)”, sep=””)
ylab <- paste(“Principal Component 2 “, “(“, round(PCA$pc.summary$importance[2,2]*100, 1), “%)”, sep=””)

We will use the advanced plotting functions layout() and par() to build up this multipart figure
mat <- matrix(c(4,5,0,1,1,2,1,1,3), 3)
     [,1] [,2] [,3]
[1,]    4    1    1
[2,]    5    1    1
[3,]    0    2    3

With this function, we have divided up the plotting window into a 3×3 grid, and the numbers correspond to the order and location of each item being plotted

layout(mat, widths=c(1,1,1), heights=c(1,1,0.6))# set the size of the rows and columns

# Item 1 to plot, the graph of PC1 vs PC2
par(mar=c(4, 4, 1, 1)) # sets the margins

PCA also contains the pc scores in PCA$pc.scores. Here we take the first two columns for PC1 and PC2
plot(PCA$pc.scores[,1], PCA$pc.scores[,2], pch=21, cex=2, bg=col.gp, xlab=xlab, ylab=ylab, asp=T)
legend(-0.09, 0.07, legend= unique(gp), pch=19,  col=unique(col.gp))

Finally we add the TPS grids to demonstrate the shape changes along each axis. PCA contains the shape matrices PCA$pc.shapes.
ref <- mshape(Y.gpa$coords)# assign mean shape for use with plotRefToTarget below

# Item 2 to plot, the first TPS grid; here we use the outline option to add to the visualisation
par(mar = c(0,0,0,0)) # sets the margins
plotRefToTarget(ref,PCA$pc.shapes$PC1min,outline=plethodon$outline)
# Item 3
plotRefToTarget(ref,PCA$pc.shapes$PC1max,outline=plethodon$outline)
# Item 4
plotRefToTarget(ref,PCA$pc.shapes$PC2min,outline=plethodon$outline)
# Item 5
plotRefToTarget(ref,PCA$pc.shapes$PC2max,outline=plethodon$outline)


Resources to learn more about plotting in R:
https://stat.ethz.ch/R-manual/R-devel/library/graphics/html/plot.html
http://seananderson.ca/courses/11-multipanel/multipanel.pdf
http://www.statmethods.net/advgraphs/parameters.html


To leave a comment for the author, please follow the link and comment on their blog: geomorph.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)