Guide to using the Hmsc package for the production of Joint Species Distribtuion Models

[This article was first published on Numerilab, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

This document presents the different steps to follow to produce a Joint Species Distribution Models (JSDM) and more precisely the one presented by Ovaskainen et al (2017) using the Hmsc package. This document is divided into several parts. First, the installation and download of the different packages and files that will be used. Then, it presents the production of the statistical analysis under HMSC. Finally, it presents different tools to extract useful information from the posterior distribution of the model.

Installation and loading of the different packages and files

Installation of the packages

Before starting, it is necessary to install the different packages if they are not already installed. This part may not be done if the packages were already installed previously.

install.packages("Hmsc")
install.packages("snow") In the event that it is not installed with Hmsc
install.packages("corrplot")

Package loading

Here we load the packages, this step has to be done for the proper functioning of the code suite.

library(Hmsc)
## Loading required package: coda
library(corrplot)
## corrplot 0.84 loaded

Import of databases

Here we are going to import the fungi dataset made from a dataset present in the Guillaume Blanchet’s Github necessary to our analysis and that we have to download previously here. The dataset is composed of three csv files. First the main file including the species occurrence and the environmental variables for each sample. Secondly, a file containing the functional traits of each species. Finally, a phylogenetic distance file between each species.

data = read.csv("fungiHMSC.csv")
phylo = read.csv("phyloHMSC.csv")
Tr = read.csv("TrHMSC.csv")

The three databases take the following forms :

head(data)
##   site plot managed distance_to_edge decay diam Aleurodiscus.wakefieldiae
## 1    1    1       1              174     1    3                         0
## 2    1    1       1              174     1    1                         0
## 3    1    1       1              174     1    2                         0
## 4    1    1       1              174     1    8                         0
## 5    1    1       1              174     1    1                         0
## 6    1    1       1              174     1    1                         0
##   Athelopsis.glaucina Bisporella.citrina Ceriporia.reticulata
## 1                   0                  0                    0
## 2                   0                  0                    0
## 3                   0                  0                    0
## 4                   0                  0                    0
## 5                   0                  0                    0
## 6                   0                  0                    0
##   Crustomyces.subabruptus Dacrymyces.stillatus Diatrype.stigma
## 1                       0                    0               0
## 2                       0                    0               0
## 3                       0                    0               0
## 4                       0                    0               0
## 5                       0                    0               0
## 6                       0                    0               0
##   Eutypella.quaternata Fomes.fomentarius Fomitopsis.pinicola
## 1                    0                 0                   0
## 2                    0                 0                   0
## 3                    0                 0                   0
## 4                    0                 0                   0
## 5                    0                 0                   0
## 6                    0                 0                   0
##   Hyphoderma.cremeoalbum Hyphoderma.litschaueri Hyphoderma.roseocremeum
## 1                      0                      0                       0
## 2                      0                      0                       0
## 3                      0                      0                       0
## 4                      0                      0                       0
## 5                      0                      0                       0
## 6                      0                      0                       0
##   Hyphoderma.transiens Hyphodermella.rosae Hyphodontia.sambuci
## 1                    0                   0                   0
## 2                    0                   0                   0
## 3                    0                   0                   0
## 4                    0                   0                   0
## 5                    0                   0                   0
## 6                    0                   0                   0
##   Hypoxylon.fragiforme Hypoxylon.rubiginosum Junghuhnia.nitida
## 1                    0                     0                 0
## 2                    0                     0                 0
## 3                    0                     0                 0
## 4                    0                     0                 0
## 5                    0                     0                 0
## 6                    0                     0                 0
##   Kretzschmaria.deusta Marasmius.alliaceus Melogramma.spiniferum Mycoacia.uda
## 1                    0                   0                     0            0
## 2                    0                   0                     0            0
## 3                    0                   0                     0            0
## 4                    0                   0                     0            0
## 5                    0                   0                     0            0
## 6                    0                   0                     0            0
##   Peniophora.cinerea Phanerochaete.laevis Phanerochaete.sordida
## 1                  1                    0                     0
## 2                  0                    0                     0
## 3                  0                    0                     0
## 4                  0                    0                     0
## 5                  0                    0                     0
## 6                  0                    0                     0
##   Postia.subcaesia Rigidoporus.sanguinolentus Schizopora.flavipora
## 1                0                          0                    0
## 2                0                          0                    0
## 3                0                          0                    0
## 4                0                          0                    0
## 5                0                          0                    0
## 6                0                          0                    0
##   Scopuloides.rimosa Sidera.lenis Sistotrema.brinkmannii Skeletocutis.nivea
## 1                  0            0                      0                  0
## 2                  0            0                      0                  0
## 3                  0            0                      0                  0
## 4                  0            0                      0                  0
## 5                  0            0                      0                  0
## 6                  0            0                      0                  0
##   Stereum.ochraceoflavum Tapesia.fusca Tomentella.botryoides
## 1                      0             0                     0
## 2                      0             0                     0
## 3                      0             0                     0
## 4                      0             0                     0
## 5                      0             0                     0
## 6                      0             0                     0
##   Trametes.versicolor Trichaptum.biforme Tulasnella.violea Xylaria.hypoxylon
## 1                   0                  0                 0                 0
## 2                   0                  0                 0                 0
## 3                   0                  0                 0                 0
## 4                   0                  0                 0                 0
## 5                   0                  0                 0                 0
## 6                   0                  0                 0                 0
head(phylo)
##   Aleurodiscus.wakefieldiae Athelopsis.glaucina Bisporella.citrina
## 1                         1                   0                  0
## 2                         0                   1                  0
## 3                         0                   0                  1
## 4                         0                   0                  0
## 5                         0                   0                  0
## 6                         0                   0                  0
##   Ceriporia.reticulata Crustomyces.subabruptus Dacrymyces.stillatus
## 1            0.0000000               0.0000000                    0
## 2            0.0000000               0.0000000                    0
## 3            0.0000000               0.0000000                    0
## 4            1.0000000               0.3333333                    0
## 5            0.3333333               1.0000000                    0
## 6            0.0000000               0.0000000                    1
##   Diatrype.stigma Eutypella.quaternata Fomes.fomentarius Fomitopsis.pinicola
## 1               0                    0         0.0000000           0.0000000
## 2               0                    0         0.0000000           0.0000000
## 3               0                    0         0.0000000           0.0000000
## 4               0                    0         0.3333333           0.3333333
## 5               0                    0         0.3333333           0.3333333
## 6               0                    0         0.0000000           0.0000000
##   Hyphoderma.cremeoalbum Hyphoderma.litschaueri Hyphoderma.roseocremeum
## 1              0.0000000              0.0000000               0.0000000
## 2              0.0000000              0.0000000               0.0000000
## 3              0.0000000              0.0000000               0.0000000
## 4              0.3333333              0.3333333               0.3333333
## 5              0.3333333              0.3333333               0.3333333
## 6              0.0000000              0.0000000               0.0000000
##   Hyphoderma.transiens Hyphodermella.rosae Hyphodontia.sambuci
## 1            0.0000000           0.0000000                   0
## 2            0.0000000           0.0000000                   0
## 3            0.0000000           0.0000000                   0
## 4            0.3333333           0.6666667                   0
## 5            0.3333333           0.3333333                   0
## 6            0.0000000           0.0000000                   0
##   Hypoxylon.fragiforme Hypoxylon.rubiginosum Junghuhnia.nitida
## 1                    0                     0         0.0000000
## 2                    0                     0         0.0000000
## 3                    0                     0         0.0000000
## 4                    0                     0         0.3333333
## 5                    0                     0         0.3333333
## 6                    0                     0         0.0000000
##   Kretzschmaria.deusta Marasmius.alliaceus Melogramma.spiniferum Mycoacia.uda
## 1                    0                   0                     0    0.0000000
## 2                    0                   0                     0    0.0000000
## 3                    0                   0                     0    0.0000000
## 4                    0                   0                     0    0.3333333
## 5                    0                   0                     0    0.3333333
## 6                    0                   0                     0    0.0000000
##   Peniophora.cinerea Phanerochaete.laevis Phanerochaete.sordida
## 1          0.3333333            0.0000000             0.0000000
## 2          0.0000000            0.0000000             0.0000000
## 3          0.0000000            0.0000000             0.0000000
## 4          0.0000000            0.6666667             0.6666667
## 5          0.0000000            0.3333333             0.3333333
## 6          0.0000000            0.0000000             0.0000000
##   Postia.subcaesia Rigidoporus.sanguinolentus Schizopora.flavipora
## 1        0.0000000                  0.0000000                    0
## 2        0.0000000                  0.0000000                    0
## 3        0.0000000                  0.0000000                    0
## 4        0.3333333                  0.3333333                    0
## 5        0.3333333                  0.3333333                    0
## 6        0.0000000                  0.0000000                    0
##   Scopuloides.rimosa Sidera.lenis Sistotrema.brinkmannii Skeletocutis.nivea
## 1          0.0000000            0                      0          0.0000000
## 2          0.0000000            0                      0          0.0000000
## 3          0.0000000            0                      0          0.0000000
## 4          0.3333333            0                      0          0.3333333
## 5          0.3333333            0                      0          0.3333333
## 6          0.0000000            0                      0          0.0000000
##   Stereum.ochraceoflavum Tapesia.fusca Tomentella.botryoides
## 1              0.6666667     0.0000000                     0
## 2              0.0000000     0.0000000                     0
## 3              0.0000000     0.6666667                     0
## 4              0.0000000     0.0000000                     0
## 5              0.0000000     0.0000000                     0
## 6              0.0000000     0.0000000                     0
##   Trametes.versicolor Trichaptum.biforme Tulasnella.violea Xylaria.hypoxylon
## 1           0.0000000          0.0000000                 0                 0
## 2           0.0000000          0.0000000                 0                 0
## 3           0.0000000          0.0000000                 0                 0
## 4           0.3333333          0.3333333                 0                 0
## 5           0.3333333          0.3333333                 0                 0
## 6           0.0000000          0.0000000                 0                 0
head(Tr)
##                     species spore_volume ornamented_spores thick_spores
## 1 Aleurodiscus.wakefieldiae   6936.00000                 1            1
## 2       Athelopsis.glaucina     48.09375                 0            0
## 3        Bisporella.citrina    108.90000                 0            0
## 4      Ceriporia.reticulata     66.17188                 0            0
## 5   Crustomyces.subabruptus     20.25000                 0            0
## 6      Dacrymyces.stillatus    468.87500                 0            0
##   cystidia_or_paraphyses
## 1                      1
## 2                      1
## 3                      1
## 4                      1
## 5                      1
## 6                      1

Production of the statistical analysis

Transformation and assembly of the boxes that will be introduced into the model

Environmental features

Here we select the environmental variables, present in the main database, which will be integrated into the analysis. Then we transform them and create an “X box” containing its transformed variables. Finally, we create an XFormula object giving the equation that will be used, here we create additive effects but it is possible to create for example interactions between the environmental variables in the XFormula object.

cov = data[,c(3,4,5,6)]

cov$logDist = log(cov$distance_to_edge)
cov$logdiam = log(cov$diam)
cov$decay2 = cov$decay^2


X = as.data.frame(cov[,c("managed","logDist","logdiam","decay","decay2" )])
XFormula = ~managed + logDist + logdiam+ decay + decay2

Species occurrence box

This box contains the occurrences of the species included in the analysis. This corresponds to columns 7 to 46 of the database.

Y = as.matrix(data[,7:46]) 

Study design

This box contains the variables representing the structure of the analysis, such as the year or sampling site.

# Variable selection 
studyDesign = data[,c("site","plot")]
studyDesign = data.frame(apply(studyDesign,2,as.factor))

# Variable structuring 
site = HmscRandomLevel(units = studyDesign$site)
plot = HmscRandomLevel(units = studyDesign$plot)
ranlevels = list(site = site, plot = plot) 
ranlevels
## $site
## Hmsc random level object with 6216 units. Spatial dimensionality is 0 and number of covariates is 0.
## 
## $plot
## Hmsc random level object with 6216 units. Spatial dimensionality is 0 and number of covariates is 0.

Phylogenetic matrix

Transformation of the phylogenetic database into a matrix.

## Matrice contenant la phylogénie

C = as.matrix(phylo)

Traits matrix

Transformation of the traits database into a matrix and creation of a TrFormula object of the same type as XFormula seen earlier.

colnames(Tr)
## [1] "species"                "spore_volume"           "ornamented_spores"     
## [4] "thick_spores"           "cystidia_or_paraphyses"
Tr = as.data.frame(Tr[,-1])

TrFormula = ~spore_volume + 
  ornamented_spores +
  thick_spores +
  cystidia_or_paraphyses

Creating and running the model

In this part we will create the model in itself from the different boxes previously created.Here we will see the model containing the environmental, phylogenetic and specific features data. However, it is possible not to integrate the phylogenetic matrix or the specific traits, by not specifying their boxes.

Creating the model

This part consists in assembling the previously produced boxes containing the data and the structure of the model (XFormulta TrFormula).

simul <- Hmsc(Y=Y, XData = X,
              XFormula = XFormula,
              TrData = Tr, 
              TrFormula = TrFormula , 
              C = C,
              studyDesign = studyDesign,
              ranLevels  = ranlevels,
              distr = "probit")  

Run the model

Before running the model we specify the thin, the number of samples, the transient and the number of chains
Here for a quick test we specify a small set that does not allow a correct analysis.
We then integrate the specified elements in the sampleMcmc function to create a final Hmsc object containing the model information and the posterior distribution of the different coefficients. Finally, we save this model which will allow us to load it later to work on the posterior distributions, for example to create figures.

thin = 1
samples = 50
nChains = 2
transient = 50

mod_HMSC = sampleMcmc(simul,
                      samples = samples,
                      thin = thin,
                      transient = transient,
                      nChains = nChains, 
                      nParallel = nChains)



save(mod_HMSC,file="mod_HMSC.Rdata")

Main results and graphs

In this last part we will see how to use posterior distributions to make graphs to answer ecological questions such as the impact of the environment on the species occurrence for example.
Here we will work on the same model that we have created above, but which has been previously run with a total size of 10,000 and 3 chains, which will allow us to make the graphs and other results without having any problems due to the lack of information that could generate the model above. However this length will not be enough for a good convergence of the chains and therefore in the framework of a scientific publication it will be necessary to run the model for higher values of thin, samples and transient until the convergence of the chains.

Loading and verification of the convergence of MCMC chains

To start we will load the template contained in the files uploaded at the beginning of this document.
Then we are going to perform convergence tests of the MCMC chains. Here we look at the first 5 chains for beta and gamma in order not to overload this document but a visualization of all the channels allows to better account for the convergence.

## Load the model
load("mod_HMSC10k.Rdata")


## Convergence tests

mcoda <- convertToCodaObject(mod_HMSC)
par(mar = rep(2, 4))

#Visual chain tests for different coefficients of interest 

plot(mcoda$Beta[,1:5])

plot(mcoda$Rho)

plot(mcoda$Gamma[,1:5])

# Gelman's diagnosis, which should be at most close to 1.0 for good convergence.
gelman.diag(mcoda$Beta[,1:50])
## Potential scale reduction factors:
## 
##                                                     Point est. Upper C.I.
## B[(Intercept) (C1), Aleurodiscus.wakefieldiae (S1)]     231.46    1266.34
## B[managed (C2), Aleurodiscus.wakefieldiae (S1)]         193.48     492.45
## B[logDist (C3), Aleurodiscus.wakefieldiae (S1)]          78.67     289.26
## B[logdiam (C4), Aleurodiscus.wakefieldiae (S1)]           4.95      24.01
## B[decay (C5), Aleurodiscus.wakefieldiae (S1)]           618.34    2763.83
## B[decay2 (C6), Aleurodiscus.wakefieldiae (S1)]          654.05    2960.84
## B[(Intercept) (C1), Athelopsis.glaucina (S2)]            24.46      82.86
## B[managed (C2), Athelopsis.glaucina (S2)]                 1.38       3.72
## B[logDist (C3), Athelopsis.glaucina (S2)]                 1.56       3.12
## B[logdiam (C4), Athelopsis.glaucina (S2)]                 1.54       2.40
## B[decay (C5), Athelopsis.glaucina (S2)]                   1.40       2.05
## B[decay2 (C6), Athelopsis.glaucina (S2)]                  1.35       1.93
## B[(Intercept) (C1), Bisporella.citrina (S3)]             13.45      71.44
## B[managed (C2), Bisporella.citrina (S3)]                  4.45      25.71
## B[logDist (C3), Bisporella.citrina (S3)]                  3.47       6.69
## B[logdiam (C4), Bisporella.citrina (S3)]                  3.10       5.73
## B[decay (C5), Bisporella.citrina (S3)]                    1.04       1.13
## B[decay2 (C6), Bisporella.citrina (S3)]                   1.04       1.13
## B[(Intercept) (C1), Ceriporia.reticulata (S4)]            5.78      34.56
## B[managed (C2), Ceriporia.reticulata (S4)]                2.02      10.05
## B[logDist (C3), Ceriporia.reticulata (S4)]                1.24       1.73
## B[logdiam (C4), Ceriporia.reticulata (S4)]                1.14       1.42
## B[decay (C5), Ceriporia.reticulata (S4)]                  1.01       1.03
## B[decay2 (C6), Ceriporia.reticulata (S4)]                 1.01       1.04
## B[(Intercept) (C1), Crustomyces.subabruptus (S5)]        38.14     160.71
## B[managed (C2), Crustomyces.subabruptus (S5)]            13.06      77.52
## B[logDist (C3), Crustomyces.subabruptus (S5)]             3.42       7.49
## B[logdiam (C4), Crustomyces.subabruptus (S5)]             2.43       5.55
## B[decay (C5), Crustomyces.subabruptus (S5)]               3.05      11.62
## B[decay2 (C6), Crustomyces.subabruptus (S5)]              2.97      10.97
## B[(Intercept) (C1), Dacrymyces.stillatus (S6)]           25.73      67.60
## B[managed (C2), Dacrymyces.stillatus (S6)]               87.59     330.53
## B[logDist (C3), Dacrymyces.stillatus (S6)]               10.84      26.50
## B[logdiam (C4), Dacrymyces.stillatus (S6)]                3.95      20.69
## B[decay (C5), Dacrymyces.stillatus (S6)]                  2.24      10.88
## B[decay2 (C6), Dacrymyces.stillatus (S6)]                 2.26      11.32
## B[(Intercept) (C1), Diatrype.stigma (S7)]                 6.15      17.00
## B[managed (C2), Diatrype.stigma (S7)]                    10.42      64.20
## B[logDist (C3), Diatrype.stigma (S7)]                     1.11       1.36
## B[logdiam (C4), Diatrype.stigma (S7)]                     1.12       1.37
## B[decay (C5), Diatrype.stigma (S7)]                       5.14      29.76
## B[decay2 (C6), Diatrype.stigma (S7)]                      5.15      29.94
## B[(Intercept) (C1), Eutypella.quaternata (S8)]          192.53     526.55
## B[managed (C2), Eutypella.quaternata (S8)]               28.34     164.96
## B[logDist (C3), Eutypella.quaternata (S8)]                4.31      24.96
## B[logdiam (C4), Eutypella.quaternata (S8)]                2.40       4.22
## B[decay (C5), Eutypella.quaternata (S8)]                511.74    1035.02
## B[decay2 (C6), Eutypella.quaternata (S8)]               556.69    1135.55
## B[(Intercept) (C1), Fomes.fomentarius (S9)]              14.73      42.16
## B[managed (C2), Fomes.fomentarius (S9)]                  18.95      72.95
## 
## Multivariate psrf
## 
## 17595

Beta

One of the questions that is asked when using a JSDM is how the environment influences the species being studied. This influence is quantified by Beta coefficients that can be viewed using the getPostEstimate and plotBeta functions. Moreover, in the plotBeta function, we can visualize either the means of the posterior distribution of the Beta coefficient or a support level (by default).

postBeta = getPostEstimate(mod_HMSC, parName = "Beta")

par(mar=c(5,11,2.5,0))

plotBeta(mod_HMSC,
         post = postBeta, 
         plotTree = F,
         spNamesNumbers = c(T,F))

plotBeta(mod_HMSC, 
         post = postBeta,
         param = "Mean",
         plotTree = F,  
         spNamesNumbers = c(T,F))

Gamma

The other questions asked when using a JSDM is how the species traits influence the species response to the environment. This influence is quantified by Gamma coefficients that we can visualize using the getPostEstimate and plotGamma functions. Moreover as the plotBeta, we can visualize either the means of the posterior distribution of the coefficient Beta or a support level (by default).

par(mar=c(5,11,2.5,0))

postGamma = getPostEstimate(mod_HMSC, parName = "Gamma")


plotGamma(mod_HMSC, post = postGamma)
## Warning in plotGamma(mod_HMSC, post = postGamma): Nothing to plot at this level
## of posterior support

As you can see on the warning message, the support level here does not show any correlation between the response to the environment and the different features. This may be due to the problem of model convergence. To “force” the thing we will decrease the support level.

par(mar=c(5,11,2.5,0))

postGamma = getPostEstimate(mod_HMSC, parName = "Gamma")


plotGamma(mod_HMSC, post = postGamma, supportLevel = 0.2)

Variance partitionning

Another interesting thing we can look at from the posterior distributions is how much each environmental and study design variable influences the species of interest. This can be looked at using the Variance Partitioning functions.

VP = computeVariancePartitioning(mod_HMSC)

par(mar=c(4,4,4,4))

plotVariancePartitioning(mod_HMSC, VP = VP,
                         las = 2, horiz=F)

Predictions

It is interesting to know for example how species are influenced by the environment, as can be seen from the Beta coefficients. However, it is difficult to visualize shape of the influence of a variable, for example. To visualize this we can use a tool present in Hmsc, this tool allows to build an environmental gradient and then see how species are influenced by this gradient. This tool allows to plot predictions according to the gradient, for species richness, a single species or even for a species trait.

Gradient = constructGradient(mod_HMSC,
                             focalVariable = "logdiam")

head(Gradient$XDataNew)
##      logdiam  managed  logDist     decay     decay2
## 1 -2.3025851 1.308102 3.900249 0.1680480 -2.5181826
## 2 -1.9176788 1.247936 4.017576 0.3298778 -1.8535490
## 3 -1.5327724 1.187771 4.134902 0.4917077 -1.1889154
## 4 -1.1478661 1.127605 4.252228 0.6535376 -0.5242818
## 5 -0.7629597 1.067440 4.369554 0.8153675  0.1403518
## 6 -0.3780534 1.007274 4.486880 0.9771974  0.8049854
predY = predict(mod_HMSC,
                XData = Gradient$XDataNew, 
                studyDesign = Gradient$studyDesignNew,
                ranLevels = Gradient$rLNew,
                expected = TRUE)

### Species richness

plotGradient(mod_HMSC,
             Gradient,
             pred=predY,
             measure="S",
             showData = TRUE)

## [1] 0.4166667
### Single species (ex: species 35)

plotGradient(mod_HMSC,
             Gradient,
             pred=predY,
             measure="Y",
             index = 35,
             showData = TRUE)

## [1] 0.1666667
### Species traits


plotGradient(mod_HMSC, 
             Gradient,
             pred=predY,
             measure="T",
             index = 2,
             showData = TRUE)

## [1] 0.4326667

To leave a comment for the author, please follow the link and comment on their blog: Numerilab.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)