(This article was first published on **Gregor Gorjanc (gg)**, and kindly contributed to R-bloggers)

While reading UseR conference abstracts I came across this sentence: “Sugarcane is polyploid, i.e., has 8 to 14 copies of every chromosome, with individual alleles in varying numbers.” Wow! This generates a really complex genotype system. Say we have a biallelic gene with alleles A and B. In diploids the possible genotypes are AA, AB, and BB. Given the above sentence, the possible genotypes in sugarcane are any permutation of A’s and B’s in a series of 8 to 14 alleles. I am not sure if 9, 11, and 13 are also allowed, that is, having an odd number of chromosome copies. In any case such permutations result in really large numbers!

Thinking about this a bit further, the whole system is not that complex once we realize that genotyping does not tell us about the order of alleles (we cannot distinguish between AB and BA). This reduces the problem from all possible permutations to all possible combinations; e.g., for a biallelic gene in tetraploids there are 5 combinations versus 16 permutations.

Below is an R snippet that shows how to enumerate all possible combinations or permutations:

```r
library(package="gtools")

## Specify alleles - just two for simplicity
alleles <- c("A", "B")

## Possible genotypes for diploids
combinations(n=length(alleles), r=2, v=alleles, repeats.allowed=TRUE)
##      [,1] [,2]
## [1,] "A"  "A"
## [2,] "A"  "B"
## [3,] "B"  "B"
```

```r
## Possible genotypes for tetraploids
combinations(n=length(alleles), r=4, v=alleles, repeats.allowed=TRUE)
##      [,1] [,2] [,3] [,4]
## [1,] "A"  "A"  "A"  "A"
## [2,] "A"  "A"  "A"  "B"
## [3,] "A"  "A"  "B"  "B"
## [4,] "A"  "B"  "B"  "B"
## [5,] "B"  "B"  "B"  "B"

permutations(n=length(alleles), r=4, v=alleles, repeats.allowed=TRUE)
##       [,1] [,2] [,3] [,4]
##  [1,] "A"  "A"  "A"  "A"
##  [2,] "A"  "A"  "A"  "B"
##  [3,] "A"  "A"  "B"  "A"
##  [4,] "A"  "A"  "B"  "B"
##  [5,] "A"  "B"  "A"  "A"
##  [6,] "A"  "B"  "A"  "B"
##  [7,] "A"  "B"  "B"  "A"
##  [8,] "A"  "B"  "B"  "B"
##  [9,] "B"  "A"  "A"  "A"
## [10,] "B"  "A"  "A"  "B"
## [11,] "B"  "A"  "B"  "A"
## [12,] "B"  "A"  "B"  "B"
## [13,] "B"  "B"  "A"  "A"
## [14,] "B"  "B"  "A"  "B"
## [15,] "B"  "B"  "B"  "A"
## [16,] "B"  "B"  "B"  "B"
```
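One way to see why the 16 ordered arrangements collapse to 5 genotypes is to count the B alleles in each row, since a genotype call only tells us the allele dosage. The following is a minimal sketch in base R; the object names `arrangements` and `dosageB` are made up for illustration and are not part of gtools.

```r
## All ordered arrangements of two alleles across four chromosome copies
alleles <- c("A", "B")
arrangements <- expand.grid(rep(list(alleles), 4), stringsAsFactors=FALSE)

## Number of B alleles (dosage) in each ordered arrangement
dosageB <- rowSums(arrangements == "B")

## How many ordered arrangements share each dosage
table(dosageB)
## dosageB
## 0 1 2 3 4
## 1 4 6 4 1
```

The five dosage classes (0 to 4 copies of B) are the five combinations, and their sizes 1, 4, 6, 4, 1 add up to the 16 permutations.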

```r
## Possible genotypes for 8-14 ploids
spectrum <- seq(from=8, to=14, by=2)
nS <- length(spectrum)
retC <- vector(mode="list", length=nS)
retP <- vector(mode="list", length=nS)
for(i in seq_len(nS)) {
  retC[[i]] <- combinations(n=length(alleles), r=spectrum[i], v=alleles, repeats.allowed=TRUE)
  retP[[i]] <- permutations(n=length(alleles), r=spectrum[i], v=alleles, repeats.allowed=TRUE)
}

## Number of combinations and permutations for each ploidy level
combC <- sapply(retC, nrow)
combP <- sapply(retP, nrow)
cbind(spectrum, combC, combP)
##      spectrum combC combP
## [1,]        8     9   256
## [2,]       10    11  1024
## [3,]       12    13  4096
## [4,]       14    15 16384
```
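The counts in the last table can also be obtained without enumerating anything, which helps because the permutation matrices grow quickly. As a sketch, assuming the standard counting formulas for selecting r items from n types with repetition (choose(n + r - 1, r) unordered selections and n^r ordered ones), base R gives the same numbers. The names `nAlleles`, `ploidy`, `nComb`, and `nPerm` are illustrative only.

```r
## Closed-form counts; no enumeration needed
nAlleles <- 2
ploidy <- 8:14  # also covers the odd ploidies 9, 11, and 13

## Unordered genotypes: combinations with repetition
nComb <- choose(nAlleles + ploidy - 1, ploidy)

## Ordered arrangements: permutations with repetition
nPerm <- nAlleles^ploidy

cbind(ploidy, nComb, nPerm)
```

With two alleles the number of unordered genotypes is simply the ploidy plus one, so even at 14 copies there are only 15 genotypes to consider, whether or not odd ploidies occur.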
