# Permutation Theory In Action

**R – Win-Vector Blog**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

While working on a large client project using `Sparklyr`

and multinomial regression we recently ran into a problem: `Apache Spark`

chooses the order of multinomial regression outcome targets, whereas `R`

users are used to choosing the order of the targets (please see here for some details). So to make things more like `R`

users expect, we need a way to translate one order to another.

Providing good solutions to gaps like this is one of the thing Win-Vector LLC does both in our consulting and training practices.

Let’s take a look at an example. Suppose our two orderings are `o1`

(the ordering `Spark ML`

chooses) and `o2`

(the order the `R`

user chooses).

set.seed(326346) symbols <- letters[1:7] o1 <- sample(symbols, length(symbols), replace = FALSE) o1

## [1] "e" "a" "b" "f" "d" "c" "g"

o2 <- sample(symbols, length(symbols), replace = FALSE) o2

## [1] "d" "g" "f" "e" "b" "c" "a"

To translate `Spark`

results into `R`

results we need a permutation that takes `o1`

to `o2`

. The idea is: if we had a permeation that takes `o1`

to `o2`

we could use it to re-map predictions that are in `o1`

order to be predictions in `o2`

order.

To solve this we crack open our article on the algebra of permutations.

We are going to use the fact that the `R`

command `base::order(x)`

builds a permutation `p`

such that `x[p]`

is in order.

Given this the solution is: we find permutations `p1`

and `p2`

such that `o1[p1]`

is ordered and `o2[p2]`

is ordered. Then build a permutation `perm`

such that `o1[perm] = (o1[p1])[inverse_permutation(p2)]`

. I.e., to get from `o1`

to `o2`

move `o1`

to sorted order and then move from the sorted order to `o2`

‘s order (by using the reverse of the process that sorts `o2`

). Again, the tools to solve this are in our article on the relation between permutations and indexing.

Below is the complete solution (including combining the two steps into a single permutation):

p1 <- order(o1) p2 <- order(o2) # invert p2 # see: http://www.win-vector.com/blog/2017/05/on-indexing-operators-and-composition/ p2inv <- seq_len(length(p2)) p2inv[p2] <- seq_len(length(p2)) (o1[p1])[p2inv]

## [1] "d" "g" "f" "e" "b" "c" "a"

# composition rule: (o1[p1])[p2inv] == o1[p1[p2inv]] # see: http://www.win-vector.com/blog/2017/05/on-indexing-operators-and-composition/ perm <- p1[p2inv] o1[perm]

## [1] "d" "g" "f" "e" "b" "c" "a"

The equivilence "`(o1[p1])[p2inv] == o1[p1[p2inv]]`

" is frankly magic (though also quickly follows "by definition"), and studying it is the topic of our original article on permutations.

The above application is a good example of why it is nice to have a little theory worked out, even before you think you need it.

**leave a comment**for the author, please follow the link and comment on their blog:

**R – Win-Vector Blog**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.