How to Calculate Cosine Similarity in R: Cosine similarity is a measure of the similarity between two vectors in an inner product space.
For two vectors A and B, the Cosine Similarity is calculated as follows:
$$\text{Cosine Similarity} = \frac{\sum_i A_i B_i}{\sqrt{\sum_i A_i^2}\,\sqrt{\sum_i B_i^2}}$$
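As a small worked illustration of the formula, take A = (1, 2, 3) and B = (4, 5, 6). These two vectors are made up for this example and are not part of the tutorial's data.

```latex
\begin{align*}
\sum_i A_i B_i &= 1\cdot 4 + 2\cdot 5 + 3\cdot 6 = 32 \\
\sqrt{\sum_i A_i^2} &= \sqrt{1 + 4 + 9} = \sqrt{14} \\
\sqrt{\sum_i B_i^2} &= \sqrt{16 + 25 + 36} = \sqrt{77} \\
\text{Cosine Similarity} &= \frac{32}{\sqrt{14}\,\sqrt{77}} = \frac{32}{\sqrt{1078}} \approx 0.9746
\end{align*}
```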
Cosine similarity is mainly used to measure how similar documents are, irrespective of their size.
In other words, it calculates the cosine of the angle between two vectors in a multi-dimensional space, so only the direction of the vectors matters, not their length.
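The point about size and angle can be checked with a few lines of base R. This is only a sketch: `cosine_sim()` and `angle_deg()` are hypothetical helper names introduced here (they are not part of lsa), and the term counts are invented for illustration.

```r
# Hypothetical helper (not part of lsa): cosine similarity from the formula
cosine_sim <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Invented term counts for a short document and one ten times longer
short_doc <- c(2, 1, 0, 3)
long_doc  <- 10 * short_doc

# Same direction, different length: the similarity is 1 up to rounding
sim_scaled <- cosine_sim(short_doc, long_doc)

# Hypothetical helper: turn a similarity into an angle in degrees.
# The clamp guards against floating-point values slightly outside [-1, 1].
angle_deg <- function(a, b) {
  s <- min(1, max(-1, cosine_sim(a, b)))
  acos(s) * 180 / pi
}

angle_orthogonal <- angle_deg(c(1, 0), c(0, 1))  # perpendicular vectors
angle_diagonal   <- angle_deg(c(1, 0), c(1, 1))  # vectors 45 degrees apart

print(sim_scaled)
print(angle_orthogonal)
print(angle_diagonal)
```

Scaling a vector changes its length but not its direction, which is why the longer document scores the same as the short one.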
This tutorial describes how to calculate the Cosine Similarity between vectors in R using the cosine() function.
The cosine() function is provided by the lsa package.
How to Calculate Cosine Similarity in R
Let’s create two vectors x and y and assign some values to them.
x <- c(33, 33, 43, 55, 48, 37, 43, 24)
y <- c(37, 38, 42, 46, 46, 59, 41, 50)
#determine the Cosine Similarity
library(lsa)
cosine(x, y)

          [,1]
[1,] 0.9624844
Based on the above result, the Cosine Similarity between x and y is 0.9624844.
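To see where this number comes from, the formula can be applied by hand in base R, without lsa. This is a minimal sketch using the same x and y; the intermediate names (`dot_xy`, `norm_x`, `norm_y`, `manual_sim`) are chosen here for illustration.

```r
x <- c(33, 33, 43, 55, 48, 37, 43, 24)
y <- c(37, 38, 42, 46, 46, 59, 41, 50)

dot_xy <- sum(x * y)       # numerator: sum of element-wise products
norm_x <- sqrt(sum(x^2))   # length (Euclidean norm) of x
norm_y <- sqrt(sum(y^2))   # length (Euclidean norm) of y

manual_sim <- dot_xy / (norm_x * norm_y)
print(round(manual_sim, 7))
```

The rounded value should agree with the result reported by cosine(x, y).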
We can also calculate the Cosine Similarity between the columns of a matrix in R.
Let’s create vectors x, y, and z and combine them into a matrix.
x <- c(23, 24, 34, 35, 22, 25, 33, 24)
y <- c(10, 10, 22, 26, 16, 22, 11, 20)
z <- c(14, 15, 35, 16, 11, 23, 10, 41)
matrix <- cbind(x, y, z)
#calculate Cosine Similarity
library(lsa)
cosine(matrix)

          x         y         z
x 1.0000000 0.9561517 0.8761308
y 0.9561517 1.0000000 0.9163248
z 0.8761308 0.9163248 1.0000000
The output shows the Cosine Similarity for every pair of vectors x, y, and z.
The Cosine Similarity between vectors x and y is 0.9561517.
The Cosine Similarity between vectors x and z is 0.8761308.
The Cosine Similarity between vectors y and z is 0.9163248.
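The pairwise values can also be reproduced in base R, which may help when lsa is not available. The sketch below is one possible approach, not how lsa computes it internally: it takes all column dot products with `crossprod()` and divides each by the product of the two column norms. The names `m`, `norms`, and `sim` are chosen here for illustration.

```r
x <- c(23, 24, 34, 35, 22, 25, 33, 24)
y <- c(10, 10, 22, 26, 16, 22, 11, 20)
z <- c(14, 15, 35, 16, 11, 23, 10, 41)
m <- cbind(x, y, z)

# Euclidean norm of each column
norms <- sqrt(colSums(m^2))

# crossprod(m) is t(m) %*% m: the dot product of every pair of columns.
# Dividing by the outer product of the norms gives the cosine matrix.
sim <- crossprod(m) / outer(norms, norms)

print(round(sim, 7))

# Individual pairs can be read off by column name
print(sim["x", "y"])
```

Because the matrix is symmetric, `sim["x", "y"]` and `sim["y", "x"]` hold the same value.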
The cosine() function requires either one matrix or two vectors as input. When a matrix is supplied, the similarity is calculated between its columns.
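One practical consequence is that data stored in a data frame may need converting first. The sketch below assumes that cosine() rejects a data frame because it is neither a matrix nor a vector; that behavior is inferred from the input requirement above, so the call is wrapped in `tryCatch()` and no output is claimed. The names `df` and `mat` are chosen here for illustration, and the lsa calls only run if the package is installed.

```r
x <- c(23, 24, 34, 35, 22, 25, 33, 24)
y <- c(10, 10, 22, 26, 16, 22, 11, 20)
z <- c(14, 15, 35, 16, 11, 23, 10, 41)

df  <- data.frame(x = x, y = y, z = z)
mat <- as.matrix(df)   # numeric matrix with columns x, y, z

if (requireNamespace("lsa", quietly = TRUE)) {
  # Assumption: passing the data frame directly fails the input check.
  # tryCatch() captures the message instead of stopping the script.
  res <- tryCatch(lsa::cosine(df), error = function(e) conditionMessage(e))
  print(res)

  # The converted matrix is a valid single-matrix input
  print(lsa::cosine(mat))
}
```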