How to Calculate Cosine Similarity in R

[This article was first published on Methods – finnstats, and kindly contributed to R-bloggers.]

Cosine similarity is a measure of similarity between two non-zero vectors in an inner product space.

For two vectors A and B, the cosine similarity is calculated as follows:

$$\text{Cosine Similarity} = \frac{\sum_i A_i B_i}{\sqrt{\sum_i A_i^2}\,\sqrt{\sum_i B_i^2}}$$
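As a rough illustration of the formula, the calculation can be written in a few lines of base R. The function name cosine_sim and the vectors a and b are made up for this sketch; they are not part of the lsa package.

```r
# Minimal base R sketch of the formula above.
# cosine_sim is a hypothetical helper, not a function from lsa.
cosine_sim <- function(a, b) {
  stopifnot(length(a) == length(b))  # both vectors need the same length
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Illustrative vectors, not taken from the article
a <- c(1, 2, 3)
b <- c(4, 5, 6)
cosine_sim(a, b)
```

The numerator is the dot product of the two vectors, and the denominator is the product of their Euclidean lengths.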

Cosine similarity is mainly used to measure how similar documents are irrespective of their size, because it depends only on the direction of the vectors and not on their magnitude.
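To make the point about size concrete, here is a small sketch with invented word counts. The vectors doc_short and doc_long are assumptions for illustration only: the second is the first repeated ten times, so the two "documents" have the same word proportions but very different lengths.

```r
# Hypothetical word counts for a short document (terms: "data", "model", "plot")
doc_short <- c(4, 2, 1)
# The same text repeated ten times: every count is ten times larger
doc_long <- doc_short * 10

# Cosine similarity from the standard formula, in base R
sim <- sum(doc_short * doc_long) /
  (sqrt(sum(doc_short^2)) * sqrt(sum(doc_long^2)))

# Euclidean distance, for contrast, grows with document length
dist_euclid <- sqrt(sum((doc_short - doc_long)^2))

sim          # equal to 1 up to floating-point rounding
dist_euclid  # clearly greater than 0
```

The cosine similarity treats the two documents as identical in content, while the Euclidean distance reports them as far apart.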

In other words, it calculates the cosine of the angle between two vectors in a multi-dimensional space.
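The link to the angle can be checked with a simple two-dimensional case. The vectors a and b below are chosen for illustration (one along the horizontal axis, one along the diagonal), so the angle between them should be 45 degrees.

```r
# Illustrative 2-D vectors: a lies on the x-axis, b on the diagonal
a <- c(1, 0)
b <- c(1, 1)

# Cosine of the angle between a and b
cos_theta <- sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

# Clamp to [-1, 1] so rounding error cannot push acos() outside its domain
cos_theta <- min(max(cos_theta, -1), 1)

# Convert the angle from radians to degrees
angle_deg <- acos(cos_theta) * 180 / pi
angle_deg
```

A similarity close to 1 therefore corresponds to a small angle, and a similarity of 0 to vectors at right angles.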

This tutorial describes how to calculate the cosine similarity between vectors in R using the cosine() function.

The cosine() function is provided by the lsa package, which must be installed and loaded first.


How to Calculate Cosine Similarity in R

Let’s create two vectors x and y and assign some values to them.

#create vectors

x <- c(33, 33, 43, 55, 48, 37, 43, 24)
y <- c(37, 38, 42, 46, 46, 59, 41, 50)

#determine the Cosine Similarity

library(lsa)  #run install.packages("lsa") first if the package is not installed
cosine(x, y)
          [,1]
[1,] 0.9624844

Based on the above result, the cosine similarity between x and y is 0.9624844.


We can also calculate the cosine similarity between the columns of a matrix in R.

Let’s create three vectors x, y, and z and combine them into a matrix.

#define matrix

x <- c(23, 24, 34, 35, 22, 25, 33, 24)
y <- c(10, 10, 22, 26, 16, 22, 11, 20)
z <- c(14, 15, 35, 16, 11, 23, 10, 41)
mat <- cbind(x, y, z)  #named mat to avoid confusion with base R's matrix() function

#calculate Cosine Similarity

library(lsa)
cosine(mat)
          x         y         z
x 1.0000000 0.9561517 0.8761308
y 0.9561517 1.0000000 0.9163248
z 0.8761308 0.9163248 1.0000000

Now you can see the cosine similarity between x, y, and z.

The Cosine Similarity between vectors x and y is 0.9561517.

The Cosine Similarity between vectors x and z is 0.8761308.

The Cosine Similarity between vectors y and z is 0.9163248.

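The cosine() function requires either a single matrix or two vectors of equal length as input. When given a matrix, it returns the pairwise cosine similarities between the columns.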
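For readers who want to see what the matrix case involves, the following base R sketch reproduces the pairwise column comparison without lsa. The helper cosine_matrix is a hypothetical name introduced here, not a function from the package, and the approach (crossprod() divided by the outer product of the column norms) is one possible way to do it rather than a description of how lsa works internally.

```r
# Hypothetical helper: pairwise cosine similarity between the columns of m
cosine_matrix <- function(m) {
  norms <- sqrt(colSums(m^2))        # Euclidean length of each column
  crossprod(m) / outer(norms, norms) # dot products scaled by the lengths
}

# Same vectors as in the matrix example above
m <- cbind(
  x = c(23, 24, 34, 35, 22, 25, 33, 24),
  y = c(10, 10, 22, 26, 16, 22, 11, 20),
  z = c(14, 15, 35, 16, 11, 23, 10, 41)
)

col_sim <- cosine_matrix(m)     # 3 x 3: compares x, y, and z
row_sim <- cosine_matrix(t(m))  # 8 x 8: compares the rows instead

round(col_sim, 7)
```

Because columns are what gets compared, transposing the matrix with t() is a simple way to compare rows, for example when each row holds one observation.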
