New in openssl 0.3: hash functions

January 9, 2015

(This article was first published on OpenCPU, and kindly contributed to R-bloggers)


This week version 0.3 of the openssl package appeared on CRAN. New in this release are bindings to the cryptographic hashing functions in OpenSSL. Not exactly groundbreaking (hashing functions have long been available from digest) but nice to have anyway. An overview from the new vignette:

Hashing functions

The functions sha1, sha256, sha512, md4, md5 and ripemd160 bind to the respective digest functions in OpenSSL’s libcrypto. Both binary and string inputs are supported and the output type will match the input type.

library(openssl)
md5("foo")
# [1] "acbd18db4cc2f85cedef654fccc4a4d8"
md5(charToRaw("foo"))
# [1] ac bd 18 db 4c c2 f8 5c ed ef 65 4f cc c4 a4 d8
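
The same pairing of input and output types holds for the other algorithms. The following is a minimal sketch, assuming the openssl package is installed; the variable names are illustrative only.

```r
library(openssl)

# String in, hex string out
h_chr <- sha256("foo")

# Raw vector in, raw digest out
h_raw <- sha256(charToRaw("foo"))

# Convert the raw digest to a hex string so the two forms can be compared
h_hex <- paste(sprintf("%02x", as.integer(h_raw)), collapse = "")

is.character(h_chr)           # string input gives a character result
is.raw(h_raw)                 # raw input gives a raw result
length(h_raw)                 # a SHA-256 digest is 32 bytes long
h_hex == as.character(h_chr)  # both forms encode the same digest
```

The raw form is convenient when the digest feeds into further binary operations, while the string form is easier to print, store or compare by eye.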

The functions are fully vectorized for character vectors: a vector of n strings returns n hashes.

# Vectorized for strings
md5(c("foo", "bar", "baz"))
# [1] "acbd18db4cc2f85cedef654fccc4a4d8" "37b51d194a7513e45b56f6524f2d51f2"
# [3] "73feffa4b7f6bb68e44cf984c85f6e88"
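
Vectorization makes it easy to hash a whole column in one call. The sketch below uses a made-up data frame (the column names and values are hypothetical) and replaces an identifier with its SHA-256 hash while keeping rows groupable.

```r
library(openssl)

# Hypothetical data: the column names and values are made up for illustration
users <- data.frame(
  email = c("alice@example.com", "bob@example.com", "alice@example.com"),
  visits = c(3, 7, 1),
  stringsAsFactors = FALSE
)

# One call hashes the whole column; each element is hashed separately
users$email_hash <- as.character(sha256(users$email))

# Equal inputs give equal hashes, so rows can still be grouped per user
aggregate(visits ~ email_hash, data = users, FUN = sum)
```

Note that an unsalted hash of a guessable value such as an email address is only a pseudonym, not strong anonymization.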

Besides character and raw vectors we can pass a connection object (e.g. a file, socket or url). In this case the function will stream-hash the binary contents of the connection.

# Stream-hash a file
myfile <- system.file("CITATION")
md5(file(myfile))
# Hashing....
# [1] e4 4f 1b 99 e3 2f 27 e0 a7 e6 a0 0a 36 07 0e 1b
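
One way to convince yourself that stream-hashing reads the exact bytes of the file is to hash a file with known contents. This is a sketch under the assumption that a temporary file is acceptable for the test; `tools::md5sum()` from base R serves as an independent check.

```r
library(openssl)

# Write three known bytes to a temporary file
tmp <- tempfile()
writeBin(charToRaw("foo"), tmp)

# Hash the file through a connection, and the same bytes held in memory
h_stream <- md5(file(tmp))
h_memory <- md5(charToRaw("foo"))

# Hex form of the streamed digest, for comparison with tools::md5sum()
h_hex <- paste(sprintf("%02x", as.integer(h_stream)), collapse = "")

all(h_stream == h_memory)            # streaming and in-memory hashing agree
h_hex == unname(tools::md5sum(tmp))  # and agree with base R's file checksum

unlink(tmp)
```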

The same works for URLs. The hash of the R-3.1.1-win.exe installer below should match the one listed in md5sum.txt.

# Stream-hash from a network connection
md5(url("http://cran.us.r-project.org/bin/windows/base/old/3.1.1/R-3.1.1-win.exe"))
# Hashing................................................................................................................
# [1] 0b 48 29 e8 92 10 eb 6d 13 71 24 8c d0 97 d1 fc

Compare to digest

Similar functionality is also available in the digest package, but with a slightly different interface:

# Compare to digest
library(digest)
digest("foo", "md5", serialize = FALSE)
# [1] "acbd18db4cc2f85cedef654fccc4a4d8"

# Other way around
digest(cars, skip = 0)
# [1] "81919836edd7b5a422700ac32bbccd7d"
md5(serialize(cars, NULL))
# [1] 81 91 98 36 ed d7 b5 a4 22 70 0a c3 2b bc cd 7d
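
The difference in interface is perhaps easiest to see side by side for another algorithm. A small sketch, assuming both packages are installed: digest takes the algorithm as an argument and serializes its input unless told otherwise, whereas openssl has one function per algorithm and hashes the string as-is.

```r
library(openssl)
library(digest)

x <- "foo"

# digest: algorithm passed by name; serialize = FALSE hashes the string bytes directly
d_sha256 <- digest(x, algo = "sha256", serialize = FALSE)

# openssl: one function per algorithm, no serialization step
o_sha256 <- as.character(sha256(x))

d_sha256 == o_sha256  # both packages compute the same digest
```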
