Validating email addresses in R

[This article was first published on Nicebread » R, and kindly contributed to R-bloggers.]

I am currently programming automated report generation in R: participants fill out a questionnaire and receive a nicely formatted PDF with their personality profile. I use knitr, LaTeX, and the sendmailR package.

Some participants did not provide valid email addresses, which caused the sendmail function to crash. I therefore wanted to validate the addresses before sending anything. Here is the function:

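# Returns TRUE for each element of x that contains something shaped like
# an email address (local part, "@", domain, and a top-level domain of 2+ letters)
isValidEmail <- function(x) {
	grepl("\\<[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}\\>", as.character(x), ignore.case=TRUE)
}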
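
Because grepl() is vectorized, the function can check a whole column of addresses at once. The following is only a minimal sketch of how that might look before a mailing step; the data frame `participants`, its columns, and the addresses in it are made up for illustration and are not taken from the actual report code.

```r
# Same validation function as above, repeated so the sketch runs on its own
isValidEmail <- function(x) {
  grepl("\\<[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}\\>", as.character(x), ignore.case = TRUE)
}

# Hypothetical questionnaire data; names and addresses are invented
participants <- data.frame(
  name  = c("Anna", "Ben", "Cleo"),
  email = c("anna@example.org", "ben@example", "cleo.example.org"),
  stringsAsFactors = FALSE
)

# One logical value per row: TRUE where the address passes the check
ok <- isValidEmail(participants$email)

to_mail <- participants[ok, ]    # rows that can go on to the mailing step
skipped <- participants[!ok, ]   # rows to follow up on by hand
```

Splitting the data this way means the mailing function is never called with an address that fails the check, and the skipped rows remain available for inspection.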

Let’s test some valid and invalid addresses:

# Valid addresses
isValidEmail("[email protected]")
isValidEmail("[email protected]")
isValidEmail("[email protected]  ")
isValidEmail("    [email protected]")
isValidEmail("[email protected]")
isValidEmail("[email protected]")
 
# Invalid addresses
isValidEmail("felix@nicebread")     # FALSE: no top-level domain
isValidEmail("felix@nicebread@de")  # FALSE: no dot after the "@"
isValidEmail("felixnicebread.de")   # FALSE: no "@"
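
Note that grepl() reports a match anywhere in the string, which is why addresses padded with spaces pass the test. The same behavior means that a string with extra text around an address also passes. If the value is going to be handed directly to a mail function, a stricter check may be preferable. The following is a sketch of one possible variant, not part of the original function: the name `isValidEmailStrict` and the example address are made up, and the anchors `^` and `$` replace the word-boundary markers.

```r
# Hypothetical stricter variant: trim surrounding whitespace first,
# then require the entire remaining string to match the pattern
isValidEmailStrict <- function(x) {
  x <- trimws(as.character(x))
  grepl("^[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}$", x, ignore.case = TRUE)
}

isValidEmailStrict("  felix@example.org  ")          # TRUE: padding is trimmed first
isValidEmailStrict("mail felix@example.org today")   # FALSE: extra text around the address
isValidEmailStrict("felix@example@de.org")           # FALSE: two "@" signs
```

With this variant the trimmed address, rather than the raw input, is what should be passed on to the mailing step.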

The regexp is taken from www.regular-expressions.info and adapted to R's regexp syntax. Please note the many discussions of the question “Is there a single regexp that matches all valid email addresses?” (the answer is no).

