Keyword Searches from Comma Separated Terms

April 27, 2017
(This article was first published on RLang.io | R Language Programming, and kindly contributed to R-bloggers)

Long story short, I needed to convert a pretty simple OR search into a non-directional (order-independent) AND keyword search. A directional search is straightforward: just put .*? between the words (or, in SQL, use LIKE '%keyword_1%keyword_2%'). Anyhow, I came up with this little snippet and thought I would share.
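To see why the directional approach is not enough on its own, here is a minimal sketch. The sample strings are made up for illustration and are not from the original post.

```r
# Ordered ("directional") pattern: keyword_1 must appear before keyword_2.
ordered_pattern <- "keyword_1.*?keyword_2"

# Hypothetical sample strings, invented for this illustration.
samples <- c("keyword_1 then keyword_2", "keyword_2 then keyword_1")

# Only the first string has the keywords in the required order.
ordered_result <- grepl(ordered_pattern, samples, perl = TRUE)
print(ordered_result)  # TRUE FALSE
```

The second string contains both keywords but is rejected because they appear in the opposite order, which is the gap an order-independent AND search has to close.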

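# Split the comma-separated terms, wrap each one in a positive lookahead,
# then concatenate the lookaheads into a single pattern.
keyword_search <- paste0(sapply(unlist(strsplit("keyword_1,keyword_2", ",")), function(x) {
  paste0("(?=.*?(", x, "))")
}), collapse = "")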
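If the pattern needs to be built more than once, the same idea could be wrapped in a small helper. This is only a sketch: the name build_and_pattern is hypothetical, and trimming whitespace and dropping empty terms are assumptions added here, not part of the original snippet. Like the original, it does not escape regex metacharacters inside the keywords.

```r
# Hypothetical helper (name and whitespace handling are assumptions).
build_and_pattern <- function(terms) {
  # Split on literal commas and strip surrounding whitespace from each term.
  keywords <- trimws(unlist(strsplit(terms, ",", fixed = TRUE)))
  # Drop empty terms such as those produced by a trailing comma.
  keywords <- keywords[nzchar(keywords)]
  # paste0 is vectorised, so each keyword gets its own lookahead
  # before everything is collapsed into one string.
  paste0("(?=.*?(", keywords, "))", collapse = "")
}

pattern <- build_and_pattern("keyword_1, keyword_2")
print(pattern)
```

Because paste0 is vectorised, the sapply call is not strictly needed; both forms should produce the same string for the example input.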

Now this sets keyword_search to a really nice regular expression that can be used with grep.

NOTE: You will need to pass perl = TRUE when using the generated regular expression, since lookaheads require the PCRE engine. The argument name is lowercase; R is case-sensitive.

(?=.*?(keyword_1))(?=.*?(keyword_2))
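As a quick usage sketch, the generated pattern can be passed to grepl or grep with perl = TRUE. The sample strings below are invented for illustration.

```r
keyword_search <- "(?=.*?(keyword_1))(?=.*?(keyword_2))"

# Hypothetical sample strings, invented for this illustration.
samples <- c(
  "keyword_1 then keyword_2",
  "keyword_2 then keyword_1",
  "only keyword_1 here"
)

# Logical vector: does each string contain both keywords, in any order?
has_both <- grepl(keyword_search, samples, perl = TRUE)
print(has_both)  # TRUE TRUE FALSE

# value = TRUE returns the matching strings instead of their indices.
matches <- grep(keyword_search, samples, perl = TRUE, value = TRUE)
print(matches)
```

Both orderings match, while the string that is missing keyword_2 does not.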

For the curious, regex101 gives the following breakdown:

Positive Lookahead
  • (?=.*?(keyword_1))
  • Assert that the Regex below matches
  • . matches any character (except for line terminators)
  • *? Quantifier — Matches between zero and unlimited times, as few times as possible, expanding as needed (lazy)
 1st Capturing Group
  • (keyword_1)
  • keyword_1 matches the characters keyword_1
Positive Lookahead
  • (?=.*?(keyword_2))
  • Assert that the Regex below matches
  • . matches any character (except for line terminators)
  • *? Quantifier — Matches between zero and unlimited times, as few times as possible, expanding as needed (lazy)
 2nd Capturing Group
  • (keyword_2)
  • keyword_2 matches the characters keyword_2
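The breakdown above can be checked from R as well. The sketch below uses regexpr with perl = TRUE on a made-up string to show that the lookaheads consume no characters (the overall match is zero-width), while the capturing groups still record where each keyword was found.

```r
keyword_search <- "(?=.*?(keyword_1))(?=.*?(keyword_2))"

# Hypothetical input, invented for this illustration.
text <- "has keyword_2 and keyword_1"

m <- regexpr(keyword_search, text, perl = TRUE)

match_start  <- as.integer(m)            # 1: both assertions hold at the start
match_length <- attr(m, "match.length")  # 0: lookaheads are zero-width

# With perl = TRUE, regexpr also reports the capturing groups.
cap_start  <- as.vector(attr(m, "capture.start"))
cap_length <- as.vector(attr(m, "capture.length"))
captured   <- substring(text, cap_start, cap_start + cap_length - 1)
print(captured)  # "keyword_1" "keyword_2"
```

Since the match itself is empty, the pattern is best suited to filtering with grep or grepl rather than to extracting or replacing text.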
