Extract patterns in R?

[This article was first published on Data Science Tutorials, and kindly contributed to R-bloggers].

The post Extract patterns in R? appeared first on Data Science Tutorials

In R, the str_extract() function extracts the first match of a pattern from each string. It is part of the stringr package.

The syntax for this function is as follows:

str_extract(string, pattern)

where:

string: A character vector to search

pattern: The pattern to extract, interpreted as a regular expression by default
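Because the pattern is treated as a regular expression by default, characters such as "." carry a special meaning. The sketch below illustrates the difference between a regex pattern and a literal one wrapped in stringr's fixed() helper. The input string is chosen only for illustration.

```r
library(stringr)

# Hypothetical input, chosen only for illustration
x <- "datascience.com"

# As a regex, "." matches any character, so the first character is returned
regex_match <- str_extract(x, ".")

# fixed() requests a literal comparison, so the actual dot is returned
literal_match <- str_extract(x, fixed("."))

print(regex_match)
print(literal_match)
```

If a pattern contains punctuation that should be matched as-is, wrapping it in fixed() or escaping it is usually the safer choice.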

The practical application of this function is demonstrated in the examples that follow.

Example 1: Take a String and Extract One Pattern

The R code below demonstrates how to extract the word “for” from a given string.

library(stringr)

Let’s define the string

string <- "datascience.com for data science articles"

Now we can extract “for” from the string

str_extract(string, "for")
[1] "for"

The pattern “for” was successfully extracted from the string.

Note that we will simply get NA if we try to extract a pattern that isn’t present in the string.
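A minimal sketch of that NA behaviour, reusing the string from Example 1. The search word “python” is an assumption picked only because it does not occur in the string.

```r
library(stringr)

string <- "datascience.com for data science articles"

# "python" does not occur in the string, so no match is found
missing <- str_extract(string, "python")
print(missing)

# is.na() offers a simple check for whether a match was found
print(is.na(missing))
```

Checking the result with is.na() before using it can help avoid NA values spreading silently into later steps.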

Example 2: Take String Data and Extract Numeric Values

Use the regex \d+ (written as "\\d+" in an R string, because the backslash must be escaped) to extract numeric values from text, as in the following code.

library(stringr)

Now we can define the string

string <- "There are 100 phones over there"

Extract only the numeric value from the string

str_extract(string, "\\d+")
[1] "100"
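Two details are worth keeping in mind here: the result is a character string rather than a number, and only the first match is returned. The sketch below uses a hypothetical sentence containing two numbers and stringr's str_extract_all(), which returns every match as a list with one element per input string.

```r
library(stringr)

# Hypothetical sentence with two numbers, for illustration only
string <- "There are 100 phones and 25 tablets over there"

# str_extract() stops at the first match
first_number <- str_extract(string, "\\d+")

# str_extract_all() returns a list; [[1]] picks the matches for the first string
all_numbers <- str_extract_all(string, "\\d+")[[1]]

# Convert the extracted text to a numeric value before doing arithmetic
first_value <- as.numeric(first_number)

print(first_number)
print(all_numbers)
print(first_value + 1)
```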

Example 3: Take Strings from a Vector and Extract Characters

The code below demonstrates how to extract the first run of lowercase letters from each element of a vector of strings using the regex [a-z]+.

Let’s define a vector of strings

strings <- c("3 phones", "3 battery", "7 pen") 

Now let’s extract only the letters from each string in the vector

str_extract(strings, "[a-z]+")
[1] "phones"  "battery" "pen" 

Note that only the letters of each string are returned; the leading digits and spaces are dropped.
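The class [a-z] covers lowercase letters only, so a capitalised word would be matched only in part. The sketch below uses a slightly modified, hypothetical version of the vector to show the effect, and one possible adjustment using [A-Za-z]+.

```r
library(stringr)

# Hypothetical variant of the vector above, with one capitalised word
strings <- c("3 Phones", "3 battery", "7 pen")

# [a-z]+ skips the uppercase "P" and starts matching at "hones"
lower_only <- str_extract(strings, "[a-z]+")

# Adding A-Z to the class lets the match include uppercase letters
any_case <- str_extract(strings, "[A-Za-z]+")

print(lower_only)
print(any_case)
```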
