
Data wrangling is the process of importing, cleaning, and transforming raw data into actionable information for analysis. It is a time-consuming process, estimated to take about 60-80% of an analyst’s time. In this series we will go through this process. It will be a brief series, with the goal of sharpening the reader’s data wrangling skills. This is the fourth part of the series, and it begins to cover data cleaning. In the previous parts we learned how to import, reshape, and transform data. The rest of the series will be dedicated to the data cleansing process. In this post we will go through regular expressions: sequences of characters that define a search pattern, mainly for use in pattern matching with text strings. In particular, we will cover the foundations of regular expression syntax.
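To make "search pattern" concrete, here is a minimal sketch of a few basic building blocks of regular expression syntax: character classes, POSIX classes, quantifiers, and anchors. The vector `x` is made up for illustration and is not the data used in the exercises.

```r
# Hypothetical strings, chosen only for illustration
x <- c("cat", "Cat5", "c-a-t", "coot", "ct")

# Character class: does the string contain any digit?
has_digit <- grepl("[0-9]", x)        # FALSE TRUE FALSE FALSE FALSE

# POSIX class: does the string contain any punctuation?
has_punct <- grepl("[[:punct:]]", x)  # FALSE FALSE TRUE FALSE FALSE

# Quantifier: 'o' repeated exactly twice in a row
double_o <- grepl("o{2}", x)          # FALSE FALSE FALSE TRUE FALSE

# Anchors and optional match: whole string is 'c', optional 'a', then 't'
c_a_t <- grepl("^ca?t$", x)           # TRUE FALSE FALSE FALSE TRUE
```

Note that matching is case sensitive by default, which is why "Cat5" does not match the last pattern.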

Before proceeding, it might be helpful to look over the help pages for grep and gsub.
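As a quick orientation to these two functions, the following is a minimal sketch on a made-up vector (`fruit` is hypothetical and is not the vector used in the exercises). It shows how grep reports matches and how gsub replaces them.

```r
# Hypothetical example vector, for illustration only
fruit <- c("apple", "Banana 42", "cherry_pie", "kiwi")

# grep() returns the indices of the elements that match the pattern
idx <- grep("[0-9]", fruit)                  # 2

# With value = TRUE it returns the matching elements themselves
hits <- grep("[0-9]", fruit, value = TRUE)   # "Banana 42"

# gsub() replaces every match in each element (sub() replaces only the first)
no_vowels <- gsub("[aeiou]", "", fruit)      # "ppl" "Bnn 42" "chrry_p" "kw"
```

In short, grep is the tool for "find" tasks and gsub for "remove" or "replace" tasks; removing something amounts to replacing it with the empty string.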

Also, please run the following command to create the strings that we will work on.
bio <- c('24 year old', 'data scientist', '1992', 'A.I. enthusiast', 'R version 3.4.0 (2017-04-21)', 'r-exercises author', 'R is cool', 'RR')

Answers to the exercises are available here.

If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page.

Exercise 1

Find the strings that contain numeric values between 3 and 6.

Exercise 2

Find the strings with the character ‘A’ or ‘y’.

Exercise 3

Find any strings that have non-alphanumeric characters.

Exercise 4

Remove lowercase letters.

Learn more about text analysis in the online course Text Analytics/Text Mining Using R. In this course you will learn how to create, analyse, and finally visualize your text-based data source. Having all the steps clearly outlined will be a great reference source for future work.

Exercise 5

Remove spaces and tabs.

Exercise 6

Remove punctuation and replace it with white space.

Exercise 7

Remove alphanumeric characters.

Exercise 8

Match sentences that contain ‘M’.

Exercise 9

Match states with two ‘o’.

Exercise 10

Match cars with one or two ‘e’.