# Data wrangling: Transforming (2/3)

This article was first published on **R-exercises** and kindly contributed to R-bloggers.


Data wrangling is a task of great importance in data analysis. It is the process of importing, cleaning, and transforming raw data into actionable information for analysis. It is time-consuming, estimated to take about 60-80% of an analyst’s time. In this series we will go through this process. It will be a brief series whose goal is to sharpen the reader’s data wrangling skills. This is the third part of the series and it covers transforming data. This can include filtering, summarizing, and ordering your data by different means. It also includes combining data sets, creating new variables, and many other manipulation tasks. In this post, we will go through a few more advanced transformation tasks on the `mtcars` data set.

Before proceeding, it might be helpful to look over the help pages for `group_by`, `ungroup`, `summary`, `summarise`, `arrange`, `mutate`, and `cumsum`.
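As a quick orientation, here is a minimal sketch of how several of these functions chain together. It deliberately uses the built-in `ToothGrowth` data set rather than `mtcars`, so it does not give away the exercises; the names `tg_summary`, `mean_len`, and `running_total` are made up for illustration.

```r
library(dplyr)

# ToothGrowth ships with R: len (numeric), supp (factor), dose (numeric)
tg_summary <- ToothGrowth %>%
  group_by(supp) %>%                        # one group per supplement type
  summarise(mean_len = mean(len)) %>%       # collapse each group to one row
  arrange(desc(mean_len)) %>%               # largest mean first
  mutate(running_total = cumsum(mean_len))  # cumulative sum down the rows

print(tg_summary)
```

Each step takes the result of the previous one through `%>%`, which is why the verbs can be read top to bottom.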

Moreover, please install (if needed) and load the following library.

`install.packages("dplyr")`

`library(dplyr)`

Answers to the exercises are available here.

If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page.

Exercise 1

Create a new object named *cars_cyl* and assign to it the *mtcars* data frame grouped by the variable *cyl*

Hint: check the data type of the variable first. *cyl* is stored as a number in *mtcars*; convert it to a factor before grouping.
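The same pattern can be sketched on a different data set. The example below assumes the built-in `ToothGrowth` data, where `dose` is numeric but takes only a few distinct values; `tg_grouped` and `tg_plain` are hypothetical object names. It also shows how to check whether a data frame is currently grouped.

```r
library(dplyr)

# dose is stored as a number in ToothGrowth
print(class(ToothGrowth$dose))

# Convert it to a factor, then group by it
tg_grouped <- ToothGrowth %>%
  mutate(dose = as.factor(dose)) %>%
  group_by(dose)

# group_vars() reports the grouping columns of a data frame
print(group_vars(tg_grouped))

# ungroup() removes the grouping again; no grouping columns remain
tg_plain <- ungroup(tg_grouped)
print(group_vars(tg_plain))
```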

Exercise 2

Remove the grouping from the object *cars_cyl*

Exercise 3

Print out the summary statistics of the *mtcars* data frame using the `summary` function and the pipe operator `%>%`.

Exercise 4

Produce a more descriptive summary containing the 4 quantiles, the mean, the standard deviation, and the count.
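One way to build such a custom summary is to list each statistic inside `summarise`. The sketch below is only an illustration on the `Temp` column of the built-in `airquality` data set; the choice of cut points (minimum, quartiles, maximum) and all output column names are assumptions, not the official solution.

```r
library(dplyr)

# Temp has no missing values in airquality, so no na.rm is needed here
temp_stats <- airquality %>%
  summarise(
    minimum  = min(Temp),
    lower_q  = quantile(Temp, 0.25),  # first quartile
    middle   = median(Temp),
    upper_q  = quantile(Temp, 0.75),  # third quartile
    maximum  = max(Temp),
    average  = mean(Temp),
    std_dev  = sd(Temp),
    count    = n()                    # number of rows summarised
  )

print(temp_stats)
```

Adding a `group_by` step before `summarise` would produce one such row per group instead of a single row.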

Exercise 5

Print out the average *hp* for every *cyl* category

Exercise 6

Print out the *mtcars* data frame sorted by *hp* (ascending order)

Exercise 7

Print out the *mtcars* data frame sorted by *hp* (descending order)

Exercise 8

Create a new object named *cars_per* containing the *mtcars* data frame along with a new variable called *performance* and calculated as `performance = hp/mpg`

Exercise 9

Print out the *cars_per* data frame sorted by *performance* in descending order, and create a new variable called *rank* indicating each car's rank in terms of performance.
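Sorting and ranking combine naturally in one pipeline. The following is a minimal sketch on the built-in `airquality` data set, not on `cars_per`; `hottest` and `temp_rank` are hypothetical names. It assumes that a simple positional rank is wanted, which `row_number()` provides after sorting.

```r
library(dplyr)

hottest <- airquality %>%
  arrange(desc(Temp)) %>%           # highest temperature first
  mutate(temp_rank = row_number())  # 1 for the first row, 2 for the next, ...

print(head(hottest))
```

With `row_number()`, tied values receive different ranks according to their position. If ties should share a rank, `min_rank(desc(Temp))` is an alternative.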

Exercise 10

To wrap everything up, we will use the *iris* data set. Print out the mean of every variable for every *Species* and create two new variables called *Sepal.Density* and *Petal.Density*, calculated as `Sepal.Density = Sepal.Length Sepal.Width` and `Petal.Density = Sepal.Length Petal.Width` respectively.
