(This article was first published on **R – Discovering Python & R**, and kindly contributed to R-bloggers)

This happens to be my **50th blog post** – and my blog is **8 months old**.

This post is the **third and last post** in a series of posts (Part 1 – Part 2) on data manipulation with *dplyr*. Note that the objects in the code may have been defined in earlier posts, and the code in this post continues from the code in those posts.

Although datasets can be manipulated in sophisticated ways by combining the 5 verbs of *dplyr*, code that links several verbs together can become verbose.

Creating multiple intermediate objects, especially when working on a large dataset, can slow down your analysis. Nesting function calls inside one another in a single line of code avoids the extra objects but is difficult to read. This is sometimes called the Dagwood sandwich problem: you have too much filling (too many long arguments) between your slices of bread (parentheses). Functions and their arguments get further and further apart.

The *%>%* operator lets you take the first argument of a function out of the argument list and place it in front of the function call, so that *x %>% f(y)* is equivalent to *f(x, y)*. This solves the Dagwood sandwich problem.
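To make the difference concrete, here is a minimal sketch that writes the same computation in nested form and in piped form. It assumes the *dplyr* package is installed and uses the built-in *mtcars* dataset as a stand-in, since the dataset from the earlier posts is not defined here; the names *nested*, *piped* and *avg_mpg* are illustrative only.

```r
library(dplyr)

# Nested form: has to be read from the inside out, and the
# arguments of summarise() end up far away from the function name.
nested <- summarise(filter(mtcars, cyl == 4), avg_mpg = mean(mpg))

# Piped form: the data comes first, then each verb in the order
# in which it is applied.
piped <- mtcars %>%
  filter(cyl == 4) %>%
  summarise(avg_mpg = mean(mpg))

# Both forms should give the same one-row result.
all.equal(nested, piped)
```

The piped version reads top to bottom as "take *mtcars*, keep the 4-cylinder cars, then average their mpg", without any intermediate objects.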

**group_by()**

*group_by()* defines groups within a dataset. Its influence becomes clear when calling *summarise()* on a grouped dataset: summary statistics are then calculated separately for each group.
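As a rough illustration of grouped summaries, the sketch below groups the built-in *mtcars* dataset by number of cylinders and computes one row of statistics per group. The dataset and the column names *n_cars* and *avg_mpg* are assumptions made for this example and are not taken from the earlier posts.

```r
library(dplyr)

# One output row per distinct value of cyl, instead of a single
# row for the whole dataset.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(
    n_cars  = n(),        # number of rows in each group
    avg_mpg = mean(mpg)   # mean computed within each group
  )

by_cyl
```

Without the *group_by()* step, the same *summarise()* call would return a single row covering all cars.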

**Combine group_by with mutate**

*group_by()* can also be combined with *mutate()*. When you mutate grouped data, *mutate()* calculates the new variables independently for each group. This is particularly useful when *mutate()* uses the *rank()* function to calculate within-group rankings. *rank()* takes a group of values and calculates the rank of each value within the group, e.g.

*rank(c(21, 22, 24, 23))*

has output

*[1] 1 2 4 3*

As with *arrange()*, *rank()* orders values from the smallest to the largest, so the smallest value receives rank 1. This behaviour can be reversed with the *desc()* function.
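The following sketch shows both ideas together: reversing *rank()* with *desc()*, and ranking within groups by combining *group_by()* with *mutate()*. It again assumes the built-in *mtcars* dataset; the names *asc_rank*, *desc_rank*, *ranked* and *mpg_rank* are made up for illustration.

```r
library(dplyr)

x <- c(21, 22, 24, 23)

# Default: the smallest value gets rank 1.
asc_rank <- rank(x)         # 1 2 4 3

# Wrapped in desc(): the largest value gets rank 1.
desc_rank <- rank(desc(x))  # 4 3 1 2

# Within-group ranking: mpg is ranked separately inside each
# cylinder group, so every group has its own rank 1.
ranked <- mtcars %>%
  group_by(cyl) %>%
  mutate(mpg_rank = rank(desc(mpg))) %>%
  ungroup()

# The best-ranked car(s) in each cylinder group.
ranked %>%
  filter(mpg_rank == min(mpg_rank), .by = cyl) %>%
  select(cyl, mpg, mpg_rank)
```

Note that *rank()* averages the ranks of tied values by default, so a group with ties can contain fractional ranks such as 1.5. The *.by* argument in the last step assumes dplyr 1.1.0 or later; with older versions, a *group_by(cyl)* before *filter()* does the same job.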
