Blog Archives

Introduction to tibbles

September 27, 2018

Introduction: A tibble, or tbl_df, is a modern reimagining of the data.frame, keeping what time has proven to be effective, and throwing out what is not. Tibbles are data.frames that are lazy and surly: they do less (i.e. they don’t change variable names or types, and don’t do partial matching) and complain more (e.g. when a variable does not exist). This forces...

Read more »

Data Wrangling with dplyr – Part 3

September 15, 2018

Introduction: In the previous post, we learnt to combine tables using dplyr. In this post, we will explore a set of helper functions in order to: extract unique rows, rename columns, sample data, extract columns, slice rows, arrange rows, compare tables, extract/mutate data using predicate functions, and count observations for different levels of a variable. Libraries, Code & Data: We will use the following packages: dplyr and readr. The data sets can be downloaded from here and the codes...

Read more »

Data Wrangling with dplyr – Part 2

September 3, 2018

Introduction: In the previous post, we learnt about dplyr verbs and used them to compute the average order value from an online retail company's data. In this post, we will learn to combine tables using the different *_join functions provided in dplyr. Libraries, Code & Data: We will use the following packages: dplyr and readr. The data sets can be downloaded from here and the code from here.

library(dplyr)
library(readr)
options(tibble.width = Inf)

Case Study: For...

Read more »

Data Wrangling with dplyr – Part 1

August 22, 2018

Introduction: According to a survey by CrowdFlower, data scientists spend most of their time cleaning and manipulating data rather than mining or modeling it for insights. As such, it becomes important to have tools that make data manipulation faster and easier. In today’s post, we introduce you to dplyr, a grammar of data manipulation. Libraries, Code & Data: We will use the following libraries: dplyr and readr. The data...

Read more »

Import Data into R – Part 2

August 10, 2018

Introduction: This is the second post in the series Importing Data into R. In the previous post, we explored reading data from flat/delimited files. In this post, we will: list sheets in an Excel file, read data from an Excel sheet, read specific cells from an Excel sheet, read specific rows, read specific columns, and read data from SAS, SPSS and Stata. Libraries, Data & Code: We will use the readxl package....

Read more »

Import Data into R – Part 1

July 29, 2018

Introduction: In this post, we will learn to: read data from flat or delimited files, handle column names/header, skip text/info present before data, specify column/variable types, and read specific columns/variables. Libraries, Data & Code: We will use the readr package. The data sets can be downloaded from here and the code from here.

library(readr)

Types of Delimiters: Before we start reading data from files, let us take a quick look at the different types...

Read more »

ggplot2: Themes

May 6, 2018

Introduction: This is the last post in the series Elegant Data Visualization with ggplot2. In the previous post, we learnt to combine multiple plots. In this post, we will learn to modify the appearance of all non-data components of the plot, such as: axis, legend, panel, plot area, background, margin and facets. Libraries, Code & Data: We will use the following libraries in this post: readr and ggplot2. All the data sets used in this post...

Read more »

ggplot2: Faceting

April 24, 2018

Introduction: This is the 19th post in the series Elegant Data Visualization with ggplot2. In the previous post, we learnt to modify the title, label and bar of a legend. In this post, we will learn about faceting, i.e. combining plots. Libraries, Code & Data: We will use the following libraries in this post: readr and ggplot2. All the data sets used in this post can be found here...

Read more »

ggplot2: Legend – Part 6

April 12, 2018

Introduction: This is the 18th post in the series Elegant Data Visualization with ggplot2. In the previous post, we learnt how to modify the legend of a plot when alpha is mapped to a categorical variable. In this post, we will learn to modify the legend title, label and bar. So far, we have learnt to modify the components of a legend using the scale_* family of functions. Now, we...

Read more »

ggplot2: Legend – Part 5

March 31, 2018

Introduction: This is the 17th post in the series Elegant Data Visualization with ggplot2. In the previous post, we learnt how to modify the legend of a plot when size is mapped to a continuous variable. In this post, we will learn to modify the following using scale_alpha_continuous() when alpha or transparency is mapped to variables: title, breaks, limits, range, labels and values. Libraries, Code & Data: We will use the following libraries in this...

Read more »
