# Basics of data.table: Smooth data exploration

August 23, 2017

The `data.table` package is one of the fastest tools for data wrangling in R. Its syntax is concise and designed to resemble SQL. After studying the basics of `data.table` and completing this exercise set, you will be able to start using `data.table` for your data manipulation needs.

We will use data drawn from the 1980 US Census on married women aged 21–35 with two or more children. The data include the gender of the first and second child, whether the woman had more than two children, her race and age, and the number of weeks she worked in 1979. For more information, please refer to the reference manual for the `AER` package.

Exercise 1
Load the `data.table` package. Furthermore, (install and) load the `AER` package and run the command `data("Fertility")`, which loads the dataset `Fertility` into your workspace. Turn it into a `data.table` object.
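
As a hint, here is a minimal sketch of the two usual ways to turn a data frame into a `data.table`. It uses a small made-up data frame (`df`, with hypothetical columns `id` and `score`) rather than `Fertility`, so the exercise itself is left to you.

```r
library(data.table)

# Hypothetical toy data frame standing in for a real dataset
df <- data.frame(id = 1:3, score = c(10, 20, 30))

# as.data.table() returns a new data.table and leaves df as a data.frame
dt <- as.data.table(df)

# setDT() converts df itself to a data.table by reference (no copy)
setDT(df)

is.data.table(dt)
is.data.table(df)
```

`setDT()` is usually the cheaper choice for large data because it avoids copying.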

Exercise 2
Select rows 35 to 50 and print their `age` and `work` entries to the console.
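
The general pattern is `DT[i, j]`: row positions go in `i` and the columns to return go in `j`. A small sketch on made-up data (the values below are hypothetical, only the column names mirror the exercise):

```r
library(data.table)

# Hypothetical toy data; values are invented for illustration
toy <- data.table(
  age  = c(21, 25, 30, 33, 35),
  work = c(0, 40, 52, 10, 4),
  id   = 1:5
)

# Rows 2 to 4, keeping only the age and work columns
# .() is data.table's shorthand for list()
res <- toy[2:4, .(age, work)]
print(res)
```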

Exercise 3
Select the last row in the dataset and print it to the console.
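
One way to approach this is the special symbol `.N`, which holds the number of rows and can be used directly in `i`. A sketch on hypothetical toy data:

```r
library(data.table)

# Hypothetical toy data; values are invented for illustration
toy <- data.table(age = c(21, 25, 30), work = c(0, 40, 52))

# .N is the row count, so toy[.N] picks the last row
last_row <- toy[.N]
print(last_row)
```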

Exercise 4
Count how many women proceeded to have a third child.
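
Counting rows that satisfy a condition typically combines a filter in `i` with `.N` in `j`. The sketch below assumes a `morekids` column coded `"yes"`/`"no"`, and the toy values are invented:

```r
library(data.table)

# Hypothetical toy data; the "yes"/"no" coding is an assumption
toy <- data.table(morekids = c("yes", "no", "yes", "no", "yes"))

# Filter in i, then count the remaining rows with .N in j
n_yes <- toy[morekids == "yes", .N]
print(n_yes)
```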

Exercise 5
There are four possible gender combinations for the first two children. Which is the most common? Use the `by` argument.
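
A sketch of grouped counting with `by`, followed by a chained sort so the largest group comes first. The data are made up; only the column names `gender1` and `gender2` are taken from the dataset description.

```r
library(data.table)

# Hypothetical toy data; values are invented for illustration
toy <- data.table(
  gender1 = c("male", "male", "female", "male", "female", "male"),
  gender2 = c("female", "female", "male", "male", "male", "female")
)

# Count rows per (gender1, gender2) combination, then sort by descending count
counts <- toy[, .N, by = .(gender1, gender2)][order(-N)]
print(counts)
```

Chaining a second `[...]` onto the result keeps the whole query in one expression.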

Exercise 6
By racial composition, what is the proportion of women who worked four weeks or less in 1979?

Exercise 7
Use `%between%` to subset the women aged between 22 and 24, and calculate the proportion who had a boy as their firstborn.
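
As a hint, `x %between% c(lower, upper)` tests whether each value lies in the range, with both bounds included. Taking the mean of a logical vector then gives a proportion. The toy values and the `"male"`/`"female"` coding below are assumptions:

```r
library(data.table)

# Hypothetical toy data; values are invented for illustration
toy <- data.table(
  age     = c(21, 22, 23, 24, 25),
  gender1 = c("male", "male", "female", "male", "female")
)

# Keep ages 22 to 24 (inclusive), then take the share of rows with a male firstborn
p_boy <- toy[age %between% c(22, 24), mean(gender1 == "male")]
print(p_boy)
```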

Exercise 8
Add a new column, age squared, to the dataset.
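
New columns are usually added with the `:=` operator, which modifies the table by reference instead of creating a copy. A sketch on hypothetical toy data (the column name `age_sq` is just an illustrative choice):

```r
library(data.table)

# Hypothetical toy data; values are invented for illustration
toy <- data.table(age = c(2, 3, 4))

# := adds the column in place; no reassignment with <- is needed
toy[, age_sq := age^2]
print(toy)
```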

Exercise 9
Which racial composition in the dataset had the lowest proportion of boys as the firstborn? In the same command, also display the number of observations in each category.
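
Several summaries can be computed per group by returning a list in `j`, and `.N` gives the group size. The sketch below uses a single hypothetical grouping column `group`; in the real dataset the racial composition is spread over several columns, which you would list in `by` instead.

```r
library(data.table)

# Hypothetical toy data; group labels and values are invented
toy <- data.table(
  group   = c("a", "a", "a", "b", "b"),
  gender1 = c("male", "male", "female", "female", "female")
)

# Per group: proportion of male firstborns and number of observations,
# sorted so the lowest proportion comes first
res <- toy[, .(prop_boy = mean(gender1 == "male"), n = .N), by = group][order(prop_boy)]
print(res)
```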

Exercise 10
Calculate the proportion of women who had a third child, by gender combination of the first two children.
