The data.table package provides perhaps the fastest way to wrangle data in R. Its syntax is concise and designed to resemble SQL. After studying the basics of data.table and completing this exercise set, you will be able to start using data.table for all your data manipulation needs.
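As a rough illustration of the SQL resemblance, the sketch below uses a small made-up table (the column names `group` and `value` are assumptions for this example, not part of the exercise data) to show the general `dt[i, j, by]` form:

```r
library(data.table)

# Hypothetical toy data, not the Fertility dataset
dt <- data.table(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 2, 3, 4, 5)
)

# General form: dt[i, j, by]
#   i  roughly corresponds to WHERE    (which rows to keep)
#   j  roughly corresponds to SELECT   (what to return or compute)
#   by roughly corresponds to GROUP BY (how to group the computation)
res <- dt[value > 1, .(total = sum(value)), by = group]
print(res)
```

The `.()` wrapper in `j` is an alias for `list()` and returns the result as a data.table with named columns.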
We will use data drawn from the 1980 US Census on married women aged 21–35 with two or more children. The data include the gender of the first and second child, as well as whether the woman had more than two children, her race, her age, and the number of weeks she worked in 1979. For more information, please refer to the reference manual for the AER package.
Answers are available here.
Load the data.table package. Furthermore, (install and) load the AER package and run the command data("Fertility"), which loads the dataset Fertility into your workspace. Turn it into a data.table object.
Select rows 35 to 50 and print their age and work entries to the console.
Select the last row in the dataset and print it to the console.
Count how many women proceeded to have a third child.
There are four possible gender combinations for the first two children. Which is the most common? Use the by argument.
By racial composition, what is the proportion of women who worked four weeks or less in 1979?
Use %between% to get a subset of women aged between 22 and 24, and calculate the proportion who had a boy as their firstborn.
Add a new column, age squared, to the dataset.
Out of all the racial compositions in the dataset, which had the lowest proportion of boys for their firstborn? With the same command, display the number of observations in each category as well.
Calculate the proportion of women who have a third child by gender combination of the first two children.
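The exercises above rely on a handful of data.table idioms. The following is a minimal sketch of those idioms on a made-up data frame (the columns `id`, `score` and `flag` are hypothetical and chosen only for illustration), so it shows the syntax without solving the exercises themselves:

```r
library(data.table)

# Hypothetical toy data frame; column names are illustrative only
df <- data.frame(
  id    = 1:6,
  score = c(10, 25, 31, 47, 52, 68),
  flag  = c("yes", "no", "yes", "yes", "no", "no")
)

dt <- as.data.table(df)   # convert a data.frame to a data.table

dt[2:4, .(id, score)]     # rows 2 to 4, two columns only
dt[.N]                    # last row; .N holds the number of rows
dt[flag == "yes", .N]     # count the rows that match a condition

# %between% keeps values within the range, inclusive on both ends by default
dt[score %between% c(25, 50), mean(flag == "yes")]

# := adds a column by reference, so no reassignment is needed
dt[, score_sq := score^2]

# Several summaries per group in one call: a proportion and a group size
summary_dt <- dt[, .(prop_high = mean(score > 30), n = .N), by = flag]
print(summary_dt)
```

Taking the mean of a logical vector gives the proportion of TRUE values, which is why `mean(condition)` is a convenient way to compute proportions inside `j`.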