Earnings of Immigrants

Posted on August 14, 2020 by R on Sam Portnow's Blog in R bloggers | 0 Comments

[This article was first published on R on Sam Portnow's Blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Introduction

For this post, I want to look at the earnings of immigrants by the year that they immigrate to the United States. I want to to this because I am interested in looking to see if the earnings of immigrants are declining over time. I hear a constant debate in the media on whether or not the American Dream is dead. I thought that looking into the earnings of immigrants is a particularly good way to get an answer to this debate. More specifically, I want to look at the income of immigrants from the same country at the same age who have been in the US for the same amount of time, so that the I can see isolate the assocation of year of immigration on income. If the American Dream is truly dying, then immigrants who arrive in the US later will earn less on average than similar immigrants who came before them.

knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(warning = FALSE)
knitr::opts_chunk$set(message = FALSE)

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.2     ✓ purrr   0.3.4
## ✓ tibble  3.0.3     ✓ dplyr   1.0.0
## ✓ tidyr   1.1.0     ✓ stringr 1.4.0
## ✓ readr   1.3.1     ✓ forcats 0.5.0
## Warning: package 'ggplot2' was built under R version 3.6.2
## Warning: package 'tibble' was built under R version 3.6.2
## Warning: package 'tidyr' was built under R version 3.6.2
## Warning: package 'purrr' was built under R version 3.6.2
## Warning: package 'dplyr' was built under R version 3.6.2
## ── Conflicts ────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(here)
## here() starts at /Users/samportnow/sam-portnow-website
library(fs)
## Warning: package 'fs' was built under R version 3.6.2
library(ipumsr)
## Warning: package 'ipumsr' was built under R version 3.6.2
library(skimr)
## Warning: package 'skimr' was built under R version 3.6.2
library(janitor)
## Warning: package 'janitor' was built under R version 3.6.2
## 
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test

Look at the data

ddi <- read_ipums_ddi(here::here('content', 'post_data', 'cps', "cps_00010.xml"))
data <- read_ipums_micro(ddi)
## Use of data from IPUMS CPS is subject to conditions including that users should
## cite the data appropriately. Use command `ipums_conditions()` for more details.
data %>% glimpse()
## Rows: 9,223,447
## Columns: 28
## $ YEAR         <dbl> 1962, 1962, 1962, 1962, 1962, 1962, 1962, 1962, 1962, 19…
## $ SERIAL       <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 8, 9, 9, 10, 10, 11, 12, 12, 13,…
## $ MONTH        <int+lbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3…
## $ CPSID        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ ASECFLAG     <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ HFLAG        <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ ASECWTH      <dbl> 1494.99, 1568.53, 6443.03, 1473.97, 1538.21, 6427.43, 65…
## $ REGION       <int+lbl> 11, 11, 12, 11, 11, 12, 12, 12, 12, 12, 12, 12, 12, …
## $ PERNUM       <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 2, 1, 1, 2, 1, 2, 3,…
## $ CPSIDP       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ ASECWT       <dbl> 1494.99, 1568.53, 6443.03, 1473.97, 1538.21, 6427.43, 65…
## $ UH_ETHNIC_A1 <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ AGE          <int+lbl> 18, 14, 72, 29, 21, 60, 56, 20, 20, 20, 20, 19, 19, …
## $ SEX          <int+lbl> 2, 1, 1, 1, 1, 2, 1, 2, 2, 2, 2, 2, 2, 2, 1, 2, 1, 2…
## $ MARST        <int+lbl> 6, 6, 6, 1, 1, 6, 5, 6, 6, 6, 6, 6, 6, 6, 1, 1, 1, 1…
## $ NCHILD       <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ YNGCH        <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ BPL          <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ YRIMMIG      <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ MBPL         <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ FBPL         <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ NATIVITY     <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ OCC2010      <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ EDUC99       <int+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ FTOTVAL      <dbl+lbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ INCTOT       <dbl+lbl>         0, 999999999, 999999999,      1692,      152…
## $ INCWAGE      <dbl+lbl>        0, 99999999, 99999999,     1692,     1522,   …
## $ UH_FMIED_A1  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …

I want to recode some variables to numeric

data =  data %>%
  mutate_at(vars(YRIMMIG, INCTOT, INCWAGE, AGE), as.double)

Clean Data

I’ll need to adjust income so it’s in the same scale for everyone. To do taht. I’ll scale all dollars to 2020 dollars. Advice for this scaling is taken from this stackoverflow post, and this blog post

require(lubridate)
monthly_cpi <-
  read.csv("https://fred.stlouisfed.org/graph/fredgraph.csv?bgcolor=%23e1e9f0&chart_type=line&drp=0&fo=open%20sans&graph_bgcolor=%23ffffff&height=450&mode=fred&recession_bars=on&txtcolor=%23444444&ts=12&tts=12&width=1168&nt=0&thu=0&trc=0&show_legend=yes&show_axis_titles=yes&show_tooltip=yes&id=CPIAUCSL&scale=left&cosd=1947-01-01&coed=2020-07-01&line_color=%234572a7&link_values=false&line_style=solid&mark_type=none&mw=3&lw=2&ost=-99999&oet=99999&mma=0&fml=a&fq=Monthly&fam=avg&fgst=lin&fgsnd=2020-02-01&line_index=1&transformation=lin&vintage_date=2020-08-15&revision_date=2020-08-15&nd=1947-01-01", header = TRUE) %>% mutate(DATE = as.character(DATE))
monthly_cpi$cpi_year <- year(monthly_cpi$DATE)
yearly_cpi <- monthly_cpi %>% group_by(cpi_year) %>% summarize(cpi = mean(CPIAUCSL))
yearly_cpi %>% distinct(cpi_year)
## # A tibble: 74 x 1
##    cpi_year
##       <dbl>
##  1     1947
##  2     1948
##  3     1949
##  4     1950
##  5     1951
##  6     1952
##  7     1953
##  8     1954
##  9     1955
## 10     1956
## # … with 64 more rows
data = left_join(data, yearly_cpi, by = c('YEAR' = 'cpi_year'))
data$INCWAGEAdjusted = (data$INCWAGE/data$cpi) * 100
data$INCTOTAdjusted = (data$INCTOT/data$cpi) * 100
data$FTOTVALAdjusted = (data$FTOTVAL/data$cpi) * 100

The big problem with income is that it is not normally distributed, as we can see below.

library(ggthemes)

ggplot(data, aes(x = INCTOT)) + geom_density() + theme_hc()

The upper values are top-coded to protect identity; I’m going to get rid of them because they are clearly outliers from the rest of the data (and also measured imprecisely).

data = data %>%
  filter(INCTOT < 999999998)

I also only want people who are working.

data = data %>%
  filter(INCWAGE > 0)

And focus on people with a valid immigration year.

data = data %>% filter(YRIMMIG > 0)

Still, income is not normaly distributed. I am going to use a suggestion from Andrew Gelman: rank transform income, and then take the inverse normal to get a normal distribution

ranking = rank(data$INCWAGEAdjusted, ties.method = "random")
p = ranking / ( length(ranking) + 1 )
data$INCWAGEInverse = qnorm( p, 0, 1 );

Let’s also take a look at the years people immigrated

ggplot(data, aes(x = YRIMMIG)) + geom_density() + theme_hc()

And from where

library(gt)

lkup = ipums_val_labels(data, BPL) %>% rename(BirthPlace = lbl)

data = left_join(data, lkup, by = c('BPL' = 'val'))

And I only want to focus on immigrants.

data = data %>% 
  filter(! is.na(BirthPlace)  & BirthPlace != 'United States, n.s.')

I am going to bucket these birth places into regions. I’ll use the groupings of the World Bank.

library(rvest)

data$Region = if_else(data$BirthPlace == 'United States, n.s.' | is.na(data$BirthPlace), 'United States', NA_character_)

regions =   c("https://data.worldbank.org/region/east-asia-and-pacific",
              "https://data.worldbank.org/region/europe-and-central-asia",
              "https://data.worldbank.org/region/latin-america-and-caribbean",
              "https://data.worldbank.org/region/middle-east-and-north-africa",
              "https://data.worldbank.org/region/north-america",
              "https://data.worldbank.org/region/south-asia",
              "https://data.worldbank.org/region/sub-saharan-africa")

regions_dfr = map_dfr(regions, function(x){
   html(x) %>% html_nodes("li") %>%
    html_text() %>% tibble() %>%
    mutate(region = x)
})


names(regions_dfr)[1] = 'Country'

data = left_join(data, regions_dfr, by = c('BirthPlace' = 'Country')) 


data = data %>%
  mutate(Region = case_when(
    ! is.na(Region) ~ Region,
    region == "https://data.worldbank.org/region/east-asia-and-pacific" ~ 'East Asia/Pacific',
    region == "https://data.worldbank.org/region/europe-and-central-asia" ~ 'Europe/Central Asia',
    region == "https://data.worldbank.org/region/latin-america-and-caribbean" ~ 'Latin America/Caribbean',
    region == "https://data.worldbank.org/region/middle-east-and-north-africa" ~ 'Middle East/North Africa',
    region == "https://data.worldbank.org/region/north-america" ~ 'North America',
    region == "https://data.worldbank.org/region/south-asia" ~ 'South Asia',
    region == "https://data.worldbank.org/region/sub-saharan-africa" ~ 'Sub-Saharan Africa')
  )


data = data %>%
  mutate(
    Region = case_when(
      ! is.na(Region) ~ Region,
      str_detect(BirthPlace, 'Africa|Cape Verde|Ivory Coast|Congo|Zaire') ~ 'Sub-Saharan Africa',
      str_detect(BirthPlace, 'Asia|Pacific Islands|Taiwan|Hong Kong|Burma|Micronesia') ~ 'East Asia/Pacific',
      BirthPlace == 'Bahamas' ~ 'Latin America/Caribbean',
      str_detect(BirthPlace, 'Caribbean|Belize|Venezuela|St\\. Kitts|St\\. Vincent|South America') ~ 'Latin America/Caribbean',
      str_detect(BirthPlace, 'Central America|Guyana') ~ 'Latin America/Caribbean',
      str_detect(BirthPlace, 'Europe|England|Azores|Czechoslavakia|Macedonia|Scotland|Yugoslavia|Northern Ireland|USSR|United Kingdom|Slovakia|Wales|Montenego|Germany') ~ 'Europe/Central Asia',
      str_detect(BirthPlace, 'Egypt/United Arab Rep|Middle East|Palestine|Syria|Yemen|Iran') ~ 'Middle East/North Africa',
      str_detect(BirthPlace, 'Korea|Laos|Cambodia') ~ 'East Asia/Pacific',
      BirthPlace %in% c('U.S. outlying areas, n.s.', 'U.S. Virgin Islands', 'North America, n.s.|Canada') ~ 'North America',
      ! is.na(Region) ~ Region,
      TRUE ~ 'Other'
      
    )
  )
library(janitor)

First, I’ll bucket age, years of immigration, and length of time in the US.

data = data %>%
  mutate(
    AgeBucket = case_when(
      AGE < 18 ~ 'Less than 18',
      AGE < 25 ~ '18 - 24',
      AGE < 35 ~ '25 - 34',
      AGE < 45 ~ '35 - 44',
      AGE < 55 ~ '45 - 54',
      AGE < 65 ~ '55 - 64',
      AGE < 100 ~ '65+'
    )
  )

data$AgeBucket = factor(data$AgeBucket, levels = c(
  'Less than 18',
  '18 - 24',
  '25 - 34',  
  '35 - 44',  
  '45 - 54',  
  '55 - 64',  
  '65+'
))

data = data %>%
  mutate(
    ImmigrationYearBucket = case_when(
      YRIMMIG < 1955 ~ 'Before 1955',
      YRIMMIG < 1965 ~ '1955 - 1964',
      YRIMMIG < 1975 ~ '1965 - 1974',
      YRIMMIG < 1985 ~ '1975 - 1984',
      YRIMMIG < 1995 ~ '1985 - 1994',
      YRIMMIG < 2005 ~ '1995 - 2004',
      YRIMMIG < 2015 ~ '2005 - 2014',
      YRIMMIG >= 2015  ~ '2015 or after',
      TRUE ~ 'Born in US'
    )
  )

data$ImmigrationYearBucket = factor(data$ImmigrationYearBucket, 
                                    levels = c('Born in US', 'Before 1955', '1955 - 1964',
                                               '1965 - 1974', '1975 - 1984',
                                               '1985 - 1994', '1995 - 2004',
                                               '2005 - 2014', '2015 or after'))

data$YearsInUS = data$YEAR - data$YRIMMIG

data = data %>%
  mutate(
    YearsUSBucket = 
      case_when(
        YearsInUS < 10 ~ 'Less than 10',
        YearsInUS < 20 ~ '10-19',
        YearsInUS < 30 ~ '20-29',
        YearsInUS < 40 ~ '30-39',
        YearsInUS < 50 ~ '40-49',
        YearsInUS < 60 ~ '50-59',
        YearsInUS >= 60 ~ '60+',
        TRUE ~ 'Born in US'
        )
  )

data$YearsUSBucket = factor(data$YearsUSBucket, levels = c('Born in US', 'Less than 10', '10-19', '20-29', '30-39', '40-49', '50-59', '60+'))

There are two other variables here that are potentially important: Education and Gender. It is possible that the composition of wage-earning immigrants is changing. Maybe they are less educated and earn less, maybe a higher share are woman who earn less than their male counterparts.

I’ll recode education.

ipums_val_labels(data, EDUC99) %>% gt()

val	lbl
0	NIU
1	No school completed
4	1st-4th grade
5	5th-8th grade
6	9th grade
7	10th grade
8	11th grade
9	12th grade, no diploma
10	High school graduate, or GED
11	Some college, no degree
12	Associate degree, type of program not specified
13	Associate degree, occupational program
14	Associate degree, academic program
15	Bachelors degree
16	Masters degree
17	Professional degree
18	Doctorate degree

data = data %>%
  mutate(
    Education = case_when(
      EDUC99 == 0 ~ 'Missing',
      EDUC99 < 10 ~ 'Less than high school',
      EDUC99 == 10 ~ 'High school',
      EDUC99 < 15 ~ 'Some college',
      EDUC99 == 15 ~ 'College',
      TRUE ~ 'Postgraduate'
    )
  )

data$Education = factor(data$Education, levels = c('Less than high school', 'High school', 
                                                   'Some college', 'College', 'Postgraduate', 'Missing'))

data$Gender = if_else(data$SEX == 1, 'Male', 'Female')

Years in the US and Immigration Year are going to be tough to model together because they are clearly dependent. I am going to combine them into a single variable.

data = data %>% 
  mutate(
    YearsUSImmigration = paste0(YearsUSBucket, '-', ImmigrationYearBucket)
  )


data$YearsUSImmigration = factor(data$YearsUSImmigration, levels = c(
  'Born in US-Born in US',
  '60+-Before 1955', '60+-1955 - 1964',
  '50-59-Before 1955', '50-59-1955 - 1964', '50-59-1965 - 1974', '50-59-1975 - 1984',
  '40-49-Before 1955', '40-49-1955 - 1964', '40-49-1965 - 1974', '40-49-1975 - 1984',
  '30-39-1955 - 1964', '30-39-1965 - 1974', '30-39-1975 - 1984', '30-39-1985 - 1994',
  '20-29-1965 - 1974' , '20-29-1975 - 1984', '20-29-1985 - 1994', '20-29-1995 - 2004',
  '10-19-1975 - 1984',  '10-19-1985 - 1994', '10-19-1995 - 2004',  '10-19-2005 - 2014',
  'Less than 10-1985 - 1994', 'Less than 10-1995 - 2004',  'Less than 10-2005 - 2014', 'Less than 10-2015 or after' 
))

The Model

Finally, I am going to run a model; I chose to run a mixed effect model with the following fixed effects:

Year of Survey
Age (Factor)
Education
Gender
Region of Birth

and the following random effect:

Year of Immigration + Years in US (factor)

As shown above, Year of Immigration and Years in US are far too dependent to include in the model together, so I combined them into a single varialble.

library(lme4)
library(sjPlot)

data$YEAR = as.factor(data$YEAR)

slope_mod = lmer(INCWAGEInverse ~ YEAR + AgeBucket + Education + Gender + Region +
                   (1|YearsUSImmigration), 
           data = data)
summary(slope_mod)
## Linear mixed model fit by REML ['lmerMod']
## Formula: INCWAGEInverse ~ YEAR + AgeBucket + Education + Gender + Region +  
##     (1 | YearsUSImmigration)
##    Data: data
## 
## REML criterion at convergence: 903630.8
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -6.5654 -0.5760  0.0794  0.6587  6.0671 
## 
## Random effects:
##  Groups             Name        Variance Std.Dev.
##  YearsUSImmigration (Intercept) 0.02334  0.1528  
##  Residual                       0.65242  0.8077  
## Number of obs: 374637, groups:  YearsUSImmigration, 25
## 
## Fixed effects:
##                                 Estimate Std. Error t value
## (Intercept)                    -1.998775   0.036001 -55.521
## YEAR1995                        0.038723   0.011995   3.228
## YEAR1996                        0.026621   0.012313   2.162
## YEAR1997                        0.041008   0.012176   3.368
## YEAR1998                        0.095301   0.012153   7.842
## YEAR1999                        0.126048   0.012601  10.003
## YEAR2000                        0.145464   0.012380  11.750
## YEAR2001                        0.202631   0.011864  17.080
## YEAR2002                        0.226440   0.011857  19.097
## YEAR2003                        0.210559   0.012194  17.268
## YEAR2004                        0.216849   0.012464  17.399
## YEAR2005                        0.226757   0.012556  18.060
## YEAR2006                        0.243310   0.012457  19.532
## YEAR2007                        0.261154   0.012497  20.898
## YEAR2008                        0.250812   0.012521  20.031
## YEAR2009                        0.242780   0.012769  19.013
## YEAR2010                        0.214823   0.012752  16.846
## YEAR2011                        0.178449   0.013066  13.658
## YEAR2012                        0.184309   0.013091  14.079
## YEAR2013                        0.200464   0.013323  15.046
## YEAR2014                        0.227658   0.013418  16.967
## YEAR2015                        0.259999   0.013498  19.261
## YEAR2016                        0.298371   0.013580  21.971
## YEAR2017                        0.315563   0.013669  23.086
## YEAR2018                        0.326378   0.013714  23.800
## YEAR2019                        0.347501   0.014060  24.716
## AgeBucket18 - 24                0.659285   0.015686  42.031
## AgeBucket25 - 34                1.198425   0.015318  78.237
## AgeBucket35 - 44                1.345345   0.015314  87.853
## AgeBucket45 - 54                1.339351   0.015431  86.796
## AgeBucket55 - 64                1.257939   0.015726  79.994
## AgeBucket65+                    0.919224   0.017060  53.883
## EducationHigh school            0.255219   0.003743  68.184
## EducationSome college           0.394912   0.004174  94.613
## EducationCollege                0.820474   0.004476 183.314
## EducationPostgraduate           1.244865   0.005205 239.158
## GenderMale                      0.466550   0.002686 173.707
## RegionEurope/Central Asia       0.049216   0.005115   9.622
## RegionLatin America/Caribbean  -0.090614   0.003829 -23.668
## RegionMiddle East/North Africa -0.047811   0.009549  -5.007
## RegionNorth America             0.082386   0.009204   8.951
## RegionOther                    -0.043507   0.010585  -4.110
## RegionSouth Asia                0.094108   0.006703  14.040
## RegionSub-Saharan Africa       -0.069107   0.008259  -8.367
plot_model(slope_mod, type="re",
           vline.color="#A9A9A9", dot.size=1.5,
           show.values=T, value.offset=.2, grid = F) +   
  theme_hc() + 
  theme(strip.text= element_text(size = 6)) + 
  theme(text = element_text(size=6),
        axis.text = element_text(size = 6))

From this model, there’s a pretty clear pattern of later immigrants who came to the United States later making less money. However, it is also the case that immigrants make more money the longer they stay, so it is possible that for immigrants who came to the Untied States more recently might catch up to those who came before, but the downward trajectory is not promising. For example, look at the earnings of immigrants who have been in the United States 10 years or less. Those that came between 1984-1994 are twice as high in the distribution as those who came from 2005-2014 and 2014 and after.

Another way to specify this model is random slopes. I can look at the average income by year of immigration for each group, and also look at the growth in their income by the number of years they are in the US. In order to implement this model, I want the years in the US to always start at 0. To do this, for each immigration year group, I will subtract years in the US from the minimum of years of each immigration year group.

data = data %>% 
  group_by(ImmigrationYearBucket) %>%
  mutate(
    YearsInUS = YearsInUS - min(YearsInUS, na.rm =T)
  ) %>% ungroup()
slopes_mod = lmer(INCWAGEInverse ~ YEAR + AgeBucket + Education + Gender + Region +
                   (YearsInUS|ImmigrationYearBucket), 
           data = data)
summary(slopes_mod)
## Linear mixed model fit by REML ['lmerMod']
## Formula: INCWAGEInverse ~ YEAR + AgeBucket + Education + Gender + Region +  
##     (YearsInUS | ImmigrationYearBucket)
##    Data: data
## 
## REML criterion at convergence: 902972.1
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -6.5682 -0.5752  0.0797  0.6589  6.1205 
## 
## Random effects:
##  Groups                Name        Variance  Std.Dev. Corr 
##  ImmigrationYearBucket (Intercept) 0.1082718 0.32905       
##                        YearsInUS   0.0003524 0.01877  -0.94
##  Residual                          0.6513435 0.80706       
## Number of obs: 374637, groups:  ImmigrationYearBucket, 8
## 
## Fixed effects:
##                                 Estimate Std. Error t value
## (Intercept)                    -1.837684   0.044461 -41.333
## YEAR1995                        0.032931   0.011977   2.750
## YEAR1996                        0.013062   0.012352   1.057
## YEAR1997                        0.023647   0.012242   1.932
## YEAR1998                        0.067593   0.012323   5.485
## YEAR1999                        0.101852   0.012425   8.197
## YEAR2000                        0.110926   0.012350   8.982
## YEAR2001                        0.169091   0.011645  14.520
## YEAR2002                        0.181732   0.011826  15.367
## YEAR2003                        0.163069   0.012054  13.528
## YEAR2004                        0.155467   0.012275  12.666
## YEAR2005                        0.159494   0.012679  12.579
## YEAR2006                        0.166298   0.012786  13.007
## YEAR2007                        0.179626   0.013034  13.781
## YEAR2008                        0.155134   0.013344  11.626
## YEAR2009                        0.144282   0.013646  10.573
## YEAR2010                        0.101447   0.013944   7.276
## YEAR2011                        0.062841   0.014253   4.409
## YEAR2012                        0.054114   0.014589   3.709
## YEAR2013                        0.064578   0.014870   4.343
## YEAR2014                        0.077910   0.015182   5.132
## YEAR2015                        0.103419   0.015639   6.613
## YEAR2016                        0.129696   0.015970   8.121
## YEAR2017                        0.143118   0.016272   8.795
## YEAR2018                        0.135785   0.016642   8.159
## YEAR2019                        0.154107   0.016971   9.081
## AgeBucket18 - 24                0.661649   0.015664  42.239
## AgeBucket25 - 34                1.196519   0.015296  78.224
## AgeBucket35 - 44                1.338745   0.015298  87.513
## AgeBucket45 - 54                1.332045   0.015409  86.448
## AgeBucket55 - 64                1.250018   0.015707  79.581
## AgeBucket65+                    0.912475   0.017044  53.537
## EducationHigh school            0.254350   0.003740  68.008
## EducationSome college           0.392914   0.004171  94.196
## EducationCollege                0.821143   0.004472 183.610
## EducationPostgraduate           1.245879   0.005201 239.549
## GenderMale                      0.467229   0.002684 174.088
## RegionEurope/Central Asia       0.047550   0.005112   9.302
## RegionLatin America/Caribbean  -0.091784   0.003825 -23.993
## RegionMiddle East/North Africa -0.047800   0.009541  -5.010
## RegionNorth America             0.080419   0.009196   8.745
## RegionOther                    -0.042912   0.010575  -4.058
## RegionSouth Asia                0.094741   0.006697  14.147
## RegionSub-Saharan Africa       -0.068191   0.008252  -8.264
## convergence code: 0
## Model failed to converge with max|grad| = 1.08517 (tol = 0.002, component 1)
plot_model(slopes_mod, type="re",
           vline.color="#A9A9A9", dot.size=1.5,
           show.values=T, value.offset=.2) +   
  theme_hc() + 
  theme(strip.text= element_text(size = 6)) + 
  theme(text = element_text(size=6),
        axis.text = element_text(size = 6))

Again, we see a consistent decline over time. There is some evidence that immigrant groups who came in later years are having their incomes grow at a faster rate, but it remains to be seen if the growth rate is substantial enough for them to catch up to their counterparts who immigrated earlier.

To leave a comment for the author, please follow the link and comment on their blog: R on Sam Portnow's Blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

R-bloggers

R news and tutorials contributed by hundreds of R bloggers

Earnings of Immigrants

Introduction

Look at the data

Clean Data

The Model

Related

Introduction

Look at the data

Clean Data

The Model

Related

Never miss an update! Subscribe to R-bloggers to receive e-mails with the latest R posts. (You will not see this message again.)

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)