The significance of experience on the salary in Sweden, a comparison between different occupational groups

[This article was first published on R Analystatistics Sweden, and kindly contributed to R-bloggers].

In my last post, I found that experience has a significant impact on the salary of engineers. Is the significance of experience on wages unique to engineers or are there similar correlations in other occupational groups?

I will use the same model in principle as in my previous post to calculate the significance of age. I will not use sex as an explanatory variable since some occupational groups do not have enough data for both genders. I will also use a polynomial of degree three in age, since this provides a significant model fit for some occupational groups.
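Written out, the model fitted for each occupational group can be sketched as follows. The symbols are my own notation and not part of the original analysis: \beta_1 is the yearly effect, and P_1, P_2, P_3 stand for the three orthogonal polynomial terms that `poly(age_n, 3)` generates from the age midpoint.

```latex
\log(\text{salary}_i) = \beta_0 + \beta_1\,\text{year}_i
  + \sum_{k=1}^{3} \gamma_k\,P_k(\text{age}_i) + \varepsilon_i
```

Because the response is the logarithm of the salary, \beta_1 is read as an approximate relative change in salary per year with age held constant.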

Some occupational groups still have too little data for regression analysis. I require more than 30 observations (rows) in a group to fit both age and year.
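As a minimal sketch of that row-count rule, the snippet below counts the rows per group in a made-up data frame and keeps only the groups with more than 30 rows. The data frame `toy` and its columns are hypothetical and only illustrate the filter; the real analysis applies the same threshold inside its loop.

```r
# Hypothetical long-format data: group "A" has 60 rows, group "B" only 12
toy <- data.frame(
  group  = c(rep("A", 60), rep("B", 12)),
  salary = 30000
)

# Count the rows per group and keep the groups that pass the threshold
n_rows <- table(toy$group)
keep   <- names(n_rows)[n_rows > 30]

keep
```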

The F-value from the Anova table is used as the single value to describe how strongly age and salary are related. For exploratory analysis, the F-value seems good enough.
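To make clear what this statistic for age measures, here is a small sketch on simulated data using only base R. The data, the effect sizes and the variable names are assumptions for illustration. In a model without interactions, the type II test for the age term compares the full model with a model that leaves out the age polynomial, so `car::Anova(full, type = 2)` should report the same F-value as the nested comparison below.

```r
set.seed(42)

# Simulated data (assumption): six age midpoints, five years, two sexes
d <- expand.grid(
  age_n  = c(21, 29.5, 39.5, 49.5, 59.5, 65.5),
  year_n = 2014:2018,
  sex    = c("men", "women")
)
d$salary <- exp(10 + 0.025 * (d$year_n - 2014) +
                0.04 * d$age_n - 0.0004 * d$age_n^2 +
                rnorm(nrow(d), sd = 0.02))

# Full model with the age polynomial, reduced model without it
full    <- lm(log(salary) ~ year_n + poly(age_n, 3), data = d)
reduced <- lm(log(salary) ~ year_n, data = d)

# F-test for dropping the three age terms
cmp    <- anova(reduced, full)
f_age  <- cmp$F[2]
df_age <- cmp$Df[2]

# The same F-value computed by hand from the residual sums of squares
rss_full <- sum(resid(full)^2)
rss_red  <- sum(resid(reduced)^2)
f_manual <- ((rss_red - rss_full) / df_age) / (rss_full / df.residual(full))
```

A large F-value means that the age terms remove much more unexplained variation than would be expected by chance, which is how the groups are ranked in this post.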

In the figure below I also use the estimate for year to see how much salaries are raised each year in the different occupational groups, holding age constant.
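Since the model is fitted on log(salary), the year estimate is a change in log salary per year. A short sketch of how such an estimate translates into a percentage raise is shown here, using the highest and lowest year estimates from Table 1 (0.0518758 and 0.0041737). The variable names are my own.

```r
# Year estimates from Table 1 (log scale)
year_estimate <- c(
  financial_managers = 0.0518758,
  cashiers           = 0.0041737
)

# Exact relative change implied by a log-scale coefficient, in percent
pct_exact <- 100 * (exp(year_estimate) - 1)

# For small coefficients the estimate itself is a close approximation
pct_approx <- 100 * year_estimate

round(rbind(pct_exact, pct_approx), 2)
```

For estimates of this size the two versions differ only in the second decimal, so reading the estimate directly as a fraction per year is a reasonable shortcut.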

library (tidyverse)
## -- Attaching packages -------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.2.0     v purrr   0.3.2
## v tibble  2.1.3     v dplyr   0.8.3
## v tidyr   0.8.3     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.4.0
## -- Conflicts ----------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library (broom)
library (car)
## Loading required package: carData
## 
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
## 
##     recode
## The following object is masked from 'package:purrr':
## 
##     some
library (polynom)

readfile <- function (file1){read_csv (file1, col_types = cols(), locale = readr::locale (encoding = "latin1"), na = c("..", "NA")) %>%
  gather (starts_with("19"), starts_with("20"), key = "year", value = salary) %>%
  drop_na() %>%
  mutate (year_n = parse_number (year))
}

The data table is downloaded from Statistics Sweden. It is saved as a comma-delimited file without heading, 000000D2.csv, http://www.statistikdatabasen.scb.se/pxweb/en/ssd/.

The table: Average basic salary, monthly salary and women´s salary as a percentage of men´s salary by sector, occupational group (SSYK 2012), sex and age. Year 2014 – 2018 Monthly salary All sectors

tb <- readfile("000000D2.csv") %>%
  rowwise() %>%
  mutate(age_l = unlist(lapply(strsplit(substr(age, 1, 5), "-"), strtoi))[1]) %>%
  rowwise() %>%
  mutate(age_h = unlist(lapply(strsplit(substr(age, 1, 5), "-"), strtoi))[2]) %>%
  mutate(age_n = (age_l + age_h) / 2)
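
The pipeline above turns each age bracket into a numeric midpoint. A vectorised base R sketch of the same idea is shown here, assuming the labels look like "18-24 years" with two-digit bounds. The example labels are assumptions, not values read from the file.

```r
# Assumed label format: two-digit lower bound, hyphen, two-digit upper bound
age <- c("18-24 years", "25-34 years", "65-66 years")

# Lower and upper bound of each bracket
age_l <- as.integer(substr(age, 1, 2))
age_h <- as.integer(substr(age, 4, 5))

# Midpoint used as the numeric age in the regression
age_n <- (age_l + age_h) / 2
age_n
```

Working on whole columns this way avoids the need for `rowwise()`, but it relies on the bounds always having two digits.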

# Start from NULL so that rbind() does not add a dummy first row of zeros
summary_table <- NULL
anova_table <- NULL

for (i in unique(tb$`occuptional  (SSYK 2012)`)){
  temp <- filter(tb, `occuptional  (SSYK 2012)` == i)
  if (dim(temp)[1] > 30){
    model <-lm (log(salary) ~ year_n + poly(age_n, 3), data = temp)
    summary_table <- rbind (summary_table, mutate (tidy (summary (model)), ssyk = i))
    anova_table <- rbind (anova_table, mutate (tidy (Anova (model, type = 2)), ssyk = i))
  }
}

merge(summary_table, anova_table, by = "ssyk", all = TRUE) %>%
  filter (term.y == "poly(age_n, 3)") %>%
  filter (term.x == "year_n") %>%
  ggplot () +
    geom_point (mapping = aes(x = estimate, y = statistic.y)) +
    labs(
      x = "Increase in salaries (% / year)",
      y = "F-value for age"
    ) 

Figure 1: The significance of experience on the salary in Sweden, a comparison between different occupational groups, Year 2014 – 2018

The table with all occupational groups sorted by F-value in descending order.

merge(summary_table, anova_table, by = "ssyk", all = TRUE) %>%
  filter (term.y == "poly(age_n, 3)") %>%
  filter (term.x == "year_n") %>%
  select (ssyk, estimate, statistic.y) %>%
  rename (`F-value for age` = statistic.y) %>%
  rename (`Increase in salary` = estimate) %>%
  arrange (desc (`F-value for age`)) %>%
  knitr::kable(
    booktabs = TRUE,
    caption = 'Correlation for F-value (age) and the yearly increase in salaries with age held as constant')
Table 1: Correlation for F-value (age) and the yearly increase in salaries with age held as constant
| ssyk | Increase in salary | F-value for age |
|:-----|-------------------:|----------------:|
| 234 Primary- and pre-school teachers | 0.0345563 | 1349.859088 |
| 233 Secondary education teachers | 0.0294574 | 861.331070 |
| 532 Personal care workers in health services | 0.0285338 | 800.259659 |
| 336 Police officers | 0.0284911 | 675.571576 |
| 223 Nursing professionals (cont.) | 0.0303955 | 625.404523 |
| 214 Engineering professionals | 0.0192393 | 612.414362 |
| 235 Teaching professionals not elsewhere classified | 0.0245885 | 578.686817 |
| 266 Social work and counselling professionals | 0.0316617 | 551.888399 |
| 221 Medical doctors | 0.0150176 | 449.792500 |
| 251 ICT architects, systems analysts and test managers | 0.0249600 | 415.590103 |
| 534 Attendants, personal assistants and related workers | 0.0191811 | 406.258604 |
| 231 University and higher education teachers | 0.0254827 | 404.602202 |
| 222 Nursing professionals | 0.0414071 | 371.107319 |
| 533 Health care assistants | 0.0205813 | 345.075594 |
| 531 Child care workers and teachers aides | 0.0219044 | 291.049608 |
| 351 ICT operations and user support technicians | 0.0211211 | 271.091961 |
| 159 Other social services managers | 0.0251218 | 191.570380 |
| 211 Physicists and chemists | 0.0207272 | 186.366824 |
| 321 Medical and pharmaceutical technicians | 0.0288946 | 177.137635 |
| 152 Managers in social and curative care | 0.0387001 | 164.636802 |
| 243 Marketing and public relations professionals | 0.0150173 | 154.310784 |
| 723 Machinery mechanics and fitters | 0.0204993 | 146.299981 |
| 125 Sales and marketing managers | 0.0187356 | 145.732333 |
| 141 Primary and secondary schools and adult education managers | 0.0346753 | 142.578762 |
| 341 Social work and religious associate professionals | 0.0255830 | 137.073911 |
| 133 Research and development managers | 0.0137728 | 135.323107 |
| 153 Elderly care managers | 0.0331514 | 132.163025 |
| 242 Organisation analysts, policy administrators and human resource specialists | 0.0223881 | 132.013557 |
| 332 Insurance advisers, sales and purchasing agents | 0.0176134 | 128.196288 |
| 218 Specialists within environmental and health protection | 0.0258110 | 120.206634 |
| 311 Physical and engineering science technicians | 0.0213202 | 119.371812 |
| 422 Client information clerks | 0.0175877 | 117.057208 |
| 411 Office assistants and other secretaries | 0.0250406 | 115.401389 |
| 264 Authors, journalists and linguists | 0.0158766 | 107.667527 |
| 226 Dentists | 0.0230213 | 99.061845 |
| 232 Vocational education teachers | 0.0298647 | 93.534293 |
| 122 Human resource managers | 0.0365348 | 86.595103 |
| 342 Athletes, fitness instructors and recreational workers | 0.0162825 | 86.085107 |
| 515 Building caretakers and related workers | 0.0188443 | 85.346469 |
| 123 Administration and planning managers | 0.0423650 | 81.886461 |
| 137 Production managers in manufacturing | 0.0267995 | 80.958767 |
| 227 Naprapaths, physiotherapists, occupational therapists | 0.0212967 | 78.930141 |
| 132 Supply, logistics and transport managers | 0.0135557 | 78.186301 |
| 817 Wood processing and papermaking plant operators | 0.0289197 | 75.983376 |
| 441 Library and filing clerks | 0.0210449 | 75.872685 |
| 131 Information and communications technology service managers | 0.0431537 | 75.423080 |
| 343 Photographers, interior decorators and entertainers | 0.0339142 | 75.132287 |
| 241 Accountants, financial analysts and fund managers | 0.0270620 | 71.204029 |
| 216 Architects and surveyors | 0.0241267 | 68.945982 |
| 134 Architectural and engineering managers | 0.0236760 | 68.279874 |
| 228 Specialists in health care not elsewhere classified | 0.0272838 | 64.426085 |
| 213 Biologists, pharmacologists and specialists in agriculture and forestry | 0.0144849 | 63.378555 |
| 831 Train operators and related workers | 0.0177987 | 55.404356 |
| 334 Administrative and specialized secretaries | 0.0292702 | 52.477105 |
| 335 Tax and related government associate professionals | 0.0227003 | 49.850281 |
| 224 Psychologists and psychotherapists | 0.0270655 | 47.653074 |
| 511 Cabin crew, guides and related workers | 0.0069736 | 47.413185 |
| 812 Metal processing and finishing plant operators | 0.0176743 | 47.395879 |
| 331 Financial and accounting associate professionals | 0.0229113 | 45.186053 |
| 261 Legal professionals | 0.0292942 | 44.569161 |
| 819 Process control technicians | 0.0232825 | 43.919550 |
| 333 Business services agents | 0.0263028 | 43.327180 |
| 961 Recycling collectors | 0.0225031 | 42.772133 |
| 312 Construction and manufacturing supervisors | 0.0322029 | 41.767797 |
| 516 Other service related workers | 0.0202784 | 41.325733 |
| 262 Museum curators and librarians and related professionals | 0.0228651 | 40.378111 |
| 265 Creative and performing artists | 0.0252235 | 39.119906 |
| 741 Electrical equipment installers and repairers | 0.0221901 | 38.176541 |
| 524 Event seller and telemarketers | 0.0203373 | 36.349688 |
| 941 Fast-food workers, food preparation assistants | 0.0199578 | 35.998201 |
| 815 Machine operators, textile, fur and leather products | 0.0128372 | 33.582965 |
| 962 Newspaper distributors, janitors and other service workers | 0.0141958 | 32.540073 |
| 136 Production managers in construction and mining | 0.0264825 | 31.006282 |
| 834 Mobile plant operators | 0.0251599 | 30.439935 |
| 816 Machine operators, food and related products | 0.0198706 | 29.543569 |
| 129 Administration and service managers not elsewhere classified | 0.0171682 | 29.032377 |
| 212 Mathematicians, actuaries and statisticians | 0.0240773 | 28.949679 |
| 352 Broadcasting and audio-visual technicians | 0.0067079 | 28.725776 |
| 513 Waiters and bartenders | 0.0214795 | 28.455515 |
| 813 Machine operators, chemical and pharmaceutical products | 0.0254550 | 26.563325 |
| 151 Health care managers | 0.0211530 | 24.870942 |
| 611 Market gardeners and crop growers | 0.0089573 | 23.602904 |
| 732 Printing trades workers | 0.0191704 | 23.581610 |
| 432 Stores and transport clerks | 0.0217702 | 22.969527 |
| 217 Designers | 0.0252062 | 22.823943 |
| 161 Financial and insurance managers | 0.0518758 | 21.908728 |
| 711 Carpenters, bricklayers and construction workers | 0.0136555 | 20.268520 |
| 541 Other surveillance and security workers | 0.0239438 | 19.245270 |
| 179 Other services managers not elsewhere classified | 0.0272448 | 17.108091 |
| 911 Cleaners and helpers | 0.0176513 | 16.284355 |
| 512 Cooks and cold-buffet managers | 0.0278549 | 15.787404 |
| 814 Machine operators, rubber, plastic and paper products | 0.0245275 | 15.256042 |
| 267 Religious professionals and deacons | 0.0268407 | 11.266331 |
| 761 Butchers, bakers and food processors | 0.0153660 | 11.168879 |
| 722 Blacksmiths, toolmakers and related trades workers | 0.0192713 | 10.890741 |
| 121 Finance managers | 0.0276643 | 9.785317 |
| 752 Wood treaters, cabinet-makers and related trades workers | 0.0269102 | 9.779896 |
| 713 Painters, Lacquerers, Chimney-sweepers and related trades workers | 0.0259098 | 9.415854 |
| 932 Manufacturing labourers | 0.0266336 | 9.113769 |
| 522 Shop staff | 0.0267679 | 8.247675 |
| 818 Other stationary plant and machine operators | 0.0237780 | 6.983074 |
| 344 Driving instructors and other instructors | 0.0286480 | 6.971261 |
| 523 Cashiers and related clerks | 0.0041737 | 4.970851 |
| 833 Heavy truck and bus drivers | 0.0188392 | 4.786235 |
| 912 Washers, window cleaners and other cleaning workers | 0.0382761 | 4.701424 |
| 821 Assemblers | 0.0286219 | 1.405402 |

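Let’s check what we have found by looking closer at four occupational groups: the ones with the highest and lowest F-value for age, and the ones with the highest and lowest yearly salary increase.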

temp <- tb %>%
  filter(`occuptional  (SSYK 2012)` == "234 Primary- and pre-school teachers")
 
temp %>%
  ggplot () +
    geom_point (mapping = aes(x = year_n,y = salary, colour = age)) +
    facet_grid(. ~ sex) +   
    labs(
      x = "Year",
      y = "Salary (SEK/month)"
    ) 

Figure 2: Highest F-value, Primary- and pre-school teachers
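
The refit that follows uses `poly(age_n, 3, raw = T)`, while the loop earlier used the default orthogonal polynomial. As a sketch on simulated data (the data and names below are assumptions), the two versions describe the same fitted curve. Only the coefficients differ, and the raw version gives coefficients that can be plugged directly into a polynomial in age.

```r
set.seed(1)

# Simulated data (assumption): nine age midpoints observed in five years
age        <- rep(c(21, 27, 32, 37, 42, 47, 52, 57, 62), times = 5)
year       <- rep(2014:2018, each = 9)
log_salary <- 10 + 0.02 * (year - 2014) + 0.03 * age - 0.0003 * age^2 +
              rnorm(length(age), sd = 0.01)

# Orthogonal polynomial (default) versus raw powers of age
fit_orth <- lm(log_salary ~ year + poly(age, 3))
fit_raw  <- lm(log_salary ~ year + poly(age, 3, raw = TRUE))

# The fitted values agree up to numerical precision
same_fit <- isTRUE(all.equal(unname(fitted(fit_orth)),
                             unname(fitted(fit_raw)),
                             tolerance = 1e-6))
same_fit
```

The F-value for the age term is therefore unaffected by the choice, which is why the ranking from the loop can be examined with the raw coefficients.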

model <-lm (log(salary) ~ year_n + poly(age_n, 3, raw = T), data = temp)

summod <- tidy(summary (model))

temp %>%
  ggplot () +
    geom_point (mapping = aes(x = age_n,y = age_n * summod$estimate[3] + summod$estimate[4] * age_n^2 + summod$estimate[5] * age_n^3)) +
    labs(
      x = "Age",
      y = "Salary"
    )

Figure 3: Model fit, Primary- and pre-school teachers, Correlation between age and salary
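
For reference, the curve drawn in this kind of plot is only the age part of the fitted model. In my own notation, with b_1, b_2, b_3 standing for `summod$estimate[3]`, `summod$estimate[4]` and `summod$estimate[5]` from the raw polynomial fit:

```latex
g(\text{age}) = b_1\,\text{age} + b_2\,\text{age}^2 + b_3\,\text{age}^3
```

The intercept and the year term are left out, so the values are on the log scale and shifted by a constant. The shape of the curve shows how salary varies with age, while the absolute level on the vertical axis should not be read as a salary in SEK.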

pdx <- deriv(as.polynomial(c(0, summod$estimate[3], summod$estimate[4], summod$estimate[5])))

temp %>%
  ggplot () + 
    geom_point (mapping = aes(x = age_n, y = summod$estimate[2] + pdx[1] + pdx[2] * age_n + pdx[3] * age_n^2)) +
    labs(
      x = "Age",
      y = "Salary raise (%)"
    )

Figure 4: Model fit, Primary- and pre-school teachers, The derivative for age
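
The quantity on the vertical axis of this plot is the year estimate plus the derivative of the age polynomial, which can be read as the expected change in log salary for a person who gets one year older while one calendar year passes. The following base R sketch shows the same calculation without the polynom package and checks the derivative numerically. All coefficient values are hypothetical.

```r
# Hypothetical raw coefficients for age, age^2 and age^3, and for year
b      <- c(0.05, -0.0008, 0.000004)
b_year <- 0.03

# Age part of the model on the log scale
age_part <- function(a) b[1] * a + b[2] * a^2 + b[3] * a^3

# Derivative coefficients: d/da (b1*a + b2*a^2 + b3*a^3) = b1 + 2*b2*a + 3*b3*a^2
d_coef <- b * 1:3
raise  <- function(a) b_year + d_coef[1] + d_coef[2] * a + d_coef[3] * a^2

# Numerical check of the derivative with a central difference
a       <- c(21, 40, 62)
h       <- 1e-4
numeric <- (age_part(a + h) - age_part(a - h)) / (2 * h)
deriv_ok <- isTRUE(all.equal(raise(a) - b_year, numeric, tolerance = 1e-6))
deriv_ok
```

Values from `raise()` are fractions on the log scale, so a value of 0.05 corresponds to a raise of roughly 5 %.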

temp <- tb %>%
  filter(`occuptional  (SSYK 2012)` == "821 Assemblers")

temp %>%
  ggplot () +
    geom_point (mapping = aes(x = year_n,y = salary, colour = age)) +
    facet_grid(. ~ sex) +   
    labs(
      x = "Year",
      y = "Salary (SEK/month)"
    ) 

Figure 5: Lowest F-value, Assemblers

model <-lm (log(salary) ~ year_n + poly(age_n, 3, raw = T), data = temp)

summod <- tidy(summary (model))

temp %>%
  ggplot () +
    geom_point (mapping = aes(x = age_n,y = age_n * summod$estimate[3] + summod$estimate[4] * age_n^2 + summod$estimate[5] * age_n^3)) +
    labs(
      x = "Age",
      y = "Salary"
    )

Figure 6: Model fit, Assemblers, Correlation between age and salary

pdx <- deriv(as.polynomial(c(0, summod$estimate[3], summod$estimate[4], summod$estimate[5])))

temp %>%
  ggplot () + 
    geom_point (mapping = aes(x = age_n, y = summod$estimate[2] + pdx[1] + pdx[2] * age_n + pdx[3] * age_n^2)) +
    labs(
      x = "Age",
      y = "Salary raise (%)"
    )

Figure 7: Model fit, Assemblers, The derivative for age

temp <- tb %>%
  filter(`occuptional  (SSYK 2012)` == "161 Financial and insurance managers")

temp %>%
  ggplot () +
    geom_point (mapping = aes(x = year_n,y = salary, colour = age)) +
    facet_grid(. ~ sex) + 
      labs(
        x = "Year",
        y = "Salary (SEK/month)"
      ) 

Figure 8: Highest yearly salary increase, Financial and insurance managers

model <- lm (log(salary) ~ year_n + poly(age_n, 3, raw = T), data = temp)

summod <- tidy(summary (model))

temp %>%
  ggplot () +
    geom_point (mapping = aes(x = age_n,y = age_n * summod$estimate[3] + summod$estimate[4] * age_n^2 + summod$estimate[5] * age_n^3)) +
    labs(
      x = "Age",
      y = "Salary"
    )

Figure 9: Model fit, Financial and insurance managers, Correlation between age and salary

pdx <- deriv(as.polynomial(c(0, summod$estimate[3], summod$estimate[4], summod$estimate[5])))

temp %>%
  ggplot () + 
    geom_point (mapping = aes(x = age_n, y = summod$estimate[2] + pdx[1] + pdx[2] * age_n + pdx[3] * age_n^2)) +
    labs(
      x = "Age",
      y = "Salary raise (%)"
    )

Figure 10: Model fit, Financial and insurance managers, The derivative for age

temp <- tb %>%
  filter(`occuptional  (SSYK 2012)` == "523 Cashiers and related clerks")
temp %>%
  ggplot () +
    geom_point (mapping = aes(x = year_n,y = salary, colour = age)) +
    facet_grid(. ~ sex) + 
    labs(
      x = "Year",
      y = "Salary (SEK/month)"
  )

Figure 11: Lowest yearly salary increase, Cashiers and related clerks

model <-lm (log(salary) ~ year_n + poly(age_n, 3, raw = T), data = temp)

summod <- tidy(summary (model))

temp %>%
  ggplot () +
    geom_point (mapping = aes(x = age_n,y = age_n * summod$estimate[3] + summod$estimate[4] * age_n^2 + summod$estimate[5] * age_n^3)) +
    labs(
      x = "Age",
      y = "Salary"
    )

Figure 12: Model fit, Cashiers and related clerks, Correlation between age and salary

pdx <- deriv(as.polynomial(c(0, summod$estimate[3], summod$estimate[4], summod$estimate[5])))

temp %>%
  ggplot () + 
    geom_point (mapping = aes(x = age_n, y = summod$estimate[2] + pdx[1] + pdx[2] * age_n + pdx[3] * age_n^2)) +
    labs(
      x = "Age",
      y = "Salary raise (%)"
    )

Figure 13: Model fit, Cashiers and related clerks, The derivative for age
