The significance of the sector on the salary in Sweden, a comparison between different occupational groups

Posted on February 18, 2020 by n n R Analystatistics Swedenn in R bloggers | 0 Comments

[This article was first published on R Analystatistics Sweden , and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In my last post, I found that the sector has a significant impact on the salary of engineers. Is the significance of the sector unique to engineers or are there similar correlations in other occupational groups?

Statistics Sweden use NUTS (Nomenclature des Unités Territoriales Statistiques), which is the EU’s hierarchical regional division, to specify the regions.

The F-value from the Anova table is used as the single value to discriminate how much the region and salary correlates. For exploratory analysis, the Anova value seems good enough.

First, define libraries and functions.

library (tidyverse) 
## -- Attaching packages ------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.2.1     v purrr   0.3.3
## v tibble  2.1.3     v dplyr   0.8.3
## v tidyr   1.0.2     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.4.0
## -- Conflicts ---------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library (broom) 
library (car)
## Loading required package: carData
## 
## Attaching package: 'car'
## The following object is masked from 'package:dplyr':
## 
##     recode
## The following object is masked from 'package:purrr':
## 
##     some
library (swemaps) # devtools::install_github('reinholdsson/swemaps')
library(sjPlot)
## Registered S3 methods overwritten by 'lme4':
##   method                          from
##   cooks.distance.influence.merMod car 
##   influence.merMod                car 
##   dfbeta.influence.merMod         car 
##   dfbetas.influence.merMod        car
## #refugeeswelcome
readfile <- function (file1){  
  read_csv (file1, col_types = cols(), locale = readr::locale (encoding = "latin1"), na = c("..", "NA")) %>%  
    gather (starts_with("19"), starts_with("20"), key = "year", value = salary) %>%  
    drop_na() %>%  
    mutate (year_n = parse_number (year))
}
nuts <-read_csv ("nuts.csv", col_types = cols(), locale = readr::locale (encoding = "latin1"), na = c("..", "NA")) %>%
  mutate(NUTS2_sh = substr(NUTS2, 1, 4))
## Warning: Missing column names filled in: 'X1' [1]
nuts %>% 
  distinct (NUTS2_en) %>%
  knitr::kable(
    booktabs = TRUE,
    caption = 'Nomenclature des Unités Territoriales Statistiques (NUTS)')

Table 1: Nomenclature des Unités Territoriales Statistiques (NUTS)
NUTS2_en
SE11 Stockholm
SE12 East-Central Sweden
SE21 Småland and islands
SE22 South Sweden
SE23 West Sweden
SE31 North-Central Sweden
SE32 Central Norrland
SE33 Upper Norrland

map_ln_n <- map_ln %>%
  mutate(lnkod_n = as.numeric(lnkod))

The data table is downloaded from Statistics Sweden. It is saved as a comma-delimited file without heading, 000000CG.csv, http://www.statistikdatabasen.scb.se/pxweb/en/ssd/.

I have renamed the file to 000000CG_sector.csv because the filename 000000CG.csv was used in a previous post.

The table: Average basic salary, monthly salary and women´s salary as a percentage of men´s salary by region, sector, occupational group (SSYK 2012) and sex. Year 2014 – 2018 Monthly salary 1-3 public sector 4-5 private sector

Only 17 occupational groups have employees in both the public and the private sector in all regions and both genders.

In the plot and tables, you can also find information on how the increase in salaries per year for each occupational group is affected when the interactions are taken into account.

tb <- readfile ("000000CG_sector.csv") %>%
  left_join(nuts %>% distinct (NUTS2_en, NUTS2_sh), by = c("region" = "NUTS2_en")) 
  
tb_map <- readfile ("000000CG_sector.csv") %>%
  left_join(nuts, by = c("region" = "NUTS2_en")) %>%  
  right_join(map_ln_n, by = c("Länskod" = "lnkod_n"))


summary_table = vector()
anova_table = vector()
for (i in unique(tb$`occuptional  (SSYK 2012)`)){
  temp <- filter(tb, `occuptional  (SSYK 2012)` == i)
  if (dim(temp)[1] > 150){
    model <- lm(log(salary) ~ region + sex + year_n + sector, data = temp)
    summary_table <- rbind (summary_table, mutate (tidy (summary (model)), ssyk = i, interaction = "none"))
    anova_table <- rbind (anova_table, mutate (tidy (Anova (model, type = 2)), ssyk = i, interaction = "none"))  
  
    model <- lm(log(salary) ~ region + sex + year_n * sector, data = temp)
    summary_table <- rbind (summary_table, mutate (tidy (summary (model)), ssyk = i, interaction = "sector and year"))
    anova_table <- rbind (anova_table, mutate (tidy (Anova (model, type = 2)), ssyk = i, interaction = "sector and year"))
    
    model <- lm(log(salary) ~ region + year_n + sex * sector, data = temp)
    summary_table <- rbind (summary_table, mutate (tidy (summary (model)), ssyk = i, interaction = "sex and sector"))
    anova_table <- rbind (anova_table, mutate (tidy (Anova (model, type = 2)), ssyk = i, interaction = "sex and sector"))   
    
    model <- lm(log(salary) ~ region * sector + year_n + sex, data = temp)
    summary_table <- rbind (summary_table, mutate (tidy (summary (model)), ssyk = i, interaction = "region and sector"))
    anova_table <- rbind (anova_table, mutate (tidy (Anova (model, type = 2)), ssyk = i, interaction = "region and sector"))
    
    model <- lm(log(salary) ~ region * sector * year_n * sex, data = temp)
    summary_table <- rbind (summary_table, mutate (tidy (summary (model)), ssyk = i, interaction = "region, sector, year and sex"))
    anova_table <- rbind (anova_table, mutate (tidy (Anova (model, type = 2)), ssyk = i, interaction = "region, sector, year and sex"))    
  }
}
## Note: model has aliased coefficients
##       sums of squares computed by model comparison
anova_table <- anova_table %>% rowwise() %>% mutate(contcol = str_count(term, ":")) 

summary_table <- summary_table %>% rowwise() %>% mutate(contcol = str_count(term, ":"))

merge(summary_table, anova_table, by = c("ssyk", "interaction"), all = TRUE) %>%
  filter (term.x == "year_n") %>%
  filter (term.y == "sector") %>%    
  filter (interaction == "none") %>%
  
  mutate (estimate = (exp(estimate) - 1) * 100) %>%  
  ggplot () +
    geom_point (mapping = aes(x = estimate, y = statistic.y, colour = interaction)) +
    labs(
      x = "Increase in salaries (% / year)",
      y = "F-value for sector"
    )

Figure 1: The significance of the sector on the salary in Sweden, a comparison between different occupational groups, Year 2014 – 2018

merge(summary_table, anova_table, by = c("ssyk", "interaction"), all = TRUE) %>%
  filter (term.x == "year_n") %>%
  filter (contcol.y > 0) %>%    
  # only look at the interactions between all four variables in the case with interaction region, sector, year and sex
  filter (!(contcol.y < 3 & interaction == "region, sector, year and sex")) %>% 
  
  mutate (estimate = (exp(estimate) - 1) * 100) %>%  
  ggplot () +
    geom_point (mapping = aes(x = estimate, y = statistic.y, colour = interaction)) +
    labs(
      x = "Increase in salaries (% / year)",
      y = "F-value for interaction"
    )

The significance of the interaction between sector, region, year and sex on the salary in Sweden, a comparison between different occupational groups, Year 2014 - 2018

Figure 2: The significance of the interaction between sector, region, year and sex on the salary in Sweden, a comparison between different occupational groups, Year 2014 – 2018

The tables with all occupational groups sorted by F-value in descending order.

merge(summary_table, anova_table, c("ssyk", "interaction"), all = TRUE) %>%
  filter (term.x == "year_n") %>%
  filter (term.y == "sector") %>%   
  filter (interaction == "none") %>%
  mutate (estimate = (exp(estimate) - 1) * 100) %>%  
  select (ssyk, estimate, statistic.y, interaction) %>%
  rename (`F-value` = statistic.y) %>%
  rename (`Increase in salary` = estimate) %>%
  arrange (desc (`F-value`)) %>%
  knitr::kable(
    booktabs = TRUE,
    caption = 'Correlation for F-value (sector) and the yearly increase in salaries')

Table 2: Correlation for F-value (sector) and the yearly increase in salaries
ssyk	Increase in salary	F-value	interaction
962 Newspaper distributors, janitors and other service workers	2.109153	606.2792907	none
331 Financial and accounting associate professionals	1.994107	429.7668310	none
411 Office assistants and other secretaries	2.237594	350.7420490	none
251 ICT architects, systems analysts and test managers	2.316554	337.4390522	none
241 Accountants, financial analysts and fund managers	2.142349	325.6103179	none
321 Medical and pharmaceutical technicians	2.609024	160.8870308	none
242 Organisation analysts, policy administrators and human resource specialists	2.064925	160.8111547	none
351 ICT operations and user support technicians	2.310435	121.7693426	none
214 Engineering professionals	2.575584	87.3548855	none
541 Other surveillance and security workers	2.391839	67.5617978	none
432 Stores and transport clerks	1.541572	62.6562426	none
534 Attendants, personal assistants and related workers	1.850523	40.6132964	none
911 Cleaners and helpers	2.087753	28.9529104	none
311 Physical and engineering science technicians	2.743468	23.7179785	none
422 Client information clerks	2.549989	9.7604793	none
941 Fast-food workers, food preparation assistants	2.144353	4.3115207	none
332 Insurance advisers, sales and purchasing agents	2.228746	0.9809486	none

merge(summary_table, anova_table, c("ssyk", "interaction"), all = TRUE) %>%
  filter (term.x == "year_n") %>%
  filter (contcol.y > 0) %>%   
  filter (interaction == "sector and year") %>%
  mutate (estimate = (exp(estimate) - 1) * 100) %>%  
  select (ssyk, estimate, statistic.y, interaction) %>%
  rename (`F-value` = statistic.y) %>%
  rename (`Increase in salary` = estimate) %>%
  arrange (desc (`F-value`)) %>%
  knitr::kable(
    booktabs = TRUE,
    caption = 'Correlation for F-value (sector and year) and the yearly increase in salaries')

Table 3: Correlation for F-value (sector and year) and the yearly increase in salaries
ssyk	Increase in salary	F-value	interaction
422 Client information clerks	3.433516	33.8973413	sector and year
534 Attendants, personal assistants and related workers	2.307155	29.5467642	sector and year
351 ICT operations and user support technicians	3.105104	15.0659029	sector and year
214 Engineering professionals	3.207414	13.0526621	sector and year
962 Newspaper distributors, janitors and other service workers	2.516074	9.1796329	sector and year
311 Physical and engineering science technicians	3.006941	2.5233548	sector and year
332 Insurance advisers, sales and purchasing agents	2.574538	2.3362535	sector and year
321 Medical and pharmaceutical technicians	2.274618	2.1961501	sector and year
411 Office assistants and other secretaries	2.451836	1.8814544	sector and year
241 Accountants, financial analysts and fund managers	2.373536	0.7836435	sector and year
432 Stores and transport clerks	1.346673	0.7147906	sector and year
251 ICT architects, systems analysts and test managers	2.419651	0.4459847	sector and year
541 Other surveillance and security workers	2.331794	0.1848832	sector and year
242 Organisation analysts, policy administrators and human resource specialists	2.194200	0.1846190	sector and year
911 Cleaners and helpers	2.022334	0.1743591	sector and year
941 Fast-food workers, food preparation assistants	2.161504	0.0068543	sector and year
331 Financial and accounting associate professionals	1.969149	0.0050259	sector and year

merge(summary_table, anova_table, c("ssyk", "interaction"), all = TRUE) %>%
  filter (term.x == "year_n") %>%
  filter (contcol.y > 0) %>%   
  filter (interaction == "sex and sector") %>%
  mutate (estimate = (exp(estimate) - 1) * 100) %>%  
  select (ssyk, estimate, statistic.y, interaction) %>%
  rename (`F-value` = statistic.y) %>%
  rename (`Increase in salary` = estimate) %>%
  arrange (desc (`F-value`)) %>%
  knitr::kable(
    booktabs = TRUE,
    caption = 'Correlation for F-value (sex and sector) and the yearly increase in salaries')

Table 4: Correlation for F-value (sex and sector) and the yearly increase in salaries
ssyk	Increase in salary	F-value	interaction
241 Accountants, financial analysts and fund managers	2.133410	79.736907	sex and sector
332 Insurance advisers, sales and purchasing agents	2.228746	75.904108	sex and sector
911 Cleaners and helpers	2.087753	68.455216	sex and sector
331 Financial and accounting associate professionals	1.994107	67.051008	sex and sector
351 ICT operations and user support technicians	2.279052	38.608554	sex and sector
321 Medical and pharmaceutical technicians	2.592312	30.583657	sex and sector
941 Fast-food workers, food preparation assistants	2.144353	22.396698	sex and sector
541 Other surveillance and security workers	2.383471	16.198924	sex and sector
242 Organisation analysts, policy administrators and human resource specialists	2.083508	15.156166	sex and sector
432 Stores and transport clerks	1.541572	13.144544	sex and sector
534 Attendants, personal assistants and related workers	1.850523	11.478531	sex and sector
251 ICT architects, systems analysts and test managers	2.316554	9.349359	sex and sector
411 Office assistants and other secretaries	2.234948	3.109736	sex and sector
962 Newspaper distributors, janitors and other service workers	2.109153	2.668667	sex and sector
311 Physical and engineering science technicians	2.743468	2.370694	sex and sector
214 Engineering professionals	2.575584	1.834672	sex and sector
422 Client information clerks	2.549989	1.337874	sex and sector

merge(summary_table, anova_table, c("ssyk", "interaction"), all = TRUE) %>%
  filter (term.x == "year_n") %>%
  filter (contcol.y > 0) %>%   
  filter (interaction == "region and sector") %>%
  mutate (estimate = (exp(estimate) - 1) * 100) %>%  
  select (ssyk, estimate, statistic.y, interaction) %>%
  rename (`F-value` = statistic.y) %>%
  rename (`Increase in salary` = estimate) %>%
  arrange (desc (`F-value`)) %>%
  knitr::kable(
    booktabs = TRUE,
    caption = 'Correlation for F-value (region and sector) and the yearly increase in salaries')

Table 5: Correlation for F-value (region and sector) and the yearly increase in salaries
ssyk	Increase in salary	F-value	interaction
432 Stores and transport clerks	1.541572	16.7817739	region and sector
251 ICT architects, systems analysts and test managers	2.316554	15.6603633	region and sector
242 Organisation analysts, policy administrators and human resource specialists	1.987167	14.5447961	region and sector
214 Engineering professionals	2.575584	10.9929801	region and sector
321 Medical and pharmaceutical technicians	2.635986	9.9897434	region and sector
331 Financial and accounting associate professionals	1.994107	9.9608534	region and sector
332 Insurance advisers, sales and purchasing agents	2.228746	8.3539434	region and sector
411 Office assistants and other secretaries	2.238072	5.7558819	region and sector
241 Accountants, financial analysts and fund managers	2.143390	5.6578803	region and sector
541 Other surveillance and security workers	2.376311	4.6423707	region and sector
351 ICT operations and user support technicians	2.307531	3.8460603	region and sector
311 Physical and engineering science technicians	2.737061	3.5329715	region and sector
534 Attendants, personal assistants and related workers	1.850523	3.0881293	region and sector
422 Client information clerks	2.549989	2.2991877	region and sector
911 Cleaners and helpers	2.087753	1.4424330	region and sector
962 Newspaper distributors, janitors and other service workers	2.109153	1.3179558	region and sector
941 Fast-food workers, food preparation assistants	2.155241	0.8264589	region and sector

merge(summary_table, anova_table, c("ssyk", "interaction"), all = TRUE) %>%
  filter (term.x == "year_n") %>%
  filter (contcol.y > 1) %>%   
  filter (interaction == "region and sector") %>%
  mutate (estimate = (exp(estimate) - 1) * 100) %>%  
  select (ssyk, estimate, statistic.y, interaction) %>%
  rename (`F-value` = statistic.y) %>%
  rename (`Increase in salary` = estimate) %>%
  arrange (desc (`F-value`)) %>%
  knitr::kable(
    booktabs = TRUE,
    caption = 'Correlation for F-value (region, year and sex) and the yearly increase in salaries')

Table 6: Correlation for F-value (region, year and sex) and the yearly increase in salaries
ssyk	Increase in salary	F-value	interaction

merge(summary_table, anova_table, c("ssyk", "interaction"), all = TRUE) %>%
  filter (term.x == "year_n") %>%
  filter (contcol.y > 1) %>%   
  filter (interaction == "region, sector, year and sex") %>%
  filter (!(contcol.y < 3 & interaction == "region, sector, year and sex")) %>%  
  mutate (estimate = (exp(estimate) - 1) * 100) %>%  
  select (ssyk, estimate, statistic.y, interaction) %>%
  rename (`F-value` = statistic.y) %>%
  rename (`Increase in salary` = estimate) %>%
  arrange (desc (`F-value`)) %>%
  knitr::kable(
    booktabs = TRUE,
    caption = 'Correlation for F-value (region, year and sex) and the yearly increase in salaries')

Table 7: Correlation for F-value (region, year and sex) and the yearly increase in salaries
ssyk	Increase in salary	F-value	interaction
351 ICT operations and user support technicians	5.218326	2.7426159	region, sector, year and sex
411 Office assistants and other secretaries	2.276319	2.6593769	region, sector, year and sex
331 Financial and accounting associate professionals	1.629354	2.4639758	region, sector, year and sex
941 Fast-food workers, food preparation assistants	2.364653	2.2289954	region, sector, year and sex
251 ICT architects, systems analysts and test managers	2.629773	2.2261226	region, sector, year and sex
332 Insurance advisers, sales and purchasing agents	2.036891	2.0868544	region, sector, year and sex
242 Organisation analysts, policy administrators and human resource specialists	1.867532	1.8789097	region, sector, year and sex
541 Other surveillance and security workers	2.147019	1.7672297	region, sector, year and sex
911 Cleaners and helpers	2.270852	1.6781257	region, sector, year and sex
321 Medical and pharmaceutical technicians	2.594389	0.9665480	region, sector, year and sex
241 Accountants, financial analysts and fund managers	2.028698	0.7221792	region, sector, year and sex
432 Stores and transport clerks	1.067457	0.6784005	region, sector, year and sex
962 Newspaper distributors, janitors and other service workers	2.885436	0.5793881	region, sector, year and sex
214 Engineering professionals	3.169526	0.4994580	region, sector, year and sex
311 Physical and engineering science technicians	2.071111	0.4768272	region, sector, year and sex
534 Attendants, personal assistants and related workers	2.349001	0.3542025	region, sector, year and sex
422 Client information clerks	2.489542	0.2528065	region, sector, year and sex

Let’s check what we have found.

temp <- tb %>%
  filter(`occuptional  (SSYK 2012)` == "962 Newspaper distributors, janitors and other service workers")
 
model <-lm (log(salary) ~ year_n + sex + NUTS2_sh + sector, data = temp) 
 
plot_model(model, type = "pred", terms = c("sector"))
## Model has log-transformed response. Back-transforming predictions to original response scale. Standard errors are still on the log-scale.

Figure 3: Highest F-value sector, Newspaper distributors, janitors and other service workers

tb_map %>%
    filter(`occuptional  (SSYK 2012)` == "962 Newspaper distributors, janitors and other service workers") %>%
  ggplot() +
    geom_polygon(mapping = aes(x = ggplot_long, y = ggplot_lat, group = lnkod, fill = salary)) +
      facet_grid(. ~ sector) + 
    coord_equal()

Figure 4: Highest F-value sector, Newspaper distributors, janitors and other service workers

temp <- tb %>%
  filter(`occuptional  (SSYK 2012)` == "332 Insurance advisers, sales and purchasing agents")
 
model <-lm (log(salary) ~ year_n + sex + NUTS2_sh + sector, data = temp) 
 
plot_model(model, type = "pred", terms = c("sector"))
## Model has log-transformed response. Back-transforming predictions to original response scale. Standard errors are still on the log-scale.

Figure 5: Lowest F-value sector, Insurance advisers, sales and purchasing agents

temp <- tb %>%
  filter(`occuptional  (SSYK 2012)` == "241 Accountants, financial analysts and fund managers")
 
model <-lm (log(salary) ~ year_n + sex * sector + NUTS2_sh, data = temp) 
 
plot_model(model, type = "pred", terms = c("sector", "sex"))
## Model has log-transformed response. Back-transforming predictions to original response scale. Standard errors are still on the log-scale.

Figure 6: Highest F-value interaction gender and sector, Accountants, financial analysts and fund managers

tb_map %>%
    filter(`occuptional  (SSYK 2012)` == "241 Accountants, financial analysts and fund managers") %>%
  filter (sector == "1-3 public sector") %>%
  ggplot() +
    geom_polygon(mapping = aes(x = ggplot_long, y = ggplot_lat, group = lnkod, fill = salary)) +
      facet_grid(. ~ sex) + 
    coord_equal()

Figure 7: Highest F-value interaction gender and sector, Accountants, financial analysts and fund managers

tb_map %>%
    filter(`occuptional  (SSYK 2012)` == "241 Accountants, financial analysts and fund managers") %>%
  filter (sector == "4-5 private sector") %>%  
  ggplot() +
    geom_polygon(mapping = aes(x = ggplot_long, y = ggplot_lat, group = lnkod, fill = salary)) +
      facet_grid(. ~ sex) + 
    coord_equal()

Figure 8: Highest F-value interaction gender and sector, Accountants, financial analysts and fund managers

temp <- tb %>%
  filter(`occuptional  (SSYK 2012)` == "422 Client information clerks")
 
model <-lm (log(salary) ~ year_n + sex * sector + NUTS2_sh, data = temp) 
 
plot_model(model, type = "pred", terms = c("sector", "sex"))
## Model has log-transformed response. Back-transforming predictions to original response scale. Standard errors are still on the log-scale.

Figure 9: Lowest F-value interaction gender and sector, Client information clerks

temp <- tb %>%
  filter(`occuptional  (SSYK 2012)` == "422 Client information clerks")
 
model <-lm (log(salary) ~ year_n * sector + NUTS2_sh + sex , data = temp) 
 
plot_model(model, type = "pred", terms = c("year_n", "sector"))
## Model has log-transformed response. Back-transforming predictions to original response scale. Standard errors are still on the log-scale.

Figure 10: Highest F-value interaction year and sector, Client information clerks

tb_map %>%
  filter(`occuptional  (SSYK 2012)` == "422 Client information clerks") %>%
  filter (sector == "1-3 public sector") %>%  
  ggplot() +
    geom_polygon(mapping = aes(x = ggplot_long, y = ggplot_lat, group = lnkod, fill = salary)) +
      facet_grid(. ~ year) + 
    coord_equal()

Figure 11: Highest F-value interaction year and sector, Client information clerks

tb_map %>%
  filter(`occuptional  (SSYK 2012)` == "422 Client information clerks") %>%
  filter (sector == "4-5 private sector") %>%  
  ggplot() +
    geom_polygon(mapping = aes(x = ggplot_long, y = ggplot_lat, group = lnkod, fill = salary)) +
      facet_grid(. ~ year) + 
    coord_equal()

Figure 12: Highest F-value interaction year and sector, Client information clerks

temp <- tb %>%
  filter(`occuptional  (SSYK 2012)` == "331 Financial and accounting associate professionals")
 
model <-lm (log(salary) ~ year_n * sector + NUTS2_sh + sex , data = temp) 
 
plot_model(model, type = "pred", terms = c("year_n", "sector"))
## Model has log-transformed response. Back-transforming predictions to original response scale. Standard errors are still on the log-scale.

Figure 13: Lowest F-value interaction year and sector, Financial and accounting associate professionals

temp <- tb %>%
  filter(`occuptional  (SSYK 2012)` == "432 Stores and transport clerks")
 
model <-lm (log(salary) ~ year_n + sector * NUTS2_sh + sex , data = temp) 
 
plot_model(model, type = "pred", terms = c("NUTS2_sh", "sector"))
## Model has log-transformed response. Back-transforming predictions to original response scale. Standard errors are still on the log-scale.

Figure 14: Highest F-value interaction region and sector, Stores and transport clerks

tb_map %>%
  filter(`occuptional  (SSYK 2012)` == "432 Stores and transport clerks") %>%
  ggplot() +
    geom_polygon(mapping = aes(x = ggplot_long, y = ggplot_lat, group = lnkod, fill = salary)) +
      facet_grid(. ~ sector) + 
    coord_equal()

Figure 15: Highest F-value interaction region and sector, Stores and transport clerks

temp <- tb %>%
  filter(`occuptional  (SSYK 2012)` == "941 Fast-food workers, food preparation assistants")
 
model <-lm (log(salary) ~ year_n + sector * NUTS2_sh + sex , data = temp) 
 
plot_model(model, type = "pred", terms = c("NUTS2_sh", "sector"))
## Model has log-transformed response. Back-transforming predictions to original response scale. Standard errors are still on the log-scale.

Figure 16: Lowest F-value interaction sector and region, Fast-food workers, food preparation assistants

temp <- tb %>%
  filter(`occuptional  (SSYK 2012)` == "351 ICT operations and user support technicians")
 
model <-lm (log(salary) ~ year_n * NUTS2_sh * sex * sector, data = temp) 
 
plot_model(model, type = "pred", terms = c("NUTS2_sh", "year_n", "sector",  "sex"))
## Model has log-transformed response. Back-transforming predictions to original response scale. Standard errors are still on the log-scale.
## Warning: Package `see` needed to plot multiple panels in one integrated figure.
## Please install it by typing `install.packages("see", dependencies = TRUE)` into
## the console.
## [[1]]

Figure 17: Highest F-value interaction sector, region, year and gender, ICT operations and user support technicians

## 
## [[2]]

Figure 18: Highest F-value interaction sector, region, year and gender, ICT operations and user support technicians

tb_map %>%
  filter(`occuptional  (SSYK 2012)` == "351 ICT operations and user support technicians") %>%
  filter (sector == "1-3 public sector") %>%
  ggplot() +
    geom_polygon(mapping = aes(x = ggplot_long, y = ggplot_lat, group = lnkod, fill = salary)) +
      facet_grid(. ~ year) + 
    coord_equal()

Figure 19: Highest F-value interaction sector, region, year and gender, ICT operations and user support technicians

tb_map %>%
    filter(`occuptional  (SSYK 2012)` == "351 ICT operations and user support technicians") %>%
  filter (sector == "4-5 private sector") %>%  
  ggplot() +
    geom_polygon(mapping = aes(x = ggplot_long, y = ggplot_lat, group = lnkod, fill = salary)) +
      facet_grid(. ~ year) + 
    coord_equal()

Figure 20: Highest F-value interaction sector, region, year and gender, ICT operations and user support technicians

  temp <- tb %>%
  filter(`occuptional  (SSYK 2012)` == "422 Client information clerks")
 
model <-lm (log(salary) ~ year_n * NUTS2_sh * sex * sector, data = temp) 
 
plot_model(model, type = "pred", terms = c("NUTS2_sh", "year_n", "sector",  "sex"))
## Model has log-transformed response. Back-transforming predictions to original response scale. Standard errors are still on the log-scale.
## Warning: Package `see` needed to plot multiple panels in one integrated figure.
## Please install it by typing `install.packages("see", dependencies = TRUE)` into
## the console.
## [[1]]

Figure 21: Lowest F-value interaction sector, region, year and gender, Client information clerks

## 
## [[2]]

Figure 22: Lowest F-value interaction sector, region, year and gender, Client information clerks

To leave a comment for the author, please follow the link and comment on their blog: R Analystatistics Sweden .

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

R-bloggers

R news and tutorials contributed by hundreds of R bloggers

The significance of the sector on the salary in Sweden, a comparison between different occupational groups

Related

Related

Never miss an update! Subscribe to R-bloggers to receive e-mails with the latest R posts. (You will not see this message again.)

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)