
Survey data remains an integral part of organizational science, and rightfully so. With ever-increasing means of data collection brought about by more nuanced and faster technologies, organizations have no shortage of data – but it would be remiss to discount the value of self-report data for better understanding the psychology of workers. Alas, not all surveys are created equal, or rather equally well; so it's important to utilize scientifically established methods to evaluate them and to draw the appropriate inferences from the data collected.

The full survey construction process should include the following:

1. Define the construct and content domain (e.g., emotional intelligence)
2. Generate items to cover the content domain
3. Assess content validity
4. Large scale administration
5. Exploratory factor analysis
6. Internal consistency reliability analysis (i.e., Cronbach's alpha)
7. Confirmatory factor analysis
8. Convergent/discriminant validity evidence
9. Criterion validity evidence
10. Replicate steps 6–9 in a new sample(s)

In this article, steps 5 and 6 of the survey evaluation process are covered using R. Another post may potentially address later steps (7-9) so be sure to bookmark this page! For insights or recommendations from your friendly neighborhood I-O psychologist regarding the early stages of survey construction, feel free to contact the author. The construct of interest for this scale development project is human-machine preferences.

Load the necessary libraries.

#load libraries
library(tidyverse)  #masks stats::filter, lag
library(tibble)
library(psych)      #masks ggplot2::%+%, alpha
library(GGally)     #masks dbplyr::nasa
library(kableExtra) #masks dplyr::group_rows
library(MVN)


## Import Data

This survey was developed at a research institution and the IRB protocols mandated that the data are not publicly hosted. A completely de-identified version was used for this walkthrough and preprocessed fully before being analyzed, so a glimpse into the data is provided (pun intended).

Note: Some of the survey items were labeled with “_R” signaling that they are reverse coded. This was handled accordingly in the data preprocessing stage as well.
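For context, reverse coding a 5-point item simply maps each response x to 6 - x. Below is a minimal sketch of what that preprocessing step might look like, assuming a hypothetical raw data frame dat_raw and tidyverse verbs (the original preprocessing code is not shown in the post):

#hypothetical sketch: reverse code every item whose name ends in "_R"
#on a 1-5 scale, a response x becomes (5 + 1) - x
dat = dat_raw %>%
  mutate(across(ends_with("_R"), ~ 6 - .x))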

glimpse(dat)
Rows: 381
Columns: 16
$ HUM1     3, 4, 1, 4, 3, 4…
$ HUM2_R   3, 2, 2, 4, 4, 4…
$ HUM3_R   2, 5, 3, 3, 2, 3…
$ HUM4     2, 3, 3, 2, 3, 3…
$ HUM5     2, 4, 5, 4, 2, 3…
$ HUM6_R   2, 2, 1, 2, 2, 3…
$ HUM7_R   2, 4, 2, 3, 4, 5…
$ HUM8_R   1, 3, 1, 2, 2, 2…
$ HUM9_R   4, 2, 3, 4, 3, 3…
$ HUM10    2, 2, 2, 4, 2, 3…
$ HUM11_R  3, 2, 2, 4, 4, 3…
$ HUM12    4, 4, 4, 4, 3, 5…
$ HUM13_R  1, 4, 1, 2, 2, 3…
$ HUM14_R  3, 4, 2, 4, 3, 3…
$ HUM15_R  2, 4, 1, 4, 3, 2…
$ HUM16_R  2, 5, 2, 2, 2, 3…

summary(dat)
HUM1           HUM2_R          HUM3_R           HUM4            HUM5
Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000
1st Qu.:2.000   1st Qu.:2.000   1st Qu.:2.000   1st Qu.:3.000   1st Qu.:3.000
Median :3.000   Median :3.000   Median :3.000   Median :3.000   Median :4.000
Mean   :2.869   Mean   :3.055   Mean   :2.832   Mean   :3.105   Mean   :3.714
3rd Qu.:4.000   3rd Qu.:4.000   3rd Qu.:4.000   3rd Qu.:4.000   3rd Qu.:5.000
Max.   :5.000   Max.   :5.000   Max.   :5.000   Max.   :5.000   Max.   :5.000
HUM6_R          HUM7_R          HUM8_R          HUM9_R          HUM10
Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000
1st Qu.:2.000   1st Qu.:2.000   1st Qu.:1.000   1st Qu.:2.000   1st Qu.:3.000
Median :2.000   Median :3.000   Median :2.000   Median :3.000   Median :3.000
Mean   :2.136   Mean   :2.911   Mean   :1.848   Mean   :2.942   Mean   :3.089
3rd Qu.:3.000   3rd Qu.:4.000   3rd Qu.:2.000   3rd Qu.:4.000   3rd Qu.:4.000
Max.   :5.000   Max.   :5.000   Max.   :5.000   Max.   :5.000   Max.   :5.000
HUM11_R          HUM12          HUM13_R         HUM14_R         HUM15_R
Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.000
1st Qu.:3.000   1st Qu.:4.000   1st Qu.:2.000   1st Qu.:3.000   1st Qu.:2.000
Median :4.000   Median :4.000   Median :2.000   Median :3.000   Median :3.000
Mean   :3.535   Mean   :4.108   Mean   :2.491   Mean   :3.357   Mean   :3.234
3rd Qu.:4.000   3rd Qu.:5.000   3rd Qu.:3.000   3rd Qu.:4.000   3rd Qu.:4.000
Max.   :5.000   Max.   :5.000   Max.   :5.000   Max.   :5.000   Max.   :5.000
HUM16_R
Min.   :1.000
1st Qu.:2.000
Median :3.000
Mean   :3.045
3rd Qu.:4.000
Max.   :5.000



## Exploratory Data Analysis

### Pairs Plot

The GGally package is an extension of the ubiquitous ggplot2 visualization library and is incredibly powerful. The ggpairs function creates a pairs plot of the survey items.

(pairsPlot = GGally::ggpairs(data = dat,
                             upper = "blank",
                             diag = list(continuous = wrap("densityDiag")),
                             lower = list(continuous = wrap(ggally_smooth_lm)),
                             title = "Pairs Plot of Human-Machine Items"))


### Correlations

Correlation analyses seek to measure the statistical relationship between two (random) variables. There is a range of techniques used to assess the relationship between varying data types, with the most well-known being Pearson's product-moment correlation. This (parametric) analysis is effective when continuous variables have a linear relationship and follow a normal distribution; however, surveys usually include Likert-type response options (e.g., Strongly agree to Strongly disagree), and modeling the data as ordinal can sometimes lead to more accurate parameter estimates…to an extent – as the number of response options increases, the data can increasingly be modeled as continuous anyway because the impact of the discretization becomes negligible.
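Whether the normality assumption even holds can be tested directly, which is presumably why the MVN package was loaded at the outset. A minimal sketch using Mardia's test (this particular call is illustrative, not taken from the post):

#test multivariate normality (illustrative; Likert-type items rarely pass)
mvn_out = MVN::mvn(data = dat, mvnTest = "mardia")
mvn_out$multivariateNormality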

Opinions will vary, but my personal threshold for the number of response options before modeling the data as continuous is 6; best practice, though, is probably to model the data a couple of ways in order to establish the best analysis. Check out this article to learn more about data types and modeling distributions.
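In that spirit, a quick, hypothetical way to model the data a couple of ways is to compute both Pearson and polychoric estimates and summarize how far they diverge:

#compare Pearson and polychoric estimates of the same item relationships
pearson_r = cor(dat)
poly_r    = psych::polychoric(dat)[["rho"]]

#average absolute difference across unique item pairs;
#larger values suggest Pearson is attenuating the estimates
mean(abs(poly_r - pearson_r)[lower.tri(poly_r)])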

All of the survey items within the current scale utilized a 5-point Likert-type response format and polychoric correlations were calculated. Polychoric correlations help allay the attenuation that occurs when modeling discretized data by using the more appropriate joint distribution. R’s psych library has the polychoric function along with a plethora of others that are particularly useful for survey analysis.

corrs = psych::polychoric(dat)

#correlation viz
GGally::ggcorr(data = NULL,
               cor_matrix = corrs[["rho"]],
               size = 2,
               hjust = .75,
               nbreaks = 7,
               palette = "RdYlBu",
               label = TRUE,
               label_color = "black",
               digits = 2,
               #label_alpha = .3,
               label_round = 2,
               label_size = 1.85,
               layout.exp = 0.2) +
  theme(legend.position = "none")


### Parallel Analysis

Parallel analysis (PA) is a procedure that helps determine the number of factors (EFA) or components (PCA) to extract when employing dimension reduction techniques. The procedure is based on Monte Carlo simulation and generates a data set of random numbers with the same sample size and number of variables/features as the original data. A correlation matrix of the random data is computed and decomposed, creating corresponding eigenvalues for each factor — when the eigenvalues from the random data are larger than the eigenvalues from the factor analysis, one has evidence that the factor is mostly composed of random noise.

The current data was subjected to the PA and the following scree plot was produced.
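The exact call is not shown here, but psych's fa.parallel function performs PA; a sketch along these lines, with settings chosen to mirror the EFA specification used later (polychoric correlations, WLS estimation), would produce a comparable scree plot:

#parallel analysis (illustrative settings)
psych::fa.parallel(dat,
                   fa = "fa",     #factor eigenvalues (EFA), not components
                   fm = "wls",    #weighted least squares, matching the EFA
                   cor = "poly",  #polychoric correlation matrix
                   n.iter = 100)  #number of simulated/resampled data sets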

The PA proposes that 3-5 factors most effectively explain the underlying structure of the data. This method is better than some of the older guidelines associated with dimensionality reduction such as the Kaiser criterion that was geared more toward PCA.

Note. PA is an iterative process that needs parameter specifications very similar to EFA (i.e., specified correlation, rotation, estimation method, etc.) and some researchers may conduct the analysis after running the EFAs. Irrespective of the order of operations, the outputs should inform one another.

## Exploratory Factor Analysis

Exploratory factor analysis (EFA) is a multivariate approach whose overarching goal is to identify the underlying relationships between measured variables. As briefly mentioned in the PA section, it is entirely based on correlations (the model can account for uncorrelated factors via rotation methods) and is largely used in scale development across disciplines. EFA is but one part of the factor analytic family, and a deep dive into the procedure is beyond the scope of this post. Check out UCLA's link for a practical introduction to the analysis.

An important step in EFA is specifying the number of factors for the model. For this walk-through, the psych package's fa function was used in a loop to run a series of models with between 1 and 5 factors. In psychological research, most of the phenomena investigated are related to one another to some extent, and EFA helps parse out groups that are highly related (within-group) but distinct (between-group) from one another. The model specifies the weighted least squares (WLS) estimation method in an effort to obtain more accurate parameter estimates when using polychoric correlations. Ultimately, five models are individually run and stored in a list so the output(s) can be called and compared.

efa_mods_fun = function(r, n_models = NULL, ...){
  if (!is.matrix(r)) stop("r must be a matrix of covariances!")
  efa_models = list()
  for (i in seq(n_models)){
    efa_models[[i]] = fa(r,
                         n.obs = nrow(dat),
                         nfactors = i,
                         rotate = "oblimin",
                         # n.iter = 1000,
                         fm = "wls",
                         max.iter = 5000)
  }
  return(efa_models)
}

#run series of models; 1:5-factor solutions
modsEFA_rnd1 = efa_mods_fun(corrs[["rho"]], n_models = 5)

### Fit Indices

The fit for each model can be compared across a variety of indices. Below, the chi-squared statistic, Tucker-Lewis Index (TLI), Bayesian Information Criterion (BIC), root mean square error of approximation (RMSEA), and the amount of variance explained by the model are all assessed to determine which model best describes the data. The indices were collected into the data frame modsFit_rnd1 (one row per model, columns a through e) and are displayed in a neat table using the kableExtra package. To learn more about what the indices measure and what information they convey, visit this link.

#visualize table
modsFit_rnd1 %>%
  rownames_to_column() %>%
  rename(
    'Model Solution(s)' = rowname,
    'X\u00B2' = a,
    'TLI' = b,
    'BIC' = c,
    'RMSEA' = d,
    'Var Explained' = e
  ) %>%
  mutate(
    'Model Solution(s)' = c('1 Factor', '2 Factors', '3 Factors',
                            '4 Factors', '5 Factors')
  ) %>%
  kableExtra::kable('html',
                    booktabs = TRUE,
                    caption = 'EFA Model Fit Indices - Round 1') %>%
  kable_styling(bootstrap_options = c('striped', 'HOLD_position'),
                full_width = FALSE,
                position = 'center') %>%
  column_spec(1, width = '8cm') %>%
  pack_rows(index = c('HMPS ' = 5),
            latex_gap_space = '.70em') %>%
  row_spec(3, bold = T, color = "white", background = "#D7261E")

According to the fit statistics, the 3-factor model best describes the data, but the journey does not conclude here because assessing the item-level statistics helps determine the structure of the model. Ideally, simple structure is the goal — this means that each item will individually load onto a single factor. When an item loads onto multiple factors, it is known as cross-loading. There is nothing inherently “wrong” with cross-loading, but for survey development, establishing strict rules provides more benefits in the long run. The cut-off value for a “useful” item loading was set at .45; thus, any item with a loading less than the cut-off was removed before the model was re-run.

Note. Because of the estimation method used in EFA, a factor loading will be calculated for EACH item on EACH factor. The closer the loading value is to 1, the better.

### Factor Loading Diagram

psych::fa.diagram(modsEFA_rnd1[[3]],
                  main = "WLS using Poly - Round 1",
                  digits = 3,
                  rsize = .6,
                  esize = 3,
                  size = 5,
                  cex = .6,
                  l.cex = .2,
                  cut = .4,
                  marg = (c(.5, 2.5, 3, .5)))

Based on our model, each item cleanly loaded onto a single factor, and the only item with a loading less than the specified cut-off value was HUM5. It was removed before estimating the models a second time.

### Round 2

Most psychometricians recommend removing one item at a time before rerunning the models and recalculating fit statistics and item loadings. Unfortunately, I have not developed a streamlined process for this using R (nor has anyone from my very specific Google searches), but perhaps this will be my future contribution to the open source community! After rerunning the models, again the 3-factor solution is optimal. Let's review the item loadings next to see how they changed. The fa.diagram function provides a good overall view of individual item loadings, but the true beauty of R, although a functional programming language, is its ability to operate from an object-oriented paradigm as well.
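The round 2 rerun itself presumably amounts to dropping HUM5 and repeating the pipeline; a sketch under that assumption (the names dat_rnd2 and corrs_rnd2 are placeholders of mine, while modsEFA_rnd2 is the object the loop below expects):

#round 2: drop the weak item and re-estimate the 1:5-factor solutions
dat_rnd2   = dplyr::select(dat, -HUM5)
corrs_rnd2 = psych::polychoric(dat_rnd2)

modsEFA_rnd2 = efa_mods_fun(corrs_rnd2[["rho"]], n_models = 5)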
Each model that was run had its respective output, so next let's extract the loadings from each model and visualize them using ggplot.

#factor loadings of each model
modsEFA_loadings = list()

#loop
for (i in seq_along(modsEFA_rnd2)) {
  modsEFA_loadings[[i]] = rownames_to_column(
    round(data.frame(
      modsEFA_rnd2[[i]][["loadings"]][]), 3),
    var = "Item") %>%
    gather(key = "Factor", value = "Loading", -1)
}

### Best Competing Model

Visualize the individual item loadings from the best competing model: the 3-factor solution!

#viz of factor loadings
ggplot(data = modsEFA_loadings[[3]],
       aes(fct_inorder(Item), abs(Loading), fill = Factor)) +
  geom_bar(stat = "identity", width = .8, color = "gray") +
  coord_flip() +
  facet_wrap(~ Factor) +
  #scale_x_discrete(limits = rev(unique(loadings[[1]]))) +
  labs(
    title = "Best Competing Model",
    subtitle = "3-Factor Solution",
    x = "Item",
    y = "Loading Strength"
  ) +
  theme_gray(base_size = 10) +
  theme(legend.position = "right") +
  geom_hline(yintercept = .45,
             linetype = "dashed",
             color = "red",
             size = .65)

The red dashed line represents the cut-off value of .45, indicating that anything below the red line is “meaningless” and anything above it is “useful.” This visualization also shows the extent to which the items load onto all the factors, which helps inspect potential cross-loading. We have achieved simple structure since no items are cross-loading.

## Conclusion

Hopefully, this tutorial proves insightful for survey analysis. The steps included are by no means perfect, and the processes will almost certainly change based on the researchers' choices (e.g., modeling Pearson correlations vs. polychoric, setting a stricter factor loading cut-off value, etc.). Regardless of the analytical decisions, using survey science to explore and analyze the development process is vital (and fun!). All code is hosted on GitHub.
 
 