# How Many Factors to Retain in Factor Analysis

By **Dominique Makowski**, kindly contributed to R-bloggers.


# The method agreement procedure

When running a factor analysis, one often needs to decide how many components (latent variables) to retain. Fortunately, many statistical methods exist to answer this question. Unfortunately, there is no consensus on which one to use. The `n_factors()` function, available in the psycho package, therefore performs the **method agreement procedure**: it runs all the available routines and returns the number of factors supported by the most methods.

```
# devtools::install_github("neuropsychology/psycho.R") # Install the latest psycho version if needed
library(tidyverse)
library(psycho)

# Run the method agreement procedure on the built-in `attitude` dataset
results <- attitude %>%
  psycho::n_factors()

print(results)
```

`## The choice of 1 factor is supported by 5 (out of 9; 55.56%) methods (Optimal Coordinates, Acceleration Factor, Parallel Analysis, Velicer MAP, VSS Complexity 1).`

We can get an overview of all values by using the `summary` method.

| n.Factors | n.Methods | Eigenvalues | Cum.Variance |
|---|---|---|---|
| 1 | 5 | 3.72 | 0.53 |
| 2 | 3 | 1.14 | 0.69 |
| 3 | 1 | 0.85 | 0.81 |
| 4 | 0 | 0.61 | 0.90 |
| 5 | 0 | 0.32 | 0.95 |
| 6 | 0 | 0.22 | 0.98 |
| 7 | 0 | 0.14 | 1.00 |
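To make the `Eigenvalues` and `Cum.Variance` columns less of a black box, here is a minimal base R sketch of how such quantities are commonly computed from the correlation matrix of `attitude`. This is an illustration under that assumption, not the package's internal code, so the rounded figures may differ slightly from the summary above. The variable names (`cor_mat`, `overview`, `n_kaiser`) are chosen here for illustration.

```r
# Eigenvalues of the correlation matrix of the built-in `attitude` dataset
cor_mat <- cor(attitude)
eigenvalues <- eigen(cor_mat, symmetric = TRUE, only.values = TRUE)$values

# Cumulative proportion of variance explained by the first k components
cum_variance <- cumsum(eigenvalues) / sum(eigenvalues)

overview <- data.frame(
  n.Factors    = seq_along(eigenvalues),
  Eigenvalues  = round(eigenvalues, 2),
  Cum.Variance = round(cum_variance, 2)
)
print(overview)

# Kaiser criterion: retain components whose eigenvalue exceeds 1
n_kaiser <- sum(eigenvalues > 1)
print(n_kaiser)
```

Because the eigenvalues of a correlation matrix sum to the number of variables, each eigenvalue divided by that total gives the share of variance attributed to its component.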

And, of course, plot it 🙂

```
plot(results)
```

The plot shows the **number of methods** (in yellow), the **Eigenvalues** (red line) and the cumulative proportion of **explained variance** (blue line).

For more details, we can also extract the final result (the optimal number of factors) for each method:

| Method | n_optimal |
|---|---|
| Optimal Coordinates | 1 |
| Acceleration Factor | 1 |
| Parallel Analysis | 1 |
| Eigenvalues (Kaiser Criterion) | 2 |
| Velicer MAP | 1 |
| BIC | 2 |
| Sample Size Adjusted BIC | 3 |
| VSS Complexity 1 | 1 |
| VSS Complexity 2 | 2 |
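The agreement step itself amounts to a majority vote over these per-method results. The sketch below recomputes the consensus by hand from the table above. It is only an illustration of the idea, not the package's implementation, and the tie-breaking rule (the smallest number of factors wins) is an assumption made here.

```r
# Per-method optimal number of factors, copied from the table above
n_optimal <- c(
  "Optimal Coordinates"            = 1,
  "Acceleration Factor"            = 1,
  "Parallel Analysis"              = 1,
  "Eigenvalues (Kaiser Criterion)" = 2,
  "Velicer MAP"                    = 1,
  "BIC"                            = 2,
  "Sample Size Adjusted BIC"       = 3,
  "VSS Complexity 1"               = 1,
  "VSS Complexity 2"               = 2
)

# Count how many methods support each candidate number of factors
votes <- table(n_optimal)

# which.max() returns the first maximum, so ties go to the smaller
# number of factors (an assumption of this sketch)
best <- as.integer(names(votes)[which.max(votes)])
support <- max(votes)
share <- round(100 * support / length(n_optimal), 2)
supporting_methods <- names(n_optimal)[n_optimal == best]

cat(sprintf(
  "%d factor(s) supported by %d out of %d methods (%.2f%%): %s\n",
  best, support, length(n_optimal), share,
  paste(supporting_methods, collapse = ", ")
))
```

With the values from the table, this yields 1 factor supported by 5 of 9 methods (55.56%), which matches the printed result shown earlier.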

# Tweaking

We can also provide a correlation matrix instead of raw data, and change the rotation and the factoring method.

```
df <- psycho::affective

# Compute the correlations and keep only the matrix of coefficients
cor_mat <- psycho::correlation(df)
cor_mat <- cor_mat$values$r

# A correlation matrix does not carry the sample size, so pass it via `n`
results <- cor_mat %>%
  psycho::n_factors(rotate = "oblimin", fm = "mle", n = nrow(df))

print(results)
```

`## The choice of 2 factors is supported by 5 (out of 9; 55.56%) methods (Parallel Analysis, Eigenvalues (Kaiser Criterion), BIC, Sample Size Adjusted BIC, VSS Complexity 2).`

```
plot(results)
```
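To see why the sample size has to be supplied alongside a correlation matrix, consider parallel analysis, one of the methods listed above. The sketch below is a simplified, component-based version of Horn's parallel analysis written in base R. It is not the routine the package calls; the function name `parallel_analysis` and its arguments are hypothetical, and `attitude` is used so that the example runs without extra packages.

```r
# Simplified parallel analysis on a correlation matrix (base R only).
# The observed eigenvalues are compared with eigenvalues obtained from
# random, uncorrelated data of the same dimensions (n rows, p columns).
parallel_analysis <- function(cor_mat, n, n_sim = 200, quantile_level = 0.95) {
  p <- ncol(cor_mat)
  observed <- eigen(cor_mat, symmetric = TRUE, only.values = TRUE)$values

  # Each column of `simulated` holds the eigenvalues of one random dataset
  simulated <- replicate(n_sim, {
    random_data <- matrix(rnorm(n * p), nrow = n, ncol = p)
    eigen(cor(random_data), symmetric = TRUE, only.values = TRUE)$values
  })

  # Reference threshold for each eigenvalue rank
  threshold <- apply(simulated, 1, quantile, probs = quantile_level)

  # Retain the leading eigenvalues that exceed their random counterpart
  exceeds <- observed > threshold
  n_retained <- if (all(exceeds)) p else which(!exceeds)[1] - 1

  list(observed = observed, threshold = threshold, n_factors = n_retained)
}

set.seed(123)
pa <- parallel_analysis(cor(attitude), n = nrow(attitude))
print(pa$n_factors)
```

The random reference eigenvalues depend on `n`: smaller samples produce larger spurious eigenvalues, which is why the threshold cannot be derived from the correlation matrix alone.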

# Credits

Did this package help you? Don’t forget to cite the various packages you used 🙂

You can cite `psycho` as follows:

- Makowski, D. (2018). *The psycho Package: an Efficient and Publishing-Oriented Workflow for Psychological Science*. Journal of Open Source Software, 3(22), 470. https://doi.org/10.21105/joss.00470
