# Principal Components Regression in R: Part 2

**Revolutions**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

by John Mount Ph. D.

Data Scientist at Win-Vector LLC

In part 2 of her series on Principal Components Regression Dr. Nina Zumel illustrates so-called *y*-aware techniques. These often neglected methods use the fact that for predictive modeling problems we know the dependent variable, outcome or *y*, so we can use this during data preparation *in addition to* using it during modeling. Dr. Zumel shows the incorporation of *y*-aware preparation into Principal Components Analyses can capture more of the problem structure in fewer variables. Such methods include:

- Effects based variable pruning
- Significance based variable pruning
- Effects based variable scaling.

This recovers more domain structure and leads to better models. Using the foundation set in the first article Dr. Zumel quickly shows how to move from a traditional *x*-only analysis that fails to preserve a domain-specific relation of two variables to outcome to a *y*-aware analysis that preserves the relation. Or in other words how to move away from a middling result where different values of y (rendered as three colors) are hopelessly intermingled when plotted against the first two found latent variables as shown below.

Dr. Zumel shows how to perform a decisive analysis where *y* is somewhat sortable by the each of the first two latent variable *and* the first two latent variables capture complementary effects, making them good mutual candidates for further modeling (as shown below).

Click here (part 2 *y*-aware methods) for the discussion, examples, and references. Part 1 (*x* only methods) can be found here.

**leave a comment**for the author, please follow the link and comment on their blog:

**Revolutions**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.