354 search results for "pCA"

rearrange() your correlations with corrr

July 20, 2016

Don’t stare at your correlations in search of variable clusters when you can rearrange() them:

    library(corrr)
    mtcars %>% correlate() %>% rearrange() %>% fashion()
    #>   rowname am gear drat wt disp mpg cyl vs hp carb qsec
    #> 1 am ...


Principal Component Analysis Cluster Plots with Plotly

July 19, 2016

The Problem: When clustering data using principal component analysis, it is often of interest to visually inspect how well the data points separate in 2-D space based on principal component scores. While this is fairly straightforward to visualize with a scatterplot, the plot can become cluttered quickly with annotations as shown in the following figure:
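
The post's figure is not reproduced in this listing. As a rough illustration of the kind of plot the teaser describes, here is a minimal sketch of a 2-D scatterplot of principal component scores colored by cluster. It is not taken from the post: the `iris` data, the use of k-means with three clusters, and base R `plot()` (rather than Plotly) are all assumptions made for the example.

```r
# Assumption: iris stands in for the post's data set.
X <- iris[, 1:4]

# PCA on centered, scaled variables; pca$x holds the component scores.
pca <- prcomp(X, center = TRUE, scale. = TRUE)
scores <- as.data.frame(pca$x[, 1:2])   # keep PC1 and PC2 only

# Assumption: k-means with 3 clusters stands in for the post's clustering step.
set.seed(1)
km <- kmeans(scores[, c("PC1", "PC2")], centers = 3, nstart = 20)
scores$cluster <- factor(km$cluster)

# Scatterplot of the scores, one color per cluster.
plot(scores$PC1, scores$PC2, col = km$cluster, pch = 19,
     xlab = "PC1", ylab = "PC2", main = "PCA scores colored by cluster")
```

Adding a text label to every point (for example with `text()`) is what tends to produce the clutter the teaser mentions.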


vtreat version 0.5.26 released on CRAN

July 12, 2016

Win-Vector LLC, Nina Zumel and I are pleased to announce that ‘vtreat’ version 0.5.26 has been released on CRAN. ‘vtreat’ is a data.frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner. (from the package documentation) ‘vtreat’ is an R package that incorporates a number of transforms and simulated out of … Continue reading...


The Mathematics of Machine Learning

July 8, 2016

This post was first published on my LinkedIn page and posted here as a contributed post. In the last few months, I have had several people contact me about their enthusiasm for venturing into the world of data science and using Machine Learning (ML) techniques to probe statistical regularities and build impeccable data-driven products. However, I’ve observed that some actually lack...


Build your own offshore company

July 6, 2016

Hackathons are not alike. Recently, a number of this blog’s authors were at a data hackathon, the strangest one we’ve been to so far. It was more of a startup pitch gathering, complete with pitch training and whatnot. I was repeatedly asked by other participants “so, how do you want to monetise your idea?”. My answer was simple: I...


In search of an incredible posterior

June 22, 2016

What is credibility? For over one hundred years, actuaries have been wrestling with the idea of “credibility”. This is the process whereby one may make a quantitative assessment of the predictive power of sample data. Where necessary, the researcher augments the sample with some exogenous information - usually more data - to arrive at a final conclusion. In...


y-aware scaling in context

June 22, 2016

Nina Zumel introduced y-aware scaling in her recent article Principal Components Regression, Pt. 2: Y-Aware Methods. I really encourage you to read the article and add the technique to your repertoire. The method combines well with other methods and can drive better predictive modeling results. From feedback I am not sure everybody noticed that in … Continue reading...


Risk Models with Generalized PLS

June 12, 2016

While developing risk models with hundreds of potential variables, we often run into the situation that risk characteristics or macro-economic indicators are highly correlated, a problem known as multicollinearity. In such cases, we might have to drop variables with high VIFs or employ “variable shrinkage” methods, e.g. lasso or ridge, to suppress collinear variables. Feature extraction approaches


Why you should read Nina Zumel’s 3 part series on principal components analysis and regression

June 9, 2016

Short form: Win-Vector LLC’s Dr. Nina Zumel has a three part series on Principal Components Regression that we think is well worth your time. Part 1: the proper preparation of data (including scaling) and use of principal components analysis (particularly for supervised learning or regression). Part 2: the introduction of y-aware scaling to direct the … Continue reading...


What are the Best Machine Learning Packages in R?

June 6, 2016

Guest post by Khushbu Shah. The most common question asked by prospective data scientists is – “What is the best programming language for Machine Learning?” The answer to this question always results in a debate over whether to choose R, Python or MATLAB for Machine Learning. Nobody can, in reality, answer the question as to whether Python or R is best...

