# 412 search results for "pca"

## Gotta catch them all

August 21, 2016
Introduction: When data becomes high-dimensional, the inherent relational structure between the variables can become unclear or indistinct. One might want to find clusters for any number of reasons; I want to use it to better unde...

## vtreat 0.5.27 released on CRAN

August 19, 2016
Win-Vector LLC, Nina Zumel and I are pleased to announce that ‘vtreat’ version 0.5.27 has been released on CRAN. ‘vtreat’ is a data.frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner (from the package documentation). Very roughly, ‘vtreat’ accepts an arbitrary “from the wild” data frame (with different column types, …

## What can we learn from the statistics of the EURO 2016 – Application of factor analysis

July 28, 2016
In this post I will try to explain how to perform a factor analysis (FA) on the statistics of the teams in the first round of Euro cup 2016. I assume that you have enough background on the theory of FA, so I will stick to the application of this technique.

## Performing Principal Components Regression (PCR) in R

July 20, 2016
Principal components regression (PCR) is a regression method based on principal component analysis. Discover how to perform this data mining technique in R. The post Performing Principal Components Regression (PCR) in R appeared first on MilanoR.

## rearrange() your correlations with corrr

July 20, 2016
Don’t stare at your correlations in search of variable clusters when you can `rearrange()` them: `library(corrr)` `mtcars %>% correlate() %>% rearrange() %>% fashion()` `#> rowname am gear drat wt disp mpg cyl vs hp carb qsec` `#> 1 am ...`

## Principal Component Analysis Cluster Plots with Plotly

July 19, 2016
The Problem: When clustering data using principal component analysis, it is often of interest to visually inspect how well the data points separate in 2-D space based on their principal component scores. While this is fairly straightforward to visualize with a scatterplot, the plot can quickly become cluttered with annotations.

## vtreat version 0.5.26 released on CRAN

July 12, 2016
Win-Vector LLC, Nina Zumel and I are pleased to announce that ‘vtreat’ version 0.5.26 has been released on CRAN. ‘vtreat’ is a data.frame processor/conditioner that prepares real-world data for predictive modeling in a statistically sound manner (from the package documentation). ‘vtreat’ is an R package that incorporates a number of transforms and simulated out of …

## The Mathematics of Machine Learning

July 8, 2016
This post was first published on my LinkedIn page and is posted here as a contributed post. In the last few months, several people have contacted me about their enthusiasm for venturing into the world of data science and using Machine Learning (ML) techniques to probe statistical regularities and build impeccable data-driven products. However, I’ve observed that some actually lack...

## Build your own offshore company

July 6, 2016
Hackathons are not alike. Recently, a number of this blog’s authors were at a data hackathon, the strangest one we’ve been to so far. It was more of a startup pitch gathering, complete with pitch training and whatnot. I was repeatedly asked by other participants, “So, how do you want to monetise your idea?” My answer was simple: I...

## Interactive flow visualization in R

June 26, 2016
Exploring flows between origins and destinations visually is a common task, but it can be difficult to get right. Many tutorials on the web show how to produce static flow maps in R. Over the past couple of years, R developers have created an infrastructure...