354 search results for "pca"

R in big data pipeline

August 15, 2015

R is my favorite tool for research. There are still quite a few things that only R can do, or that are quicker and easier with R. Unfortunately, many people think R becomes less powerful at the production stage, where you really need to make sure all the functionality runs as planned against incoming big data. Personally, what makes R special in...

R News From JSM 2015

August 13, 2015

by Joseph Rickert

We can declare 2015 the year that R went mainstream at the JSM. There is no doubt about it, the calculations, visualizations and deep thinking of a great many of the world's statisticians are rendered or expressed in R and the JSM is with the program. In 2013 I was happy to have stumbled into a...

Matrix Factorization Comes in Many Flavors: Components, Clusters, Building Blocks and Ideals

August 6, 2015

Unsupervised learning is covered in Chapter 14 of The Elements of Statistical Learning. Here we learn about several data reduction techniques including principal component analysis (PCA), K-means clustering, nonnegative matrix factorization (NMF) ...

Peeling of group layers.

August 6, 2015

As an experienced dplyr user since almost day one, I thought I knew every aspect of it. But when my new colleague, who is learning dplyr from scratch, asked me to explain the peeling of group layers with summarise, I was like, what? Turns out this actually is a thing. Let me show the example from the dplyr introduction: library(dplyr) library(nycflights13) daily...

partools: a Sensible R Package for Large Data Sets

August 5, 2015

As I mentioned recently, the new, greatly extended version of my partools package is now on CRAN. (The current version on CRAN is 1.1.3, whereas at the time of my previous announcement it was only 1.1.1. Note that Unix is NOT required.) It is my contention that for most R users who work with large …

Multivariate Techniques in Python: EcoPy Alpha Launch!

August 3, 2015

I’m announcing the alpha launch of EcoPy: Ecological Data Analysis in Python. EcoPy is a Python module that contains a number of techniques (PCA, CA, CCorA, nMDS, MDS, RDA, etc.) for exploring complex multivariate data. For those of you familiar …

Feature Engineering versus Feature Extraction: Game On!

August 3, 2015

"Feature engineering" is a fancy term for making sure that your predictors are encoded in the model in a manner that makes it as easy as possible for the model to achieve good performance. For example, if you have a date field as a predictor and there are larger differences in response for the weekends versus the weekdays, then...

Seattle’s Fremont Bridge Bicyclists Again in the News

August 2, 2015

Back in 2013, David Smith did an analysis of bicycle trips across Seattle’s Fremont Bridge. More recently, Jake Vanderplas (creator of Python’s very popular Scikit-learn package) wrote a nice blog post on “Learning Seattle Work habits from bicycle counts” at …

R tutorial on the Apply family of functions

July 28, 2015

Introduction: In our previous tutorial, Loops in R: Usage and Alternatives, we discussed one of the most important constructs in programming: the loop. Eventually we deprecated the usage of loops in R in favor of vectorized functions. In this post we highlight some of the most used vectorized functions: the apply functions. In the present post we show the use...

Pull the (character) strings with stringi 0.5-2

June 23, 2015

A reliable string processing toolkit is a must-have for any data scientist. A new release of the stringi package is available on CRAN (please wait a few days for Windows and OS X binary builds). As for now, about 850…
