# 352 search results for "PCA"

## Principal Component Analysis using R

February 27, 2016

Curse of Dimensionality: One of the most commonly faced problems in data analytics tasks such as recommendation engines and text analytics is high-dimensional, sparse data. We often face a situation where we have a large set of features and few data points, or data with very high-dimensional feature vectors. In such scenarios,...

## Nairobi Data Science Meet Up: Finding deep structures in data with Chris Orwa

February 22, 2016

I sat down with a former rugby school captain whose rugby career was cut short by a shoulder injury while playing for Black Blad at Kenyatta University. It is always a great pleasure to talk to someone who is extremely passionate about what he does, and his passion for Data Science was evident during my chat with “BlackOrwa” at iHub...

## Large scale eigenvalue decomposition and SVD with rARPACK

February 21, 2016

In January 2016, I was honored to receive an “Honorable Mention” of the John Chambers Award 2016. This article was written for R-bloggers, whose builder, Tal Galili, kindly invited me to write an introduction to the rARPACK package. A Short Story of rARPACK: Eigenvalue decomposition is a commonly used technique in numerous statistical problems. For example, principal component analysis (PCA) basically conducts eigenvalue...

## Clustering French Cities (based on Temperatures)

February 11, 2016

In order to illustrate hierarchical clustering techniques and k-means, I borrowed François Husson’s dataset, with monthly average temperatures in several French cities: `temp = read.table("http://freakonometrics.free.fr/FR_temp.txt", header=TRUE, dec=",")`. We have 15 cities, with monthly observations: `X = temp`, then `boxplot(X)`. Since the variance seems to be rather stable, we will not ‘normalize’ the variables here: `apply(X, 2, sd)` Janv Fevr Mars...

## Clusters of Texts

February 10, 2016

Another popular application of classification techniques is text mining (see e.g. an old post on French president speeches). Consider the following example, inspired by Norbert Ryciak’s post, with 12 Wikipedia pages on various topics: `library(tm)`, `library(stringi)`, `library(proxy)`, `titles = c("Boosting_(machine_learning)", "Random_forest", "K-nearest_neighbors_algorithm", "Logistic_regression", "Boston_Bruins", "Los_Angeles_Lakers", "Game_of_Thrones", "House_of_Cards_(U.S._TV_series)", "True Detective`...

## Introduction to Statistical Methods in R

January 18, 2016

Data analyses are the product of many different tasks, and statistical methods are one key aspect of any data analysis. There is a common workflow in the related areas of informatics, data mining, data science, machine learning, and statistics. The workflow tasks include data preparation, the development of predictive mathematical models, and the interpretation and...

## RcppParallel: Getting R and C++ to work (some more) in parallel

January 15, 2016

(Post by Dirk Eddelbuettel and JJ Allaire) A common theme over the last few decades was that we could afford to simply sit back and let computer (hardware) engineers take care of increases in computing speed thanks to Moore’s law. That same line of thought now frequently points out that we are getting closer and closer

## Mini AI app using TensorFlow and Shiny

January 14, 2016

tl;dr: Simple image recognition app using TensorFlow and Shiny. About: My weekend was full of deep learning and AI programming, so as a milestone I made a simple image recognition app that takes an image input uploaded to the Shiny UI, performs image recognition using TensorFlow, and plots detected objects and scores in a wordcloud. App: This app is to demonstrate powerful image recognition...

## Revolution R Open Performance Improvements

January 11, 2016

Since last year Revolution Analytics has been publishing beta versions of Revolution R Open, and finally in April this year they released RRO 8.0.3. The current release is RRO 3.2.2 (the naming was adapted to fit the R version it is built upon). This post will give you an introduction to my favorite new features, how...

## Health Care Indicators in Utah Counties

January 10, 2016

The state of Utah (my adopted home) has an Open Data Catalog with lots of interesting data sets, including a collection of health care indicators from 2014 for the 29 counties in Utah. The observations for each county include measurements such as the infant mortality rate, the percent of people who don’t have insurance, what percent of...