Monthly Archives: May 2016

Scientific RMarkdown

May 31, 2016

Recently, in my own little scientific community bubble, there has been increasing interest in markdown and its use for science. As a big fan of markdown and especially rmarkdown, I created the following cheat sheet and shared it at a couple of events. Sinc...

heatmaply: interactive heat maps (with R)

May 31, 2016

I am pleased to announce heatmaply, my new R package for generating interactive heat maps, based on the plotly R package.

tl;dr: By running the following 3 lines of code:

    install.packages("heatmaply")
    library(heatmaply)
    heatmaply(mtcars, k_col = 2, k_row = 3) %>% layout(margin = list(l = 130, b = 40))

you will get this output in your browser …

Running similar but independent jobs in parallel on Aster with R

May 31, 2016

It is no surprise that Teradata Aster runs each SQL, SQL-MR, and SQL-GR command in parallel on many clusters with distributed data. But when faced with the task of running many similar but independent jobs, one has to do extra work to parallelize them in As...

Expedia Data Analysis Part 1

May 31, 2016

Contents: Expedia Hotel Recommendations; Hotel Cluster – Mobile – Package Relationship; Channel of Marketing.

Expedia Hotel Recommendations: This dataset can be found at Kaggle. We are given logs of visitors at different Expedia sites and are asked to predict the hotel clusters in the test set. Expedia aims to use customer data to improve their hotel …

Happy New Year, Mr. President. Data and Sentiment Analysis of Presidential New Year Speeches

May 31, 2016

By Salvino A. Salvaggio. At a moment when many are preparing for the December 31st evening cocktail, the End of Year speech of the President of the Italian Republic is broadcast right on time at 8:30pm. A tradition which came to be with the constitutional establishment...

Principal Components Regression in R: Part 3

May 31, 2016

By John Mount, Ph.D., Data Scientist at Win-Vector LLC. In her series on principal components analysis for regression in R, Win-Vector LLC's Dr. Nina Zumel broke the demonstration down into the following pieces. Part 1: the proper preparation of data and use of principal components analysis (particularly for supervised learning or regression). Part 2: the introduction of y-aware...

Predictive Bookmaker Consensus Model for the UEFA Euro 2016

May 31, 2016

(By Achim Zeileis) From 10 June to 10 July 2016 the best European football teams will meet in France to determine the European Champion in the UEFA European Championship 2016 tournament. For the first time 24 teams compete, expanding the format from 16 teams as in the previous five Euro tournaments. For forecasting the winning probability of each team...

Understanding beta binomial regression (using baseball statistics)

May 31, 2016

Previously in this series: Understanding the beta distribution; Understanding empirical Bayes estimation; Understanding credible intervals; Understanding the Bayesian approach to false discovery rates; Understanding Bayesian A/B testing.

In this series we’ve been using the empirical Bayes method to estimate batting averages of baseball players. Empirical Bayes is useful here because when we...

QGIS, Open Source GIS & R

May 31, 2016

Today’s post is by Kurt Menke, the owner of Bird’s Eye View GIS, a GIS consultancy. Kurt also wrote the book Mastering QGIS. In my latest course (Shapefiles for R Programmers) I briefly introduce people to QGIS. Kurt’s post below gives you a roadmap for learning more. I come to this blog from a slightly different...

How to use data analysis for machine learning (example, part 1)

May 31, 2016

In my last article, I stated that for practitioners (as opposed to theorists), the real prerequisite for machine learning is data analysis, not math. One of the main reasons for making this statement is that data scientists spend an inordinate amount of time on data analysis. The traditional statement is that data scientists “spend 80%...
