Performance differences between RevoScaleR, ColumnStore Table and In-Memory OLTP Table

April 28, 2017

This blog post compares running RevoScaleR computational functions on *.XDF files against having the dataset available in a Columnstore table or in an In-Memory OLTP table. For this test, I will use the AirLines dataset, available here. I have deliberately picked a 200 MB sample (of the 13 GB dataset) in order to properly test …

Make pleasingly parallel R code with rxExecBy

April 28, 2017

Some things are easy to convert from a long-running sequential process to a system where each part runs at the same time, thus reducing the required time overall. We...

Are the Kids Alright? Trends in Teen Risk Behavior in NYC

April 28, 2017

Contributed by Brandy Freitas as part of the Spring 2017 NYC Data Science Academy 12-Week Data Science Bootcamp. This post is based on the first class...

Data science for Doctors: Variable importance Exercises

April 28, 2017

Data science enhances people’s decision making. Doctors and researchers are making critical decisions every day. Therefore, it is absolutely necessary for those people to have some basic knowledge of...

Machine Learning Classification Using Naive Bayes

April 28, 2017

We will develop a classification exercise using the Naive Bayes algorithm. The exercise was originally published in "Machine Learning with R" by Brett Lantz, Packt Publishing 2015 (open...

Retrieving Reading Levels with R

April 28, 2017

For those who don't work in education or aren't aware, there is a measurement for a child's reading level called a Lexile® Level. There are ways...

R Quick Tip: Upload multiple files in shiny and consolidate into a dataset

April 28, 2017

In shiny, you can use fileInput with the parameter multiple = TRUE to upload multiple files at once. But how do you process those multiple...

Beautiful boxplots in base R

April 28, 2017

As many of you will be aware, I like to post some R code, and I especially like to post base R versions of ggplot2 things! Well these amazing...

Meetup: Machine Learning in Production with Szilard Pafka

April 28, 2017

Machine Learning in Production by Szilard Pafka. In this talk I will discuss the main...

Salaries by alma mater – an interactive visualization with R and plotly

April 27, 2017

Based on an interesting dataset from the Wall Street Journal I made the above visualization of the median starting salary for US college graduates from different undergraduate institutions (I...

NY R Conference

April 27, 2017

The 2017 New York R Conference was held last weekend in Manhattan. For the third consecutive year, the organizers -...

a secretary problem with maximum ability

April 27, 2017

The Riddler of today has a secretary problem, where one measures sequentially N random variables until one deems the current variable to be the largest of the whole sample....

Where Europe lives, in 14 lines of R Code

April 27, 2017

Via Max Galka, always a great source of interesting data visualizations, we have this lovely visualization of population density in Europe in 2011, created by Henrik Lindberg: Impressively, the...

Load, Save, and .rda files

April 27, 2017

A couple of weeks ago I stumbled across a feature in R that I had never heard of before: the functions save() and load(), and the R file type .rda. The...

Data Science for Operational Excellence (Part-4)

April 27, 2017

Suppose your friend is a restaurant chain owner (only 3 units) facing challenges from competitors related to low prices; let's call it a price war. Inside his business he...

Overcome the Fear of Programming

April 27, 2017

By Milind Paradkar. You say you have never programmed in your life before? Never heard of words like Classes and Objects, Dataframe, Methods, Inheritance, Loops? Are you fearful of programming, huh?...

Population Lines: How and Why I Created It

April 27, 2017

Thanks to the power of Reddit the “Population Lines” print (buy here) I created back in 2013 has attracted a huge amount of interest in the past week or so...

Genetic Music: From Schoenberg to Bach

April 27, 2017

Bach, the epitome of a musician who strove all life long and finally acquired the ‘Habit of Perfection’, was a thoroughly imperfect human being (John Eliot Gardiner, Bach: Music...

Gender and verbs across 100,000 stories: a tidy analysis

April 27, 2017

Previously in this series: Examining the arc of 100,000 stories. I was fascinated by my colleague Julia Silge’s recent blog post on what verbs tend to occur after...

Creating a VIX Futures Term Structure In R From Official CBOE Settlement Data

April 27, 2017

This post details a process for creating a VIX term structure from freely available CBOE VIX settlement data …

Welcome to our rOpenSci Interns

There's a lot of work that goes into making software: the code that does the thing itself, unit testing, examples, tutorials, documentation, and support. rOpenSci software is created...

Euler Problem 18 & 67: Maximum Path Sums

April 26, 2017

Proposed solution to Euler Problem 18 in the R language. Find the maximum total from top to bottom of a triangle consisting of numbers...

Using NYC Citi Bike Data to Help Bike Enthusiasts Find their Mate

April 26, 2017

There is no shortage of analyses on the NYC bike share system. Most of them aim at predicting the demand for bikes and balancing bike stock...

Online courses (in R, python, and data science) at Udemy for only $10 – until April 29th

April 26, 2017

In order to get the discount, simply choose a link below and, when paying, use the promo code: 10APR303. Udemy is offering readers of R-bloggers access to its global online...

qualtRics 1.0 now available from CRAN

April 26, 2017

Qualtrics allows users to collect online data through surveys. My R package qualtRics contains convenience functions to pull survey results straight into R using the Qualtrics API...

Assorted Shiny apps collection, full code and data

April 26, 2017

Here is an assortment of R Shiny apps that you may find useful for exploration if you are in the process of learning Shiny and looking for something different....

Shiny Application Layouts Exercises (Part-2)

April 26, 2017

SHINY APPLICATION LAYOUTS: PLOT PLUS COLUMNS. In the second part of our series we will build another small shiny app but use another UI. More specifically we will present the...

dv01 uses R to bring greater transparency to the consumer lending market

April 26, 2017

The founder of the NYC-based startup dv01 watched the 2008 financial crisis and was inspired to bring greater transparency to institutional investors in the consumer lending market. Despite being...

Binning Outliers in a Histogram

April 26, 2017

I guess we all use it, the good old histogram. It is one of the first things we are taught in Introduction to Statistics, and it is routinely applied whenever coming across a...
