Figure Aesthetics or Overlays?

May 2, 2017

Tinkering with a new chart type over the weekend, I spotted something rather odd in my F1 track history charts – what look to be outliers in the form of cars that hadn’t been lapped on that lap appearing behind the lap leader of the next lap, on track. If you count the number

New in the tigris package: simple features support and historic shapefiles

May 2, 2017

I am excited to announce that tigris 0.5 is now on CRAN. This is a major release that has been in the works for several months. Get it with install.packages("tigris"). One major new feature is support for the simple features data model via the sf R package. sf allows for the representation of spatial objects in...

Getting started with data science – recommended resources

May 2, 2017

An oft-asked question is: what resources can I recommend for getting started with data science? Here are my recommendations, and if you have others, please put them in the comments! NB: Links in this post may be affiliate links. The post Getting started with data science – recommended resources appeared first on Locke Data. Locke...

Oakland Real Estate Prices – does month of year matter?

I’ve started doing some basic analysis on Oakland real estate prices over the past decade (multi-tenant buildings only).  There’s still a lot to unpack here, but I’m only able to investigate this 30 minutes at a time (new dad life), so I’ll be making lots of short posts on sub-topics.  The first one I wanted

Extracting data from Twitter for #machinelearningflashcards

May 1, 2017

I’m a fan of Chris Albon’s recent project #machinelearningflashcards on Twitter where generalized topics and methodologies are drawn out with key takeaways. It’s a great approach to sharing concepts about machine learning for everyone a...

How to Establish a Web Presence as an R User and Why It’s Important

May 1, 2017

If you are a developer using the R environment to do your programming work, you are probably feeling left out and a bit segregated from the rest of the programming industry. It’s true that not many people know about the R language and what its uses are; however, things have been...

Taking control of animations in R and demystifying them in the process

A while ago (a very long time ago some would say) I showed how I had created my logo using R. In that post I left on the bombshell that I would return and show you how it is possible to add some fancy animation to it. The time to do that is now! Duri...

Update to autoencoders and anomaly detection with machine learning in fraud analytics

May 1, 2017

This is a reply to Wojciech Indyk’s comment on yesterday’s post on autoencoders and anomaly detection with machine learning in fraud analytics: “I think you can improve the detection of anomalies if you change the training set to the deep-autoen...

Visualizing Tennis Grand Slam Winners Performances

May 1, 2017

Visualizing historical sports results is one way to compare champions’ strengths and weaknesses. In this tutorial, we show which kinds of plots can help with comparing champions’ performances, visualizing timelines, and exploring player-to-player and player-to-tournament relationships. We are going to use the Tennis Grand Slam Tournaments results as outlined by

Using Microsoft R with Alteryx

May 1, 2017

Alteryx Designer, the self-service analytics workflow tool, recently added integration with Microsoft R. This allows you to train models provided by Microsoft R, and create predictions from them, without needing to write R code — you simply drag-and-drop to create a workflow. In a recent post at the Microsoft R blog, Bharath Sankaranarayan walks through the process of building...

Forecasting: Multivariate Regression Exercises (Part-4)

May 1, 2017

In the previous exercises of this series, forecasts were based only on an analysis of the forecast variable. Another approach to forecasting is to use external variables, which serve as predictors. This set of exercises focuses on forecasting with the standard multivariate linear regression. Running regressions may appear straightforward, but this method of forecasting is

Upcoming Talk on Monetizing R Packages

May 1, 2017

In early June I will be speaking at the San Francisco EARL Conference about my experience monetizing my own open source R packages. This is quite... The post Upcoming Talk on Monetizing R Packages appeared first on AriLamstein.com.

Prediction intervals for GLMs part II

May 1, 2017

One of my more popular answers on StackOverflow concerns the issue of prediction intervals for a generalized linear model (GLM). Comments, even on StackOverflow, aren’t a good place for a discussion, so I thought I’d post something here on my blog that went into a bit more detail as to why, for some common types of GLMs, prediction intervals...

RPostgreSQL and schemas

May 1, 2017

A PostgreSQL database can have different schemas. These work like a window for users, where they get to see specific things within a database, e.g. tables. In this post we’ll look at how…

Prediction intervals for GLMs part I

May 1, 2017

One of my more popular answers on StackOverflow concerns the issue of prediction intervals for a generalized linear model (GLM). My answer really only addresses how to compute confidence intervals for parameters but in the comments I discuss the more substantive points raised by the OP in their question. Lately there’s been a bit of back and forth...

A shiny app to convert sports scores

May 1, 2017

I’m a huge sports fan, but I certainly don’t have extensive knowledge about all team sports. Sometimes when I hear about scores in a sport I’m not quite “fluent” in, I wonder how they would translate to a sport I know better. I guess many people ask the same question from time to time. For …

Track Concordance Charts

May 1, 2017

Since getting started with generating templated R reports a few weeks ago, I’ve started spending the odd few minutes every race weekend looking at ways of automating the generation of F1 qualifying and race reports. In yesterday’s race, some of the commentary focussed on whether MAS had given BOT an “assist” in blocking VET,

How to create your first vector in R

May 1, 2017

Are you an expert R programmer? If so, this is *not* for you. This is a short tutorial for R novices, explaining vectors, a basic R data structure. Here’s an example: 10 150 30 45 20.3 And here’s another one: -5 -4 -3 -2 -1 0 1 2 3 still another one: "Darth Vader" "Luke
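
As a rough illustration of the kind of vectors this teaser lists, here is a minimal base R sketch. It is not taken from the linked tutorial; the variable names `x`, `y` and `z`, and the strings in the character vector, are placeholders chosen for this example.

```r
# Numeric vector built with c(), using the values from the first example
x <- c(10, 150, 30, 45, 20.3)

# Integer sequence from -5 to 3, built with the colon operator
y <- -5:3

# Character vector (placeholder strings, not from the original post)
z <- c("alpha", "beta", "gamma")

length(x)  # number of elements in x
class(y)   # storage class of the sequence
z[1]       # first element; R indexes from 1
```

`c()` combines its arguments into a single vector, while `:` is a shortcut for consecutive integers.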

An Interactive Geospatial Data Digestion Framework Implemented in R with Shiny — A US County Example

May 1, 2017

Where are the best places to live? How do you answer this question? If you turn to Google, there are many "top 10" lists, generated by... The post An Interactive Geospatial Data Digestion Framework Implemented in R with Shiny -- A US County Example appeared first on NYC Data Science Academy Blog.

Chat with the rOpenSci team at upcoming meetings

May 1, 2017

You can find members of the rOpenSci team at various meetings and workshops around the world. Come say 'hi', learn about how our packages can enable your research, or about our onboarding process for contributing new packages, discuss software sustainability or tell us how we can help you do open and reproducible research. Where's rOpenSci? When ...

A Movie Lover’s Guide to New York

April 30, 2017

The web application NYC movie locations guide is a tool designed to aid the visualization, analysis and exploration of movies that have been filmed in New... The post A Movie Lover’s Guide to New York appeared first on NYC Data Science Academy Blog.

Real Estate Investment: Buy to Sell or Buy to Rent?

April 30, 2017

Real Estate Investment: If you are looking for an investment to generate supplemental income at low risk, then real estate is a good option to consider. Just... The post Real Estate Investment: Buy to Sell or Buy to Rent? appeared first on NYC Data Science Academy Blog.

The Hydro Network-Linked Data Index

April 30, 2017

Introduction: The Hydro Network-Linked Data Index (NLDI) is a system that can index data to NHDPlus V2 catchments and offers a search service to discover indexed information. Data linked to the NLDI includes active NWIS stream gages, water quality portal sites, and outlets of HUC12 watersheds. The NLDI is a core product of the

Autoencoders and anomaly detection with machine learning in fraud analytics

April 30, 2017

All my previous posts on machine learning have dealt with supervised learning. But we can also use machine learning for unsupervised learning. The latter is used, e.g., for clustering and (non-linear) dimensionality reduction. For this task, I am using...

#6: Easiest package registration

April 30, 2017

Welcome to the sixth post in the really random R riffs series, or R4 for short. Posts #1 and #2 discussed how to get the now de rigueur package registration information computed. In essence, we pointed to something which R 3.4.0 would have, and provided tricks for accessing it while R 3.3.3 was still R-released. But now...

A Traveler’s Guide to the Sky

April 30, 2017

Intro: Maybe you’ve been here before. After a long flight, you make your way down to baggage claim, ready to grab your bags and finally make... The post A Traveler's Guide to the Sky appeared first on NYC Data Science Academy Blog.

Shiny Application Layouts exercises part 3

April 30, 2017

Shiny Application Layouts: Navbar Page. In the third part of our series we will build another small Shiny app but use another UI. Specifically, we are going to create a Shiny application that includes more than one distinct sub-component, each with its own characteristics. For our case we are going to use the cars dataset to

Patterns in Stock Indices over Time

April 30, 2017

The traditional advice given to investors is 'buy and hold'. I decided to investigate this claim and see what patterns I could find within the major... The post Patterns in Stock Indices over Time appeared first on NYC Data Science Academy Blog.

Shiny server series part 4: adding user authentication using Auth0

April 30, 2017

This guide is part of a series on setting up your own private server running shiny apps. There are many guides with great advice on how to set up an R shiny server and related software. I try to make a comprehensive guide based in part on these resourc...
