When are Citi Bikes Faster than Taxis in New York City?

September 26, 2017

Every day in New York City, millions of commuters take part in a giant race to determine transportation supremacy. Cars, bikes, subways, buses, ferries, and more all compete against one another, but we never get much explicit feedback as to who “wins.” I’ve previously written about NYC’s public taxi data and Citi Bike share data, and it occurred to...

Read more »

Data.Table by Example – Part 1

September 26, 2017

For many years, I actively avoided the data.table package and preferred to utilize the tools available in either base R or dplyr for data aggregation and exploration. However, over the past year, I have come to realize that this was a mistake. Data tables are incredible and provide R users with a syntactically concise and...

Read more »

How to Create an Interactive Infographic

September 25, 2017

An interactive infographic can be used to communicate a lot of information in an engaging way. With the right tools, they are also relatively straightforward to create. In this post, I show step-by-step how to...

Read more »

Regular Expression Searching within Shiny Selectize Objects

September 25, 2017

regexSelect is a small package that uses Shiny modules to solve a problem in Shiny selectize objects: regular expression (regex) searching. You can quickly filter the values in the selectize object, while being able to add that new regex query to the selectize list. This is great for long lists, since you can return multiple items simultaneously without needing...

Read more »

What is the appropriate population scaling of the Affordable Care Act Funding?

September 25, 2017

I have been trying to decipher for myself what is in the current (well, yesterday’s) Graham-Cassidy health care bill. I saw this image on many news outlets a few days ago and my inner hate for pie charts bubbled up. This is a zoom in on the pie chart … From what I can gather, these figures are attempting to...

Read more »

News Roundup from Microsoft Ignite

September 25, 2017

It's been a big day for the team here at Microsoft, with a flurry of announcements from the Ignite conference in Orlando. We'll provide more in-depth details in the coming days and weeks, but for now here's a brief roundup of the news related to data science: Microsoft ML Server 9.2 is now available. This is the new name...

Read more »

Custom Level Coding in vtreat

September 25, 2017

One of the services that the R package vtreat provides is level coding (what we sometimes call impact coding): converting the levels of a categorical variable to a meaningful and concise single numeric variable, rather than coding them as indicator variables (AKA "one-hot encoding"). Level coding can be computationally and statistically preferable to one-hot encoding …

Read more »

Time Series Analysis in R Part 2: Time Series Transformations

September 25, 2017

In Part 1 of this series, we got started by looking at the ts object in R and how it represents time series data. In Part 2, I’ll discuss some of the many time series transformation functions that are available in R. This is by no means an exhaustive catalog. If you feel I left...

Read more »

Speeding Up Digital Arachnids

September 25, 2017

spiderbar, spiderbar
Reads robots rules from afar.
Crawls the web, any size;
Fetches with respect, never lies.
Look Out! Here comes the spiderbar.

Is it fast? Listen bud,
It's got C++ under the hood.
Can you scrape, from a site?
Test with can_fetch(), TRUE == alright
Hey, there
There goes the spiderbar.

(Check the end...

Read more »

Survival Analysis with R

September 24, 2017

With roots dating back to at least 1662 when John Graunt, a London merchant, published an extensive set of inferences based on mortality records, survival analysis is one of the oldest subfields of Statistics. Basic life-table methods, including techniques for dealing with censored data, were discovered before 1700, and in the early eighteenth century, the old masters...

Read more »

Super excited for R promises

September 24, 2017

We at Appsilon are excited that RStudio will introduce promises in R quite soon, which is going to be a huge step forward in programming in R (we have already used futures and similar libraries to run code asynchronously, however this is going to be a sta...

Read more »

eXtremely Boost your machine learning Exercises (Part-1)

September 24, 2017

eXtreme Gradient Boosting is a machine learning model which became really popular a few years ago after winning several Kaggle competitions. It is a very powerful algorithm that uses an ensemble of weak learners to obtain a strong learner. Its R implementation is available in the xgboost package and it is really worth including into anyone’s machine learning...

Read more »

RcppGSL 0.3.3

September 24, 2017

A maintenance update, RcppGSL 0.3.3, is now on CRAN. It switched the vignette to our new pinp package and its two-column pdf default. The RcppGSL package provides an interface from R to the GNU GSL using the Rcpp package. No user-facing new code or...

Read more »

Postgresql + R Sandbox

September 23, 2017

ElephantSQL offers a free instance of Postgresql, with a limit of 20 MB and 5 concurrent connections. For example, you can upload a shiny application that depends on data from ElephantSQL. You only need to register to the site and automat...

Read more »

RcppCNPy 0.2.7

September 23, 2017

A new version of the RcppCNPy package arrived on CRAN yesterday. RcppCNPy provides R with read and write access to NumPy files thanks to the cnpy library by Carl Rogers. This version updates internals for function registration, but otherwise mostly s...

Read more »

RcppClassic 0.9.8

September 23, 2017

A bug-fix release, RcppClassic 0.9.8, follows the very recent 0.9.7 release and fixes a build issue on macOS introduced in 0.9.7. No other changes. Courtesy of CRANberries, there are changes relative to the previous release. Questions, comments etc shoul...

Read more »

Upcoming data preparation and modeling article series

September 23, 2017

I am pleased to announce that vtreat version 0.6.0 is now available to R users on CRAN. vtreat is an excellent way to prepare data for machine learning, statistical inference, and predictive analytic projects. If you are an R user we strongly suggest you incorporate vtreat into your projects. vtreat handles, in a statistically sound …

Read more »

Hacking statistics or: How I Learned to Stop Worrying About Calculus and Love Stats Exercises (Part-9)

September 23, 2017

Statistics are often taught in school by and for people who like Mathematics. As a consequence, in those classes emphasis is put on learning equations, solving calculus problems and creating mathematical models instead of building an intuition for probabilistic problems. But, if you read this, you know a bit of R programming and have access...

Read more »

How Random Forests improve simple Regression Trees?

September 22, 2017

By Gabriel Vasconcelos. Regression Trees: In this post I am going to discuss some features of Regression Trees and Random Forests. Regression Trees are known to be very unstable; in other words, a small change in your data may drastically …

Read more »

Welcome to R/exams

September 22, 2017

Welcome everybody, we are proud to introduce the brand new web page and blog http://www.R-exams.org/. This provides a central access point for the open-source software “exams” implemented in the R system for statistical computing. R/exams is a one-...

Read more »

Big Data Analytics with H20 in R Exercises -Part 1

September 22, 2017

We have dabbled with RevoScaleR before. In this exercise we will work with H2O, another high performance R library which can handle big data very effectively. It will be a series of exercises with increasing degree of difficulty, so please do this in sequence. H2O requires you to have Java installed...

Read more »

Tutorial: Launch a Spark and R cluster with HDInsight

September 22, 2017

If you'd like to get started using R with Spark, you'll need to set up a Spark cluster and install R and all the other necessary software on the nodes. A really easy way to achieve that is to launch an HDInsight cluster on Azure, which is just a managed Spark cluster with some useful extra components. You'll just...

Read more »

Multi-Dimensional Reduction and Visualisation with t-SNE

September 22, 2017

t-SNE is a very powerful technique that can be used for visualising (looking for patterns in) multi-dimensional data. Great things have been said about this technique. In this blog post I did a few experiments with t-SNE in R to learn about this technique and its uses. Its power to visualise complex multi-dimensional data is...

Read more »

My advice on dplyr::mutate()

September 22, 2017

There are substantial differences between ad-hoc analyses (be they machine learning research, data science contests, or other demonstrations) and production-worthy systems. Roughly: ad-hoc analyses have to be correct only at the moment they are run (and often once they are correct, that is the last time they are run; obviously the idea of reproducible …

Read more »

Mining USPTO full text patent data – Analysis of machine learning and AI related patents granted in 2017 so far – Part 1

September 21, 2017

The United States Patent and Trademark Office (USPTO) provides immense amounts of data (the data I used are in the form of XML files). After coming across these datasets, I thought that it would be a good idea to explore where and how my areas of interest fall into the intellectual property space; my areas of interest being machine...

Read more »

Will Stanton hit 61 home runs this season?

September 21, 2017

So far, Giancarlo Stanton has hit 56 home runs in 555 at bats over 149 games. Miami has 10 games left to play. What’s the chance he’ll...

Read more »

Pirating Pirate Data for Pirate Day

September 21, 2017

This past Tuesday was Talk Like A Pirate Day, the unofficial holiday of R (aRRR!) users worldwide. In recognition of the day, Bob Rudis used R to create this map of worldwide piracy incidents from 2013 to 2017. The post provides a useful and practical example of extracting data from a website without an API, otherwise known as "scraping"...

Read more »

Exploratory Data Analysis of Tropical Storms in R

September 21, 2017

The disastrous impact of recent hurricanes, Harvey and Irma, generated a large influx of data within the online community. I was curious about the history of hurricanes and tropical storms so I found a data set on data.world and started some basic exploratory data analysis (EDA). EDA...

Read more »

Gold-Mining – Week 3 (2017)

September 21, 2017

Week 3 Gold Mining and Fantasy Football Projection Roundup now available. Go get that free agent gold! The post Gold-Mining – Week 3 (2017) appeared first on Fantasy Football Analytics.

Read more »
