Using Gradient Boosted Machine to Predict MPG for 2019 Vehicles

June 18, 2019

Continuing from the post linked below, I am going to use a gradient boosted machine model to predict combined miles per gallon for all 2019 motor vehicles. Part 1: Using Decision Trees and Random Forest to Predict MPG for 2019 Vehicles. The raw data is located on the EPA government site. The variables/features I am using for the models are: Engine...

Cohen’s D for Experimental Planning

June 18, 2019

In this note, we discuss the use of Cohen’s D for planning difference-of-mean experiments. Estimating sample size: Let’s imagine you are testing a new weight loss program and comparing...

Quick hit: Some ggplot2 Stat 💙 for {logspline}

June 18, 2019

I’ve become a big fan of the {logspline} package over the past ~6 months and decided to wrap up a manual ggplot2 plotting process (well, it was at least...

New Versions of R GUIs: BlueSky, JASP, jamovi

June 18, 2019

It has been only two months since I summarized my reviews of point-and-click front ends for R, and it’s already out of date! I have converted that post into...

How to Perform Ordinal Logistic Regression in R

June 18, 2019

In this article, we discuss the basics of ordinal logistic regression and its implementation in R. Ordinal logistic regression is a widely used classification method, with applications in a variety...

anytime 0.3.4

June 18, 2019

A new minor release of the anytime package is arriving on CRAN. This is the fifteenth release, and first since the 0.3.3 release in November. anytime is a very...

Understanding AdaBoost – or how to turn Weakness into Strength

June 18, 2019

Many of you might have heard of the concept “Wisdom of the Crowd”: when many people independently guess some quantity, e.g. the number of marbles in a glass jar,...

radian: a modern console for R

June 18, 2019

Whenever I’m developing R code or writing data wrangling or analysis scripts for research projects that I work on, I use Emacs and its add-on package Emacs Speaks Statistics...

Parametric survival modeling

June 17, 2019

Introduction; Survival distributions; Shapes of hazard functions; Exponential distribution; Weibull distribution (AFT); Weibull distribution...

Getting from flat data a world of relationships to visualise with Gephi

June 17, 2019

by Mariluz Congosto. Network analysis offers a perspective on the data that broadens and enriches any investigation. Many times we deal with...

Visualizing the Copa América: Historical Records, Squad Profiles, and Player Profiles with xG statistics!

June 17, 2019

Another summer and another edition of the Copa América!...

Le Monde puzzle [#1104]

June 17, 2019

A palindromic Le Monde mathematical puzzle: In a monetary system where all palindromic amounts between 1 and 10⁸ have a coin, find the numbers less than 10³ that cannot...

On my way to Manizales (Colombia)

June 16, 2019

Next week, I will be in Manizales, Colombia, for the Third International Congress on Actuarial Science and Quantitative Finance. I will be giving a lecture on Wednesday with Jed...

Forecasting tools in development

June 16, 2019

As I’ve been writing up a progress report for my NIGMS R35 MIRA award, I’ve been reminded of how much of the work that we’ve been doing is focused...

modelDown is now on CRAN!

June 16, 2019

The modelDown package turns classification or regression models into HTML static websites. With one command you can convert one or more models into a website with visual and tabular...

‘Simulating genetic data with R: an example with deleterious variants (and a pun)’

June 16, 2019

A few weeks ago, I gave a talk at the Edinburgh R users group EdinbR on the RAGE paper. Since this is an R meetup, the talk concentrated on...

Introducing the {ethercalc} package

June 15, 2019

I mentioned EtherCalc in a previous post and managed to scrounge some time to put together a fledgling {ethercalc} package (it’s also on GitLab, SourceHut, Bitbucket and GitUgh, just...

Exploring Categorical Data With Inspectdf

June 14, 2019

Exploring categorical data with inspectdf

Stabilising transformations: how do I present my results?

ANOVA is routinely used in applied biology for data analyses, although, in some instances, the basic assumptions of normality and homoscedasticity of residuals do not hold. In those instances,...

Fun with R and the Noops

June 14, 2019

Earlier this week, GitHub introduced Noops, a collection of simple black-box machines with API endpoints, with the goal of challenging developers of all skill levels to solve problems with...

EARL London keynote announcement: Helen Hunter, Sainsbury’s

June 14, 2019

We are delighted to announce that Helen Hunter, Chief Data Officer at Sainsbury’s, will deliver the opening keynote address at this year’s London EARL conference. As Chief Data Officer...

Periodogram with R

June 13, 2019

The power spectral density (PSD) is a function that describes the distribution of power over the frequency components composing our data set. If we knew the...

Fixing your mistakes: sentiment analysis edition

June 13, 2019

Today tidytext 0.2.1 is available on CRAN! This new release of tidytext has a collection of nice new features. Bug squashing! 🐛 Improvements to error messages and documentation 📃 Switching from broom...

#rstats adventures in the land of @rstudio shiny (apps)

June 13, 2019

Preamble: Colleagues and I had some sweet telemetry data, we did some simple models (& some relatively more complex ones too), we drew maps, and we wrote a paper. However,...

Polygon plotting in R

June 13, 2019

As a data analyst you want to provide clear-cut insights for your end users, enabling them to extract all the business value provided by your solution. If your...

R vs. Python

June 13, 2019

For some time, I’ve planned to write up a point-by-point comparison of R and Python. I’ve done so now! Comments welcome.

Equal Size kmeans

June 12, 2019

We were recently presented with a problem where the decision maker wanted to understand how their data would naturally group together. The classic technique of k-means clustering was a...

RStudio Connect 1.7.4.2 – Important Security Patch

June 12, 2019

This RStudio Connect patch release addresses an urgent security update and an important bug fix. Security Update: Password Authentication. A vulnerability has been identified for customers using RStudio Connect’s built-in password...

Community Call – Involving Multilingual Communities

rOpenSci’s community is increasingly international and multilingual. While we have operated primarily in...
