Basics of data.table: Smooth data exploration

August 23, 2017

The data.table package provides perhaps the fastest way to wrangle data in R. The syntax is concise and designed to resemble SQL. After studying the basics of data.table and completing this exercise set, you will be able to start using data.table for all your data manipulation needs. We will use data...

Read more »

Going Bayes #rstats

August 23, 2017

Some time ago I started working with Bayesian methods, using the great rstanarm package. Besides the fantastic package vignettes, and books like Statistical Rethinking or Doing Bayesian Data Analysis, I also...

Read more »

Rcpp now used by 10 percent of CRAN packages

August 23, 2017

Over the last few days, Rcpp passed another noteworthy hurdle. It is now used by over 10 percent of packages on CRAN (as measured by Depends, Imports and...

Read more »

Simple practice: data wrangling the iris dataset

August 23, 2017

If you want to work on large data science projects (analyses and machine learning) you need to be able to perform dozens of small tasks ... For example, you'll need...

Read more »

useR!2017 Roundup

August 23, 2017

Organising useR!2017 was a challenge but a very rewarding experience. With about 1200 attendees of over 55 nationalities exploring an interesting program, we believe it is appropriate to call...

Read more »

Gender roles in film direction, analyzed with R

August 22, 2017

What do women do in films? If you analyze the stage directions in film scripts — as Julia Silge, Russell Goldenberg and Amber Thomas have done for this visual...

Read more »

Caching httr Requests? This means WAR[C]!

August 22, 2017

I’ve blathered about my crawl_delay project before and am just waiting for a rainy weekend to be able to crank out a follow-up post on it. Working on that...

Read more »

Some Neat New R Notations

August 22, 2017

The R package seplyr supplies a few neat new coding notations. The first notation is an operator called the “named map...

Read more »

So you (don’t) think you can review a package

August 22, 2017

Contributing to an open-source community without contributing code is an oft-vaunted idea that can seem nebulous. Luckily, putting vague ideas into action is one of the strengths of the...

Read more »

Onboarding visdat, a tool for preliminary visualisation of whole dataframes

August 22, 2017

“Take a look at the data.” This is a phrase that comes up when you first get a dataset. It is also ambiguous. Does it mean to do some exploratory modelling?...

Read more »

How to Create an Online Choice Simulator

August 21, 2017

What is a choice simulator? A choice simulator is an online app or an Excel workbook that allows users to specify different scenarios and get predictions. Here is an example of...

Read more »

Introducing routr – Routing of HTTP and WebSocket in R

August 21, 2017

routr is now available on CRAN, and I couldn’t be happier. Its release marks the completion of an idea that stretches back longer than my attempts to bring network visualization and...

Read more »

Understanding gender roles in movies with text mining

August 21, 2017

I have a new visual essay up at The Pudding today, using text mining to explore how women are portrayed in film. The R code behind this analysis is publicly...

Read more »

Tidyer BLS data with the blscrapeR package

August 21, 2017

The recent release of the blscrapeR package brings the “tidyverse” into the fold. Inspired by my recent collaboration with Kyle Walker on his excellent tidycensus package, blscrapeR has been...

Read more »

Learning things we already know about stocks

August 21, 2017

This example groups stocks together in a network that highlights associations within and between the groups using only historical price data. The result is far from ground-breaking; you can...

Read more »

Using regression trees for forecasting double-seasonal time series with trend in R

August 21, 2017

After a blogging break caused by writing research papers, I managed to secure time to write something new about time series forecasting. This time I want to share with you...

Read more »

Simply Mapping

August 21, 2017

Give me fuel, give me fire, reduced deprivation's my desire: first attempts with simple features. The latest edition of the Scottish Index of Multiple Deprivation...

Read more »

Free simmer hexagon stickers!

August 21, 2017

Do you want your own simmer hexagon sticker? Just fill in this form and get one sent to you for free. Check out r-simmer.org or CRAN for more...

Read more »

Highlights of the Data Science Track at Microsoft Ignite

August 21, 2017

I will be at the AI Summit in San Francisco next month, which means I can't make it to Ignite in Orlando this year. Which is a bit of...

Read more »

Bayesian A/B Testing Made Easy

August 21, 2017

A/B Testing is a familiar task for many working in business analytics. Essentially, A/B Testing is a simple form of hypothesis testing with one control group and one treatment...

Read more »

Compare Tube Types with R – Repeated Measures ANOVA

August 21, 2017

Background: Sometimes we might want to compare three or four tube types for a particular analyte on a group of patients or we might want to see if a...

Read more »

Computer Vision Algorithms for R users

August 21, 2017

Just before the summer holidays, BNOSAC presented a talk called Computer Vision and Image Recognition algorithms for R users at the UseR conference. In the talk 6 packages on...

Read more »

Be careful not to control for a post-exposure covariate

August 20, 2017

A researcher was presenting an analysis of the impact various types of childhood trauma might have on subsequent substance abuse in adulthood. Obviously, a very interesting and challenging research...

Read more »

Transfer Learning with augmented Data for Logo Detection

August 20, 2017

Over the last few months, I have worked on brand logo detection in R with Keras. Starting with a model from scratch adding more data and using a...

Read more »

DART: Dropout Regularization in Boosting Ensembles

August 20, 2017

The dropout approach developed by Hinton has been widely employed in deep learning to prevent deep neural networks from overfitting, as shown in https://statcompute.wordpress.com/2017/01/02/dropout-regularization-in-deep-neural-networks. In the paper http://proceedings.mlr.press/v38/korlakaivinayak15.pdf,...

Read more »

Model Operational Losses with Copula Regression

August 20, 2017

In the previous post (https://statcompute.wordpress.com/2017/06/29/model-operational-loss-directly-with-tweedie-glm), it has been explained why we should consider modeling operational losses for non-material UoMs directly with Tweedie models. However, for material UoMs with significant...

Read more »

Wrong on an Astronomical Scale

August 20, 2017

I recently posted an update regarding our R package revisit, aimed at partially remedying the reproducibility crisis, both in the sense of (a) providing transparency to data analyses and...

Read more »

RcppArmadillo 0.7.960.1.1

August 20, 2017

On the heels of the very recent bi-monthly RcppArmadillo release comes a quick bug-fix release 0.7.960.1.1 which just got onto CRAN (and I will ship a build to...

Read more »

Answer probability questions with simulation

August 20, 2017

Probability is at the heart of data science. Simulation is also commonly used in algorithms such as the bootstrap. After completing this exercise, you will have a slightly stronger intuition...

Read more »
