Function basis and regression

March 1, 2020

In the first part of the course on linear models, we’ve seen how to construct a linear model when the vector of covariates x is given, so that E[Y | X = x] is either simply xᵀβ (for standard linear models) or a function of xᵀβ (in GLMs). But more generally, we can consider transformations of the covariates, so that a linear model can still be used. In...
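As a rough illustration of the idea in this excerpt, the sketch below fits a linear model on a transformed covariate. The data are simulated and the degree-5 polynomial basis is an arbitrary choice for illustration; the post itself may use other bases, such as splines.

```r
# Simulated data (an assumption for illustration, not taken from the post)
set.seed(1)
n <- 200
x <- runif(n, 0, 10)
y <- sin(x) + rnorm(n, sd = 0.3)

# Linear model on the raw covariate
fit_linear <- lm(y ~ x)

# Same linear machinery, applied to a polynomial basis of x
fit_poly <- lm(y ~ poly(x, degree = 5))

# Both models are linear in their coefficients; only the basis differs
rss_linear <- sum(residuals(fit_linear)^2)
rss_poly <- sum(residuals(fit_poly)^2)
```

Because the straight-line model is nested in the polynomial one, the residual sum of squares of `fit_poly` cannot exceed that of `fit_linear`.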


The probabilities implied by bookmaker odds: Introducing the ‘implied’ package

March 1, 2020

My package for converting bookmaker odds into probabilities is now available from CRAN. The package contains several different conversion algorithms, which are all accessible via the implied_probabilities() function. I have written an introduction on how you can use the …
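To make the conversion concrete, here is a base R sketch of the simplest approach, proportional normalization of inverse odds. It does not call the ‘implied’ package, the odds are hypothetical, and the package's own algorithms may differ from this one.

```r
# Hypothetical decimal odds for a three-way market
odds <- c(home = 2.10, draw = 3.40, away = 3.60)

# Inverse odds sum to more than 1 because of the bookmaker margin
raw <- 1 / odds
overround <- sum(raw) - 1

# Proportional normalization: rescale so the probabilities sum to 1
probs <- raw / sum(raw)
```

Other conversion methods distribute the margin unevenly across outcomes, which is why a package offering several algorithms is useful.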


R tips and tricks – Paste a plot from R to a word file

March 1, 2020

In this post you will learn how to properly paste an R plot/chart/image into a Word file. There are a few typical problems that occur when people try to do that. Below you can find a simple, clean and repeatable solution. When you google how to paste a plot from R to a Word file you...


Using R: 10 years with R

March 1, 2020

Yesterday, 29 February 2020, was the 20th anniversary of the release of R 1.0.0. Jozef Hajnala’s blog has a cute anniversary post with some trivia. I realised that it is also (not to the day, but to the year) my R anniversary.


SR2 Chapter 2 Hard

February 29, 2020

Posted on 1 March, 2020 by Brian. Tags: statistical rethinking, solutions, conditional probability, counting, bayes rule, pandas. Category: statistical-rethinking-2. Here’s my solution to the hard exercises in chapter...


Predicting the misclassification cost incurred in air pressure system failure in heavy vehicles

February 29, 2020

Abstract: The Air Pressure System (APS) is a type of function used in heavy vehicles to assist braking and gear changing. The APS failure dataset consists of the daily operational sensor data from failed Scania trucks. The dataset is crucial to the man...


Source code chapter of ‘evidence-based software engineering’ reworked

February 29, 2020

The Source code chapter of my evidence-based software engineering book has been reworked (draft pdf). When writing the first version of this chapter, I was not certain whether source code was a topic warranting a chapter to itself, in an evidence-based software engineering book. Now I am certain. Source code is the primary product delivery,


Log transform or log link? And confounding variables. by @ellis2013nz

February 29, 2020

Last week I wrote about the relationship between weight and height in US adults, as seen in the US Centers for Disease Control and Prevention (CDC) Behavioral Risk Factor Surveillance System, an annual telephone survey of around 400,000 interviews. In particular, I tested the widely-circulated claim that Body Mass Index (BMI) exaggerates the “fatness” of tall people...
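The distinction in the post title can be sketched in a few lines of R. The data below are simulated (the height and weight values and the coefficients are assumptions, not CDC data): a log transform models the mean of log(weight), whereas a log link models the log of the mean of weight.

```r
# Simulated heights (m) and weights (kg); purely illustrative
set.seed(42)
n <- 500
height <- runif(n, 1.5, 2.0)
weight <- exp(2.5 + 1.0 * height + rnorm(n, sd = 0.15))

# Log transform: ordinary least squares on log(weight)
fit_transform <- lm(log(weight) ~ height)

# Log link: Gaussian GLM on the original scale, with log(E[weight]) linear in height
fit_link <- glm(weight ~ height, family = gaussian(link = "log"))

slope_transform <- coef(fit_transform)["height"]
slope_link <- coef(fit_link)["height"]
```

With multiplicative noise like this, both slopes estimate the same quantity, but the two models weight observations differently and make different assumptions about the error structure.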


matricks 0.8.2 available on CRAN

February 28, 2020

Version 0.8.2 of the matricks package has been released on CRAN! In this post I will present the advantages of using matricks and show how you can use it. Creating matrices: The main function the package started with is m. It’s a smart shortcut fo...


Drawdowns by the data

February 28, 2020

We’re taking a break from our series on portfolio construction for two reasons: life and the recent market sell-off. Life got in the way of focusing on the next couple of posts on rebalancing. And given the market sell-off we were too busy gamma hedging our convexity exposure, looking for cheap tail risk plays, and trying to figure out...


SR2 Chapter 2 Medium

February 28, 2020

Posted on 29 February, 2020 by Brian. Tags: statistical rethinking, solutions, conditional probability, counting, grid approximation. Category: statistical-rethinking-2. Here are my solutions to the medium exercises in chapter 2 of...


What to know before you adopt Hugo/blogdown

Fancy (re-)creating your website using Hugo, with or without blogdown? Feeling a bit anxious? This post is aimed at being the Hugo equivalent of “What to know before you adopt a pet”. We shall go through things that can/will break in the future, and what you can do to prevent future pain. I’m writing this post with R users in mind, which means...


The significance of the sector on the salary in Sweden, a comparison between different occupational groups, part 3

February 28, 2020

To complete the analysis of the significance of the sector on the salary for different occupational groups in Sweden, I will in this post examine the correlation between salary and sector using statistics for education. The F-value from the Anova table is used as the single value to discriminate how much the region and salary correlate. For exploratory analysis, the...


How to Acquire Large Satellite Image Datasets for Machine Learning Projects

February 28, 2020

Introduction: Historically, only governments and large corporations have had access to quality satellite images. In recent years, satellite image datasets have become available to anyone with a computer and an internet connection. The quality, quantity, and precision of these datasets are continuously improving, and there are many free and commercial platforms at your disposal to...


All you need to know on PCA …

February 28, 2020

All you need to do with PCA is in Factoshiny! PCA – Principal Component Analysis – is a well-known method for exploring and visualizing data. The function Factoshiny of the package Factoshiny allows you to perform PCA in a really easy way. You can include extra information such as categorical variables, manage missing data,


Machine Learning with R: A Hands-on Introduction from Robert Muenchen at Machine Learning Week, Las Vegas

February 28, 2020

Join Robert Muenchen’s workshop about Machine Learning with R at Machine Learning Week on May 31 – June 4, 2020 in Las Vegas! Workshop Description: The workshop will take place on May 31, 2020. R offers a wide variety of machine learning (ML) functions, each of which works in a slightly different way. This one-day, …


XGBoostLSS – An extension of XGBoost to probabilistic forecasting

February 28, 2020

Introduction: To reason rigorously under uncertainty we need to invoke the language of probability (Zhang et al. 2020). Any model that falls short of providing quantification of the uncertainty attached to its outcome is likely to yield an incomplete and potentially misleading picture. While this is an irrevocable consensus in statistics, a common misconception, albeit a …


Convolutional Neural Network under the Hood

February 27, 2020

Neural networks have really taken over for solving image recognition and high sample rate data problems in the last couple of years. In all honesty, I promise I won’t be teaching you what neural networks or CNNs are. There are hundreds of resources published every day explaining them. I’ll post a few links below....


Building A base dplyr With Primitives: Grouped Operations, Pipes and More!

February 27, 2020

Introduction: In my last post we looked at how we can recreate base equivalents of the dplyr functions select(), filter(), mutate() and arrange(), amongst others. I wrote these functions and presented them in a new package called poorman. In this post I will be discussing new functionality that I have since added to poorman including grouped operations, renaming columns, summarising...


Decision Boundary for a Series of Machine Learning Models


Machine Learning at the Boundary: There is nothing new in the fact that machine learning models can outperform traditional econometric models, but I want to show as part of my research why and how some models make given predictions or, in this instance, classifications. I wanted to show the decision boundary that my binary classification model was making. That is,...


Version 0.4.0 of nnetsauce, with fruits and breast cancer classification

February 27, 2020


Student’s t-test in R and by hand: how to compare two groups under different scenarios

February 27, 2020

Introduction
Null and alternative hypothesis
Hypothesis testing
Different versions of the Student’s t-test
How to compute Student’s t-test by hand?
Scenario 1: Independent samples with 2 known variances
Scenario 2: Independent samples with 2 equal but unknown variances
Scenario 3: Independent samples with 2 unequal and unknown variances
Scenario 4: Paired samples where the variance of the differences is known
Scenario 5: Paired samples where the variance of...
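As a hedged sketch of how some of these scenarios look in practice, the R code below runs the pooled-variance, Welch and paired versions with the base t.test() function and reproduces the pooled statistic by hand. The two groups are simulated; none of the numbers come from the post.

```r
# Simulated samples (illustrative only)
set.seed(123)
group_a <- rnorm(30, mean = 10, sd = 2)
group_b <- rnorm(30, mean = 11, sd = 2)

# Independent samples, equal but unknown variances (pooled t-test)
pooled <- t.test(group_a, group_b, var.equal = TRUE)

# Independent samples, unequal and unknown variances (Welch t-test)
welch <- t.test(group_a, group_b, var.equal = FALSE)

# Paired samples, variance of the differences unknown
paired <- t.test(group_a, group_b, paired = TRUE)

# Pooled statistic by hand
n_a <- length(group_a)
n_b <- length(group_b)
sp2 <- ((n_a - 1) * var(group_a) + (n_b - 1) * var(group_b)) / (n_a + n_b - 2)
t_manual <- (mean(group_a) - mean(group_b)) / sqrt(sp2 * (1 / n_a + 1 / n_b))
```

The scenarios with known variances are z-tests rather than t-tests, and base R's t.test() does not cover them, so they are left out of this sketch.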


Data Science in Manufacturing: An Overview

February 27, 2020

Original article published on opendatascience.com. In the last couple of years, data science has seen an immense influx in various industrial applications across the board. Today, we can see data science applied in health care, customer service, governments, cyber security, mechanical, aerospace, and other industrial applications. Among these, manufacturing has gained more prominence to achieve...


Developing a complex R Shiny app – the good, the bad and the ugly

February 27, 2020

Together with Clara Bicalho (UC Berkeley) and Sisi Huang (WZB), I recently developed a web application that acts as a …


MLOPS for R with Azure Machine Learning

February 26, 2020

The video recording of my RStudio::conf talk, MLOPS for R with Azure Machine Learning, is now available for streaming thanks to the fine folks at RStudio. The talk begins with a general discussion of MLOps (Machine Learning Operations) and how it differs from DevOps as applied to traditional (non-ML-based) applications. This is a theme I plan to develop further...


RStudio Package Manager 1.1.2 – Windows

February 26, 2020

RStudio Package Manager 1.1.2 introduces beta support for Windows package binaries. These binaries make it easier and faster to install R packages on Windows Desktop. With this release, all the benefits of Package Manager are available to ...


if … else and ifelse

February 26, 2020

Let’s make this a quick and quite basic one. There is this incredibly useful function in R called ifelse(). It’s basically a vectorized version of an if … else control structure every programming language has in one way or the other. ifelse() has, in my view, two major advantages over if … else: It’s super fast. It’s more convenient to use. The...
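A small sketch of the contrast described in this excerpt, using a made-up input vector: the same labelling is done once with a scalar if … else inside a loop and once with the vectorized ifelse().

```r
# Made-up input vector for illustration
x <- c(-2, 0, 3)

# Scalar if ... else handles one element at a time, so it needs a loop
sign_loop <- character(length(x))
for (i in seq_along(x)) {
  if (x[i] < 0) {
    sign_loop[i] <- "negative"
  } else {
    sign_loop[i] <- "non-negative"
  }
}

# ifelse() applies the test to the whole vector in one call
sign_vec <- ifelse(x < 0, "negative", "non-negative")
```

Both approaches give the same labels here; the vectorized call is simply shorter to write.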


chain of lynx and drove of hares

February 26, 2020

A paper (and an introduction to the paper) in Nature this week seems to have made progress on the existence of indefinite predator-prey cycles. As in the lynx/hare dataset available in R. The paper is focusing on another pair, an invertebrate and its prey, an alga. For which the authors managed a 50 cycle sequence.


A New Baby Boom Poster

February 26, 2020

I wanted to work through a few examples of more polished graphics done mostly but perhaps not entirely in R. So, I revisited the Baby Boom visualizations I made a while ago and made a new poster with them. This allowed me to play around with a few packages that I either hadn’t made use of or that weren’t...
