Articles by Andrew Treadway

Faster data exploration with DataExplorer

March 2, 2021 | Andrew Treadway

Data exploration is an important part of the modeling process. It can also take up a fair amount of time. The awesome DataExplorer package in R aims to make this process easier. To get started with DataExplorer, you’ll need to install it like below: Let’s use DataExplorer to ...
[Read more...]

How to solve Sudoku with R

November 25, 2020 | Andrew Treadway

In this post we discuss how to write an R script to solve any Sudoku puzzle. There are some R packages to handle this, but in our case, we’ll write our own solution. For our purposes, we’ll assume the input Sudoku is a 9×9 grid. At the end result, ...
[Read more...]

Why you should use vapply in R

October 19, 2020 | Andrew Treadway

In this post we’ll cover the vapply function in R. vapply is generally less well known than the more popular sapply, lapply, and apply functions. However, it is very useful when you know what data type you’re expecting to apply a function to, as it helps to prevent silent ...
[Read more...]

How to create an API for your R code

August 17, 2020 | Andrew Treadway

In the video linked below we discuss how to convert your R code into an API using the awesome plumber package! Learn more by clicking here or by following the links below. The plumber package allows you to convert R functions into API calls. For example, rather than launching R ...
[Read more...]

How to create PowerPoint reports with R

July 27, 2020 | Andrew Treadway

In my last post, we discussed how to create and read Word files with R’s officer package. This article will expand on officer by showing how we can use it to create PowerPoint reports. Getting started: Let’s get started by loading officer. Next, we’ll create a PowerPoint ...
[Read more...]

How to read and create Word Documents in R

July 21, 2020 | Andrew Treadway

Reading and creating Word documents in R: In this post we’ll talk about how to use R to read and create Word files. We’ll primarily be using R’s officer package. For reading data from Word documents with Python, click here. Creating Word reports with the officer package ...
[Read more...]

How to schedule R scripts

May 11, 2020 | Andrew Treadway

Running R with taskscheduleR and cronR: In a previous post, we talked about how to run R from the Windows Task Scheduler. This article will talk about two additional approaches to schedule R scripts, including using the taskscheduleR package on Windows and the cronR package for Linux. For scheduling Python ...
[Read more...]

Make your Amazon purchases with R!

April 20, 2020 | Andrew Treadway

Background: Anyone who’s bought groceries online recently has seen the huge increase in demand due to the COVID-19 outbreak and quarantines. In this post, you’ll learn how to buy groceries on Amazon using R! To do that, we’ll be using the RSelenium package. In case you’re ...
[Read more...]

What to study if you’re under quarantine

March 23, 2020 | Andrew Treadway

If you’re staying indoors more often recently because of the current COVID-19 outbreak and looking for new things to study, here are a few ideas! Free 365 Data Science Courses: 365 Data Science is making all of their courses free until April 15. They have a variety of courses across R, Python, ...
[Read more...]

How to create decorators in R

March 16, 2020 | Andrew Treadway

Introduction: One of the coolest features of Python is its ability to create decorators. In short, decorators allow us to modify how a function behaves without changing the function’s source code. This can often make code cleaner and easier to modify. For instance, decorators are also really useful ...
[Read more...]

3 recommended books on learning R

February 24, 2020 | Andrew Treadway

I sometimes get asked how I got started learning R. I thought I would use this post to go through a few books I read along the way which have been highly useful. The Art of R Programming The Art of R Programming: A Tour of Statistical Software Design is ...
[Read more...]

How is information gain calculated?

February 17, 2020 | Andrew Treadway

This post will explore the mathematics behind information gain. We’ll start with the basic intuition behind information gain, then explain why it is calculated the way it is. What is information gain? Information gain is a measure frequently used in decision trees to determine which variable to split ...
[Read more...]

Evaluate your R model with MLmetrics

January 28, 2020 | Andrew Treadway

This post will explore using R’s MLmetrics package to evaluate machine learning models. MLmetrics provides several functions to calculate common metrics for ML models, including AUC, precision, recall, accuracy, etc. Building an example model: First, we need to build a model to use as an example. For this post, we’...
[Read more...]

How to import Python classes into R

January 13, 2020 | Andrew Treadway

Background: This post is going to talk about how to import Python classes into R, which can be done using an R package called reticulate. reticulate allows you to call Python code from R, including sourcing Python scripts, using Python packages, and porting functions and classes. To ...
[Read more...]

mapply and Map in R

December 29, 2019 | Andrew Treadway

An older post on this blog talked about several alternative base apply functions. This post will talk about how to apply a function across multiple vectors or lists with Map and mapply in R. These functions are generalizations of sapply and lapply, which allow you to more easily loop over ...
[Read more...]

How to get an AUC confidence interval

August 19, 2019 | Andrew Treadway

Background: AUC is an important metric in machine learning for classification. It is often used as a measure of a model’s performance. In effect, AUC is a value between 0 and 1 that measures how well a model rank-orders its predictions. For a detailed explanation of AUC, see this ...
[Read more...]

Really large numbers in R

August 15, 2019 | Andrew Treadway

This post will discuss ways of handling huge numbers in R using the gmp package. The gmp package: The gmp package provides us a way of dealing with really large numbers in R. For example, let’s suppose we want to multiply 10^250 by itself. Mathematically we know the result should ...
[Read more...]

BeautifulSoup vs. Rvest

July 22, 2019 | Andrew Treadway

This post will compare Python’s BeautifulSoup package to R’s rvest package for web scraping. We’ll also talk about additional functionality in rvest (that doesn’t exist in BeautifulSoup) in comparison to a couple of other Python packages (including pandas and RoboBrowser). Getting started: BeautifulSoup and rvest both ...
[Read more...]

Testing the Collatz Conjecture with R

July 12, 2019 | Andrew Treadway

Background: The Collatz Conjecture is a famous unsolved problem in number theory. If you’re not familiar with it, the conjecture is very simple to understand, yet no one has been able to mathematically prove that it is true (though it’s been shown to be true for an ...
[Read more...]

How to hide a password in R with the keyring package

June 25, 2019 | Andrew Treadway

This post will introduce using the keyring package to hide a password. Short background: The keyring package is a library designed to let you access your operating system’s credential store. In essence, it lets you store and retrieve passwords in your operating system, which allows you to avoid having ...
[Read more...]