Articles by kjytay

A deep dive into glmnet: predict.glmnet

March 27, 2020 | kjytay

I’m writing a series of posts on various function options of the glmnet function (from the package of the same name), hoping to give more detail and insight beyond R’s documentation. In this post, instead of looking at one of …

rOpenSci community calls

March 24, 2020 | kjytay

This is a short PSA about an R resource that I recently learnt about (and participated in): rOpenSci community calls. According to the website, these community calls happen quarterly and are a place where the public can learn about “best …

Extended floating point precision in R with Rmpfr

March 18, 2020 | kjytay

I learnt from a recent post on John Cook’s excellent blog that it’s really easy to do extended floating point computations in R using the Rmpfr package. Rmpfr is R’s wrapper around the C library MPFR, which stands for “Multiple …

A deep dive into glmnet: type.gaussian

March 13, 2020 | kjytay

I’m writing a series of posts on various function options of the glmnet function (from the package of the same name), hoping to give more detail and insight beyond R’s documentation. In this post, we will look at the type.gaussian …

Generating correlation matrix for AR(1) model

February 7, 2020 | kjytay

Assume that we are in the time series data setting, where we have data at equally-spaced times which we denote by random variables $X_1, X_2, \dots$. The AR(1) model, commonly used in econometrics, assumes that the correlation between $X_i$ and $X_j$ is $\rho^{|i-j|}$, where $\rho$ is …
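As a minimal sketch of the correlation structure described in this teaser (not necessarily the code used in the post itself), the matrix with entries $\rho^{|i-j|}$ can be built in base R with `outer()`. The function name `ar1_cor` and the values `n = 4`, `rho = 0.5` are assumptions chosen purely for illustration.

```r
# Hypothetical helper: n-by-n AR(1) correlation matrix with entries rho^|i - j|.
ar1_cor <- function(n, rho) {
  stopifnot(n >= 1, abs(rho) < 1)
  idx <- seq_len(n)
  # outer(idx, idx, "-") holds i - j for every pair of time indices.
  rho^abs(outer(idx, idx, "-"))
}

# Illustrative values only.
ar1_mat <- ar1_cor(4, 0.5)
```

The diagonal is 1 and the entries decay geometrically with the lag between the two time points.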

Non-negative least squares

November 27, 2019 | kjytay

Imagine that one has a data matrix $X$ consisting of $n$ observations, each with $p$ features, as well as a response vector $y$. We want to build a model for $y$ using the feature columns in $X$. In ordinary least squares (OLS), one seeks …

An unofficial vignette for the gamsel package

November 24, 2019 | kjytay

I’ve been working on a project/package that closely mirrors that of GAMSEL (generalized additive model selection), a method for fitting sparse generalized additive models (GAMs). In preparing my package, I realized that (i) the gamsel package which implements GAMSEL doesn’t have …

Use mfcol to have plots drawn by column

October 7, 2019 | kjytay

To plot multiple figures on a single canvas in base R, we can change the graphical parameter mfrow. For instance, the call par(mfrow = c(2, 3)) tells R that subsequent figures will be drawn in a 2-by-3 array. If we then run this …
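A small sketch of the contrast named in the title, assuming a 2-by-3 layout and placeholder plots: `mfrow` fills the grid row by row, while `mfcol` fills it column by column. The null PDF device is used only so the snippet runs without a display; the variable names are illustrative.

```r
# Open a throwaway PDF device (no file is written) so this runs headlessly.
pdf(NULL)

# mfrow: the six panels are filled row by row.
old_par <- par(mfrow = c(2, 3))
for (i in 1:6) plot(1, main = paste("mfrow panel", i))

# mfcol: same 2-by-3 grid, but panels are filled column by column.
par(mfcol = c(2, 3))
for (i in 1:6) plot(1, main = paste("mfcol panel", i))

# Record the layout, then restore the previous settings and close the device.
layout_dims <- par("mfcol")
par(old_par)
invisible(dev.off())
```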

Lesser known dplyr functions

August 30, 2019 | kjytay

The dplyr package is an essential tool for manipulating data in R. The “Introduction to dplyr” vignette gives a good overview of the common dplyr functions (list taken from the vignette itself): filter() to select cases based on their values. arrange() to …

Visualizing the relationship between multiple variables

August 24, 2019 | kjytay

Visualizing the relationship between multiple variables can get messy very quickly. This post is about how the ggpairs() function in the GGally package does this task, as well as my own method for visualizing pairwise relationships when all the variables …

Changing the variable inside an R formula

August 23, 2019 | kjytay

I recently encountered a situation where I wanted to run several linear models, but where the response variables would depend on previous steps in the data analysis pipeline. Let me illustrate using the mtcars dataset: Let’s say I wanted to …
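One way to handle a response variable that is only known at run time is to build the formula programmatically. The sketch below uses base R's `reformulate()` on mtcars; the choice of `mpg` as the response and `wt`, `hp` as predictors is an assumption for illustration, and this is not necessarily the approach taken in the post.

```r
# The response name would normally come from an earlier pipeline step;
# it is hard-coded here purely for illustration.
response <- "mpg"
predictors <- c("wt", "hp")

# reformulate() assembles the formula mpg ~ wt + hp from character vectors.
fmla <- reformulate(predictors, response = response)

# Fit the linear model using the constructed formula.
fit <- lm(fmla, data = mtcars)
```

Changing `response` to another column name of mtcars refits the model with a different left-hand side, with no formula typed by hand.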

Looking at flood insurance claims with choroplethr

July 14, 2019 | kjytay

I recently learned how to use the choroplethr package through a short tutorial by the package author Ari Lamstein (youtube link here). To cement what I learned, I thought I would use this package to visualize flood insurance claims. I …

Sampling paths from a Gaussian process

July 7, 2019 | kjytay

Gaussian processes are a widely employed statistical tool because of their flexibility and computational tractability. (For instance, one recent area where Gaussian processes are used is in machine learning for hyperparameter optimization.) A stochastic process is a Gaussian process if …

Probability of winning a best-of-7 series

April 22, 2019 | kjytay

The NBA playoffs are in full swing! A total of 16 teams are competing in a playoff-format competition, with the winner of each best-of-7 series moving on to the next round. In each matchup, two teams play 7 basketball games …
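As a rough sketch of the quantity in the title, assume (this is a simplification, not something stated in the teaser) that games are independent and that a team wins each game with the same probability `p`. Winning a best-of-7 series is then equivalent to winning at least 4 of 7 games, which is a binomial tail probability. The function name `series_win_prob` is hypothetical.

```r
# Hypothetical helper: probability of winning a best-of-7 series, assuming
# independent games with a constant per-game win probability p.
series_win_prob <- function(p) {
  stopifnot(p >= 0, p <= 1)
  # P(at least 4 wins in 7 games) = 1 - P(at most 3 wins in 7 games).
  1 - pbinom(3, size = 7, prob = p)
}

# Evenly matched teams, and a team with a 60% per-game edge.
prob_even <- series_win_prob(0.5)
prob_edge <- series_win_prob(0.6)
```

Treating the series as if all 7 games were played gives the same answer as stopping at the fourth win, since games played after the series is decided cannot change the winner.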
