Articles by kjytay

What is nearly-isotonic regression?

May 26, 2020 | kjytay

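Let’s say we have data $(x_1, y_1), \dots, (x_n, y_n) \in \mathbb{R}^2$ such that $x_1 < x_2 < \dots < x_n$. (We assume no ties among the $x_i$’s for simplicity.) Isotonic regression gives us a monotonic fit $\beta_1 \leq \dots \leq \beta_n$ for the $y_i$’s by solving the problem $\min_{\beta_1, \dots, \beta_n} \sum_{i=1}^n (y_i - \beta_i)^2$ subject to $\beta_1 \leq \dots \leq \beta_n$. (See this previous post for more details.) Nearly-isotonic regression, … Continue reading →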

What is isotonic regression?

May 24, 2020 | kjytay

Isotonic regression is a method for obtaining a monotonic fit for 1-dimensional data. Let’s say we have data $(x_1, y_1), \dots, (x_n, y_n) \in \mathbb{R}^2$ such that $x_1 < x_2 < \dots < x_n$. (We assume no ties among the $x_i$’s for simplicity.) Informally, isotonic regression looks for $\beta_1 \leq \dots \leq \beta_n$ such that the $\beta_i$’s approximate … Continue reading →
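As a rough illustration of the idea in this excerpt, the sketch below fits an isotonic regression with `isoreg()` from base R's stats package. The simulated data and the variable names (`x`, `y`, `fit`) are assumptions made for this example and are not taken from the post.

```r
# Simulated data with distinct, increasing x values (assumed for illustration).
set.seed(42)
x <- 1:30
y <- log(x) + rnorm(30, sd = 0.3)

# isoreg() returns the least-squares fit that is non-decreasing in x.
fit <- isoreg(x, y)

# The fitted values live in fit$yf; consecutive differences are never negative.
head(fit$yf)
all(diff(fit$yf) >= 0)
```

Calling `plot(fit)` should overlay the step-shaped fit on the raw points, which makes the pooling of adjacent violators easy to see.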

glmnet v4.0: generalizing the family parameter

May 14, 2020 | kjytay

I’ve had the privilege of working with Trevor Hastie on an extension of the glmnet package which has just been released. In essence, the glmnet() function’s family parameter can now be any object of class family. This enables the user … Continue reading →

A deep dive into glmnet: predict.glmnet

March 27, 2020 | kjytay

I’m writing a series of posts on various function options of the glmnet function (from the package of the same name), hoping to give more detail and insight beyond R’s documentation. In this post, instead of looking at one of … Continue reading →

rOpenSci community calls

March 24, 2020 | kjytay

This is a short PSA about an R resource that I recently learnt about (and participated in): rOpenSci community calls. According to the website, these community calls happen quarterly, and are a place where the public can learn about “best … Continue reading →

Extended floating point precision in R with Rmpfr

March 18, 2020 | kjytay

I learnt from a recent post on John Cook’s excellent blog that it’s really easy to do extended floating point computations in R using the Rmpfr package. Rmpfr is R’s wrapper around the C library MPFR, which stands for “Multiple … Continue reading →

A deep dive into glmnet: type.gaussian

March 13, 2020 | kjytay

I’m writing a series of posts on various function options of the glmnet function (from the package of the same name), hoping to give more detail and insight beyond R’s documentation. In this post, we will look at the type.gaussian … Continue reading →

Generating correlation matrix for AR(1) model

February 7, 2020 | kjytay

Assume that we are in the time series data setting, where we have data at equally-spaced times $1, 2, \dots$ which we denote by random variables $X_1, X_2, \dots$. The AR(1) model, commonly used in econometrics, assumes that the correlation between $X_i$ and $X_j$ is $\rho^{|i-j|}$, where $\rho$ is … Continue reading →
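Under an AR(1) correlation structure, the correlation between observations at times i and j is rho raised to the power |i - j|. A minimal sketch of building that matrix in base R follows; the helper name `ar1_cor` is hypothetical and may differ from whatever the post itself uses.

```r
# Hypothetical helper: n-by-n AR(1) correlation matrix with parameter rho.
ar1_cor <- function(n, rho) {
  # outer() builds the matrix of differences i - j; abs() gives the lag |i - j|.
  lags <- abs(outer(seq_len(n), seq_len(n), "-"))
  rho^lags
}

# Entries decay geometrically away from the diagonal.
ar1_cor(4, 0.5)
```

The `outer()` call avoids explicit loops; for large `n`, `toeplitz(rho^(0:(n - 1)))` is an equivalent base R alternative.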

Non-negative least squares

November 27, 2019 | kjytay

Imagine that one has a data matrix $X \in \mathbb{R}^{n \times p}$ consisting of $n$ observations, each with $p$ features, as well as a response vector $y \in \mathbb{R}^n$. We want to build a model for $y$ using the feature columns in $X$. In ordinary least squares (OLS), one seeks … Continue reading →
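Non-negative least squares minimizes the usual residual sum of squares subject to every coefficient being at least zero. The sketch below is one way to try this with base R only: it hands the box-constrained problem to the general-purpose `optim()` solver (method `"L-BFGS-B"`) rather than a dedicated NNLS algorithm such as the one in the nnls package. The simulated data and all variable names are assumptions for illustration.

```r
# Simulated data (assumed): the true coefficients are 2, 0 and 1.
set.seed(1)
n <- 100
p <- 3
X <- matrix(rnorm(n * p), n, p)
y <- as.numeric(X %*% c(2, 0, 1) + rnorm(n, sd = 0.1))

# Residual sum of squares and its gradient with respect to beta.
rss <- function(beta) sum((y - X %*% beta)^2)
rss_grad <- function(beta) as.numeric(-2 * crossprod(X, y - X %*% beta))

# lower = 0 imposes the non-negativity constraint on every coefficient.
fit <- optim(rep(0, p), rss, gr = rss_grad, method = "L-BFGS-B", lower = 0)
beta_nnls <- fit$par
beta_nnls
```

Comparing `beta_nnls` with `coef(lm(y ~ X - 1))` shows where the constraint binds: any OLS coefficient that would be negative is held at zero instead.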

An unofficial vignette for the gamsel package

November 24, 2019 | kjytay

I’ve been working on a project/package that closely mirrors that of GAMSEL (generalized additive model selection), a method for fitting sparse generalized additive models (GAMs). In preparing my package, I realized that (i) the gamsel package which implements GAMSEL doesn’t have … Continue reading →

Use mfcol to have plots drawn by column

October 7, 2019 | kjytay

To plot multiple figures on a single canvas in base R, we can change the graphical parameter mfrow. For instance, the code below tells R that subsequent figures will be drawn in a 2-by-3 array: If we then run this … Continue reading →
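The post's own code is not visible in this excerpt, so the following is only a sketch of the contrast the title refers to: `par(mfrow = ...)` fills the grid of panels row by row, while `par(mfcol = ...)` fills it column by column. The helper `draw_six` and the use of a null PDF device are assumptions made so the example runs without a display.

```r
# Hypothetical helper: six placeholder plots whose titles record drawing order.
draw_six <- function() {
  for (i in 1:6) plot(1, 1, main = paste("Plot", i))
}

pdf(NULL)                        # null device: nothing is written to disk
old_par <- par(mfrow = c(2, 3))  # 2-by-3 grid, filled row by row
draw_six()
par(mfcol = c(2, 3))             # same grid, now filled column by column
draw_six()
par(old_par)                     # restore the previous layout setting
dev.off()
```

With `mfrow`, "Plot 2" lands to the right of "Plot 1"; with `mfcol`, it lands directly below it. Dropping the `pdf(NULL)` and `dev.off()` lines draws to the default device instead.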

Lesser known dplyr functions

August 30, 2019 | kjytay

The dplyr package is an essential tool for manipulating data in R. The “Introduction to dplyr” vignette gives a good overview of the common dplyr functions (list taken from the vignette itself): filter() to select cases based on their values. arrange() to … Continue reading →

Visualizing the relationship between multiple variables

August 24, 2019 | kjytay

Visualizing the relationship between multiple variables can get messy very quickly. This post is about how the ggpairs() function in the GGally package does this task, as well as my own method for visualizing pairwise relationships when all the variables … Continue reading →

Changing the variable inside an R formula

August 23, 2019 | kjytay

I recently encountered a situation where I wanted to run several linear models, but where the response variables would depend on previous steps in the data analysis pipeline. Let me illustrate using the mtcars dataset: Let’s say I wanted to … Continue reading →
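One common way to handle a response variable that is only known at run time is to build the formula from strings with base R's `reformulate()`. The excerpt does not show which approach the post takes, so treat this as a sketch; the choice of responses (`mpg`, `hp`) and predictors (`wt`, `cyl`) is an assumption for illustration.

```r
# Response names that might come from an earlier step in the pipeline (assumed).
responses <- c("mpg", "hp")

# reformulate() assembles a formula such as mpg ~ wt + cyl from character input.
fits <- lapply(responses, function(resp) {
  lm(reformulate(c("wt", "cyl"), response = resp), data = mtcars)
})
names(fits) <- responses

# Each element is an ordinary lm fit for the corresponding response.
coef(fits$mpg)
```

An equivalent alternative is `as.formula(paste(resp, "~ wt + cyl"))`; `reformulate()` simply avoids pasting the formula together by hand.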