Articles by R on Jorge Cimentada

Maximum Likelihood Distilled

November 25, 2020 | R on Jorge Cimentada

We all hear about Maximum Likelihood Estimation (MLE) and we often see hints of it in our model output. As usual, doing things manually gives a better grasp of how our models work. Here’s a very short example impl... [Read more...]
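As a rough illustration of what doing MLE by hand can look like (a minimal sketch, not the code from the post itself), the snippet below estimates the mean and standard deviation of a normal sample by numerically minimizing the negative log-likelihood with optim(). The simulated data, the function name neg_loglik, and the starting values are all assumptions made for this example.

```r
set.seed(2020)

# Simulated data (assumption for illustration): 500 draws from N(5, 2^2)
y <- rnorm(500, mean = 5, sd = 2)

# Negative log-likelihood of a normal model.
# par[1] is the mean; par[2] is log(sd), so sd stays positive during the search.
neg_loglik <- function(par, y) {
  mu <- par[1]
  sigma <- exp(par[2])
  -sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))
}

# Minimize the negative log-likelihood numerically
fit <- optim(par = c(0, 0), fn = neg_loglik, y = y, method = "BFGS")

mle <- c(mean = fit$par[1], sd = exp(fit$par[2]))

# Closed-form MLEs for comparison (note the n, not n - 1, denominator)
closed_form <- c(mean = mean(y), sd = sqrt(mean((y - mean(y))^2)))

print(round(mle, 3))
print(round(closed_form, 3))
```

The two printed vectors should agree closely, which is the point of the exercise: the numbers a model reports are simply the parameter values that make the observed data most likely.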

The simplest tidy machine learning workflow

February 5, 2020 | R on Jorge Cimentada

caret is a magical package for doing machine learning in R. Look at this code for running a regularized regression:
library(caret)

inTrain <- createDataPartition(y = mtcars$mpg,
                               p = 0.75,
                               list = FALSE)  

reg_mod <- train(
  mpg ~ .,
  data = mtcars[inTrain, ],
  method = "glmnet",
  tuneLength = 10,
  preProc = c("center", "scale"),
  trControl = trainControl(method = "cv", number = 10)
)
The two function calls in the code above perform these operations:
- Create a training set containing a random sample of 75% of the initial sample
- Center and scale all predictors
... [Read more...]
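To make the first two operations listed above concrete, here is a minimal base R sketch of what they amount to when done by hand. It is an approximation, not caret's internals: it uses a plain sample() for the split, whereas createDataPartition() also tries to balance the split on the outcome, and the object names (train_idx, train_x, and so on) are assumptions made for this example.

```r
set.seed(123)

# 1. Random sample of 75% of the rows for training
n <- nrow(mtcars)
train_idx <- sample(n, size = floor(0.75 * n))
train_set <- mtcars[train_idx, ]
test_set  <- mtcars[-train_idx, ]

# 2. Center and scale all predictors, using training-set statistics only
predictors <- setdiff(names(mtcars), "mpg")
centers <- colMeans(train_set[predictors])
scales  <- apply(train_set[predictors], 2, sd)

train_x <- scale(train_set[predictors], center = centers, scale = scales)
test_x  <- scale(test_set[predictors],  center = centers, scale = scales)

# Training predictors now have mean 0 and standard deviation 1
print(round(colMeans(train_x), 10))
print(apply(train_x, 2, sd))
```

Reusing the training-set means and standard deviations on the test set keeps information from the held-out rows out of the preprocessing step.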

essurvey release

November 15, 2019 | R on Jorge Cimentada

The new essurvey 1.0.3 is here! This release is mainly about downloading weight data from the European Social Survey (ESS), which has been in the works since 2017! As usual, you can install from CRAN or GitHub with:
# From CRAN
install.packages("essurvey")

# or development version from GitHub
devtools::install_github("ropensci/essurvey")

# and load
library(essurvey)
set_email("[email protected]")
Remember to set your registered email with set_email to download ESS data. ... [Read more...]

Saving missing categories from R to Stata

March 15, 2019 | R on Jorge Cimentada

I’m finishing a project from the RECSM institute where we developed a Shiny application to download data from the European Social Survey with Spanish-translated labels. This was one hell of a project since I had to build some wrappers around the Google Translate API to generate translations for ... [Read more...]

Why does R drop attributes when subsetting?

March 15, 2019 | R on Jorge Cimentada

I lost about an hour yesterday because R did something completely unpredictable (for my taste): it dropped an attribute without a warning.
df <- data.frame(x = rep(c(1, 2), 20))

attr(df$x, "label") <- "This is clearly a label"

df$x
##  [1] 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 1
## [36] 2 1 2 1 2
## attr(,"label")
## [1] "This is clearly a label"
The label is clearly there. To my surprise, if I subset this data frame, R drops the attribute.
new_df <- df[df$x == 2, , drop = FALSE]

new_df$x
##  [1] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
It doesn’t matter if ... [Read more...]
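One possible workaround, sketched here in base R under my own assumptions rather than taken from the post, is to copy the attribute back onto each column after subsetting. The helper name subset_keep_labels is hypothetical, and it only restores the "label" attribute.

```r
df <- data.frame(x = rep(c(1, 2), 20))
attr(df$x, "label") <- "This is clearly a label"

# Hypothetical helper: subset rows, then copy each column's "label"
# attribute from the original data frame back onto the result.
subset_keep_labels <- function(data, rows) {
  out <- data[rows, , drop = FALSE]
  for (col in names(out)) {
    lbl <- attr(data[[col]], "label", exact = TRUE)
    if (!is.null(lbl)) attr(out[[col]], "label") <- lbl
  }
  out
}

new_df <- subset_keep_labels(df, df$x == 2)
attributes(new_df$x)
```

Copying a single named attribute is deliberate: attributes such as names or dim depend on the length of the vector, so blindly restoring everything after a subset could leave the column in an inconsistent state.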
