February 2020

“Clearing the Confusion” series

February 6, 2020 | matloff

In recent weeks, I’ve posted three tutorials with Clearing the Confusion titles, all in my regtools GitHub repo. The topics are unbalanced classification data, k-fold cross-validation, and scaling in PCA. Comments welcome! [Read more...]

Le Monde puzzle [#1130]

February 6, 2020 | xi'an

A two-player game as the current weekly Le Monde mathematical puzzle: Abishag and Caleb take turns filling a row of N boxes by picking one, then two, then three, etc., consecutive boxes. When a player is unable to find enough consecutive boxes, that player has lost. Who is ...
[Read more...]

Function to download biotic interaction datasets

February 6, 2020 | Frederico Mestre

I work in ecology, biogeography, etc. Biotic interactions (interactions between species) and their repercussions on species distributions are my main research interest. As such, I had, at some point, to download datasets on species interactions. I wanted to be able to produce a uniform (more or less, not as much ... [Read more...]

Introduction to the forecastLM package

February 5, 2020 | Rami Krispin

I am pleased to announce a new R package - forecastLM. The package, as the name implies, provides applications for forecasting regular time series data with a linear regression model (based on the lm function from the stats package). It supports both ts and tsibble objects as inputs and enables ... [Read more...]

The simplest tidy machine learning workflow

February 5, 2020 | R on Jorge Cimentada

caret is a magical package for doing machine learning in R. Look at this code for running a regularized regression:
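library(caret)

# Row indices for a random 75% training sample, stratified on mpg
inTrain <- createDataPartition(y = mtcars$mpg,
                               p = 0.75,
                               list = FALSE)

# Regularized regression (the "glmnet" method needs the glmnet package installed)
reg_mod <- train(
  mpg ~ .,
  data = mtcars[inTrain, ],
  method = "glmnet",
  tuneLength = 10,                    # try 10 values per tuning parameter
  preProc = c("center", "scale"),     # center and scale all predictors
  trControl = trainControl(method = "cv", number = 10)  # 10-fold cross-validation
)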
The two function calls in the expression above perform these operations: create a training set containing a random sample of 75% of the initial sample; center and scale all predictors ... [Read more...]

Shiny: Load testing and horizontal scaling

February 5, 2020 | eoda GmbH

“Money can’t buy you happiness, but it can buy you more EC2 Instances…” With this quote, Sean Lopp, Product Manager at RStudio, PBC, opened his “Scaling Shiny” showcase. In this showcase, he uses a load-testing approach to show how a Shiny application can be scaled for 10,000 users. RStudio’...
[Read more...]

#TidyTuesday and tidymodels

February 4, 2020 | Rstats on Julia Silge

This week I started my new job as a software engineer at RStudio, working with Max Kuhn and other folks on tidymodels. I am really excited about tidymodels because my own experience as a practicing data scientist has shown me some of the areas for growth that still exist in ...
[Read more...]

Some 2020 R Conferences

February 4, 2020 | R Views

rstudio::conf kicked off the 2020 season for R conferences last week with record attendance somewhere north of twenty-one hundred. Session topics ranged from business to science, marketing to medicine and attracted R users with very varied backgrounds including DevOps professionals, data scientists, journalists, physicians, statisticians, R package developers, Shiny developers ...
[Read more...]

Consensus clustering in R

February 4, 2020 | chris2016

The logic behind the Monti consensus clustering algorithm is that in the face of resampling the ideal clusters should be stable, thus any pair of samples should either always or never cluster together. We can use this principle to infer the optimal number of clusters (K). This works by examining ...
[Read more...]

RStudio::conf 2020 San Francisco Recap

February 4, 2020 | Jordan Gray

RStudio::conf 2020 is a wrap! What a tremendous experience. It was quite a production to send four Appsilon team members to San Francisco, California from Warsaw with nearly 100 kg of swag, but it was absolutely worthwhile. We were a proud sponsor of the event, and having a booth set up ...
[Read more...]