July 2021

Predict which #TidyTuesday Scooby Doo monsters are REAL with a tuned decision tree model

July 12, 2021 | rstats | Julia Silge

This is the latest in my series of screencasts demonstrating how to use the tidymodels packages, from just getting started to tuning more complex models. Today’s screencast walks through how to train and evaluate a random forest model, with this week’...

visualizing topic models with crosstalk

July 12, 2021 | Jason Timm

Introduction A simple post detailing the use of the crosstalk package to visualize and investigate topic model results interactively. As an example, we investigate the topic structure of correspondences from the Founders Online cor...

How to become a better R code detective?

July 12, 2021 | Maëlle's R blog on Maëlle Salmon's personal website

Huge thanks to Hannah Frick for her useful feedback on this post! Vielen Dank! When trying to fix a bug or add a feature to an R package, how do you go from viewing the code as a big messy ball of wool, to a logical diagram that you can ...

New package katex: rendering math to HTML and MathML in R

July 12, 2021 | rOpenSci - open tools for open science

A new rOpenSci package katex is now on CRAN. This package allows for converting latex math expressions to HTML and MathML for use in markdown documents or package documentation. The R package uses the katex javascript library, but the rendering is don...

Asymptotic confidence intervals for NLS regression in R

July 12, 2021 | R-bloggers | A Random Walk

Introduction Nonlinear regression model As a model setup, we consider noisy observations \(y_1,\ldots, y_n \in \mathbb{R}\) obtained from a standard nonlinear regression model of the form: \[ \begin{aligned} y_i &\ = \ f(\boldsymbol{x}_i, \boldsymbol{\theta}) + \epsilon_i, \quad i = 1,\ldots, n \end{aligned} \] where \(f: \mathbb{...
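As a rough illustration of the kind of interval the title refers to, the following base R sketch fits a nonlinear model with `nls()` and builds Wald-type asymptotic confidence intervals from the estimated covariance matrix of the parameters. The exponential-decay mean function, the simulated data, and the helper name `wald_ci` are assumptions made for this example and are not taken from the post.

```r
# Simulate data from y_i = f(x_i, theta) + eps_i, using an assumed
# exponential-decay mean function f(x, theta) = a * exp(-b * x).
set.seed(1)
x <- seq(0, 10, length.out = 50)
y <- 5 * exp(-0.4 * x) + rnorm(length(x), sd = 0.2)

# Least-squares fit of the nonlinear model.
fit <- nls(y ~ a * exp(-b * x), start = list(a = 4, b = 0.5))

# Hypothetical helper: Wald-type asymptotic intervals,
# theta_hat +/- t_{n - p, 1 - alpha/2} * se(theta_hat).
wald_ci <- function(fit, level = 0.95) {
  est  <- coef(fit)
  se   <- sqrt(diag(vcov(fit)))
  crit <- qt(1 - (1 - level) / 2, df = df.residual(fit))
  cbind(estimate = est, lower = est - crit * se, upper = est + crit * se)
}

ci <- wald_ci(fit)
print(ci)
```

For comparison, `confint.default(fit)` builds the same kind of interval with normal rather than t quantiles.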

Foghorn package – find out pending CRAN packages in the pipeline

July 12, 2021 | Gary Hutson

I recently pushed my fourth package to CRAN and will cover it in a separate post. The FeatureTerminatoR package has been built to perform automated feature selection, utilising methods such as recursive partitioning and multicollinearity purging; other types will be built into the second version. Installing the ...

Little useless-useful R functions – Is it raining yet?

July 12, 2021 | tomaztsql

Summer. Sunny weather. Vitamin D. And if you are missing vitamin Rain because you are growing a garden or simply want some inner cooling, hit this useless function to see if there is any rain to be seen. Besides…

R is for Research, Python is for Production

July 11, 2021 | Business Science

Updated July 2021 Both R and Python are great. In this article we’ll showcase some of the strengths of each language by showing where the major development efforts are within each ecosystem. ...

ggplot: plot only some of the data

July 11, 2021 | R on I Should Be Writing

Often (especially when working with large and/or rich datasets) our (gg)plots can feel cluttered with information. But they don’t have to be! Let’s look at the following plot:

Generate some data

library(dplyr)

bfi <- psychTools::bfi %>% 
  mutate(
    O = across(starts_with("O")) %>% rowMeans(na.rm = TRUE),
    C = across(starts_with("C")) %>% rowMeans(na.rm = TRUE),
    E = across(starts_with("E")) %>% rowMeans(na.rm = TRUE),
    A = across(starts_with("A")) %>% rowMeans(na.rm = TRUE),
    N = across(starts_with("N")) %>% rowMeans(na.rm = TRUE)
  ) %>% 
  mutate(
    gender = factor(gender, labels = c("Man", "Woman")),
    education = factor(education, labels = c("HS", "finished HS", "some college", "college graduate", "graduate degree"))
  ) %>% 
  select(gender, education, age, O:N) %>% 
  tidyr::drop_na(education) %>% 
  # multiply the data set
  sample_n(size = 10000, replace = TRUE) %>% 
  # and add some noise
  mutate(across(O:N, \(x) x + rnorm(x, 0, sd(x))))

library(ggplot2)

theme_set(theme_bw())

base_plot <- ggplot(bfi, aes(age, O, color = education)) + 
  facet_wrap(facets = vars(gender)) + 
  coord_cartesian(ylim = c(1, 6)) + 
  scale_color_viridis_d()

base_plot + 
  geom_point(shape = 16, alpha = 0.1) + 
  geom_smooth(se = FALSE)

This is a busy plot. It’s hard to see what each ...
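One way to act on the post's title is to give a single layer only part of the data. ggplot2 layers accept a function as their `data` argument, and that function is applied to the plot's data when the plot is built. The sketch below assumes the built-in `diamonds` data in place of the post's `bfi` data; the helper `sample_rows` and the sample size of 500 are illustrative choices, not the author's.

```r
library(ggplot2)

set.seed(42)

# Hypothetical helper: returns a function that keeps a random subset of rows.
# ggplot2 calls it on the plot data for the one layer it is passed to.
sample_rows <- function(n) {
  function(d) d[sample(nrow(d), min(n, nrow(d))), , drop = FALSE]
}

p <- ggplot(diamonds, aes(carat, price, color = cut)) +
  # Points are drawn from 500 sampled rows only ...
  geom_point(data = sample_rows(500), shape = 16, alpha = 0.3) +
  # ... while the smooth is still fitted to the full data set.
  geom_smooth(se = FALSE)
```

Printing `p` renders the plot. Because the subsetting lives inside the layer, the trend lines keep using every row while the point cloud is thinned.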

Shiny, Tableau, and PowerBI: Better Business Intelligence

July 11, 2021 | RStudio | Open source & professional software for data science teams on RStudio

This is a guest post from Marcin Dubel, a 2021 Shiny Contest Grand Prize winner and Software Engineer at Appsilon, a Full Service RStudio Partner. Finding The Right Tool For The Job With strong competition in the Business Intelligence market, choosing ...

UseR2021: Integrating R into Production

July 11, 2021 | Roel M. Hogervorst

This year’s useR was completely online, and I watched many of the talks. I believe the videos will be public in the future but there were some talks that I wanted to highlight. I think that the biggest problem with machine learning- (or even data...

simplevis: making leaflet sf maps

July 11, 2021 | David Hodge

Introduction In addition to ggplot2 wrapper functions, simplevis also provides leaflet wrapper functions as a bonus. These functions are designed to follow the logic of the ggplot2 wrapper functions.

library(simplevis)
library(dplyr)
library(palmerpenguins)

sf objects The sf package makes it easy to work with vector data (e.g. points, ...

How to Create a Covariance Matrix in R

July 11, 2021 | finnstats

Covariance is a measure of the degree to which two variables are linearly associated. We can measure how changes in...
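A minimal base R sketch of the idea, using `cov()` on a small data frame. The column names and values are made up for illustration and do not come from the post.

```r
# Toy data; the variables and numbers are invented for this example.
scores <- data.frame(
  math    = c(84, 82, 81, 89, 73),
  science = c(85, 82, 72, 77, 75),
  history = c(97, 94, 93, 95, 88)
)

# cov() returns the sample covariance matrix (denominator n - 1):
# variances on the diagonal, pairwise covariances off the diagonal.
cov_mat <- cov(scores)
print(cov_mat)

# Rescaling by the standard deviations gives the correlation matrix.
cor_mat <- cov2cor(cov_mat)
```

A positive off-diagonal entry means the two variables tend to move together, and a negative one means they tend to move in opposite directions.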

GooglyPlusPlus2021 adds new bells and whistles!!

July 11, 2021 | Tinniam V Ganesh

This latest update of GooglyPlusPlus2021 includes new controls which allow for granular analysis of teams and matches. This version includes a new ‘Date Range’ widget which will allow you to choose a specific interval between which you would like to analyze data. The Date Range widget has been added to 2 ...

Ellsberg’s Paradox

July 10, 2021 | R on Harshvardhan

I was reading the book “How Not to Be Wrong: The Power of Mathematical Thinking” by Jordan Ellenberg. The book introduces a paradox named after Daniel Ellsberg, a young analyst at RAND Corporation, famous for leaking the Pentagon Papers ...

The UEFA EURO 2020 prediction winner is …

July 10, 2021 | R blog posts on sandsynligvis.dk

… going to be revealed just below. This post from June 2nd shows the original announcement. Each contestant was asked to submit a prediction that should be a 6 x 24 matrix where the columns represent the countries, and the r...

Adding lines or other geoms to a ggplot by calling a custom function

July 10, 2021 | rstats-tips.net

Sometimes you generate lots of ggplots of a similar kind, e.g. visualizations of different timeseries. Then you want to highlight some dates where something special had happened and you want to show that the value you are plotting changed at these date...
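One common way to do what the title describes is to write a function that returns a list of layers, since ggplot2 lets you add a list to a plot with `+` just like a single geom. The sketch below is an assumption about the general approach, not the post's own code: the helper `mark_dates`, the built-in `economics` data, and the chosen dates are all illustrative.

```r
library(ggplot2)

# Hypothetical helper: returns a list of layers that mark the given dates
# with a dashed vertical line and a small year label at the top of the panel.
mark_dates <- function(dates, color = "red") {
  list(
    geom_vline(xintercept = as.numeric(dates),
               linetype = "dashed", color = color),
    annotate("text", x = dates, y = Inf, label = format(dates, "%Y"),
             vjust = 1.5, hjust = -0.1, color = color, size = 3)
  )
}

# Illustrative dates only; they are not taken from the post.
special_dates <- as.Date(c("1975-01-01", "2008-09-15"))

p <- ggplot(economics, aes(date, unemploy)) +
  geom_line() +
  mark_dates(special_dates)
```

The same `mark_dates(special_dates)` call can then be added to every time-series plot that should highlight those dates.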

Deploying Shiny Apps to Heroku with Docker From the Command Line

July 10, 2021 | Peter Solymos

Heroku is a cloud platform-as-a-service (PaaS) to deploy apps without worrying about infrastructure. Yes, this includes Shiny apps!

Mixing centered and non-centered parameterizations in a hierarchical model with PyMC3

July 10, 2021 | Posts | Joshua Cook

Background In his post on hierarchical models, Michael Betancourt goes in-depth on the funnel pathologies that often plague hierarchical modeling. My goal here is to reproduce his analysis in PyMC3 and explore these problems and their solutions. A hierarchical model is one that simultaneously models data from individual distributions and ...
