I’ve been looking at the following paper, by researchers at Harvard’s School of Public Health, which was recently published in Science: Kissler, Tedijanto, Goldstein, Grad, and Lipsitch (2020), “Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period” (also available here, with supplemental materials here). This is one of ...
Multinomial logistic regression is used when the target variable is categorical with more than two levels. It is an extension of binomial logistic regression.
Binary logistic regression is used to explain the relationship between a categorical dependent variable and one or more independent variables. It applies when the dependent variable is dichotomous. In practice, binary logistic regression is almost always referred to simply as logistic regression.
Overview – Binary Logistic Regression
The logistic ...
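As a rough illustration of the idea (not the code from the post itself), a binary logistic regression can be fitted in base R with glm() and family = binomial. The simulated data, the variable names x and y, and the true coefficients below are assumptions made up for this sketch.

```r
# Hypothetical simulated data: one numeric predictor x and a dichotomous outcome y
set.seed(42)
n <- 500
x <- rnorm(n)
p <- plogis(-0.5 + 1.5 * x)          # assumed true success probabilities
y <- rbinom(n, size = 1, prob = p)   # 0/1 outcome

# Fit the binary logistic regression (logit link)
fit <- glm(y ~ x, family = binomial(link = "logit"))
coefs <- coef(fit)                   # intercept and slope on the log-odds scale

# Predicted probabilities for a few new values of x
pred <- predict(fit, newdata = data.frame(x = c(-1, 0, 1)), type = "response")
print(coefs)
print(pred)
```

The coefficients are on the log-odds scale, so exp(coefs) gives odds ratios, while type = "response" returns probabilities between 0 and 1.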
In this tutorial on the purrr package in R, you will learn how to use its functions to improve the quality of your code, and understand the advantages of purrr functions over equivalent base R functions.
Data science is one of the most wide-ranging disciplines of the 21st century. Data scientists use a wide variety of methods and tools to generate more knowledge from data and its analysis. Especially in times like today, data and the insights we can draw from them are becoming increasingly important. ...
Data Scientists Face an Existential Crisis
The term data scientist has always been a bit controversial. William Cleveland coined the term in 2001 to advocate the practical use of statistics in other technical fields and believed that use warranted a n...
Today we’re excited to announce the general release of RStudio 1.3. This release features many major improvements to the IDE, including:
Dramatically improved accessibility for sight-impaired users, which also upgrades keyboard navigation, contr...
Henrique Martins (PUC Rio) and I organized a call for papers on data reuse, for publication in RAC – Revista de Administração Contemporanea. The deadline for submission is 10 October 2020, with an expected publication date of July 2021.
We will...
Survival Analysis
Survival analysis deals with estimating the probability that a particular status quo continues at a given point in time. Naturally, it also estimates the probability that the status quo is discontinued, i.e. the occurrence of an event or a hazard. It finds application in several fields. For example, in ...
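To make the "probability of continuation" concrete, here is a minimal sketch of the Kaplan-Meier product-limit estimator written in base R. It is not taken from the post: the toy follow-up times, the event indicator, and the helper name km_estimate() are all hypothetical.

```r
# Hypothetical toy data: follow-up time and event indicator
# (1 = event observed, 0 = censored)
time  <- c(2, 3, 3, 5, 6, 8, 8, 9)
event <- c(1, 1, 0, 1, 0, 1, 1, 0)

# Kaplan-Meier estimate: at each distinct event time t, multiply the running
# survival probability by (1 - events at t / subjects still at risk at t)
km_estimate <- function(time, event) {
  event_times <- sort(unique(time[event == 1]))
  at_risk <- sapply(event_times, function(t) sum(time >= t))
  events  <- sapply(event_times, function(t) sum(time == t & event == 1))
  data.frame(
    time     = event_times,
    at_risk  = at_risk,
    events   = events,
    survival = cumprod(1 - events / at_risk)
  )
}

km <- km_estimate(time, event)
print(km)
```

Censored observations never count as events, but they stay in the at-risk set until their censoring time, which is what distinguishes this estimate from a simple proportion of survivors.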
In my last post I used the optim() command to optimise a linear regression model. In this post, I am going to take that approach a little further and optimise a logistic regression model in the same manner. Thanks to John C. Nash, I got a first glimpse into the ...
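The general approach can be sketched as follows. This is an assumption about what such an optimisation might look like, not the author's code: the simulated data, the function name neg_loglik(), and the choice of the BFGS method are all made up for illustration. The idea is to minimise the negative Bernoulli log-likelihood with optim() and compare the result with glm().

```r
# Hypothetical simulated data for a logistic regression
set.seed(1)
n <- 400
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(0.3 + 1.2 * x))
X <- cbind(1, x)   # design matrix with an intercept column

# Negative log-likelihood of the logistic model:
# -sum( y * eta - log(1 + exp(eta)) ), where eta = X %*% beta
neg_loglik <- function(beta, X, y) {
  eta <- drop(X %*% beta)
  -sum(y * eta - log1p(exp(eta)))
}

# Minimise with optim(), starting from zero coefficients
opt <- optim(par = c(0, 0), fn = neg_loglik, X = X, y = y, method = "BFGS")

# Reference fit for comparison
glm_coef <- coef(glm(y ~ x, family = binomial))
print(opt$par)
print(glm_coef)
```

Both routes maximise the same likelihood, so the two sets of coefficients should agree up to the optimiser's numerical tolerance.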
Lately I’ve been publishing screencasts demonstrating how to use the tidymodels framework, from first steps in modeling to how to evaluate complex models. Today’s screencast isn’t about predictive modeling, but about unsupervised machine learning using this week’s #TidyTuesday dataset on cocktail recipes.
Here is the ...
Introduction
The idea behind this post was to play with and discover some of the information contained in the COVID19 R package, which collects data across several governmental sources. This package is being developed by Guidotti and Ardia of the COVID19 Data Hub.
Later, I will add to the analysis the ...
QBits Workspace: A New Online Editor to Share and Deploy R Code
Today we are excited to announce the QBits Workspace to run and deploy R code in the browser. QBits enables you to run R in a serverless cloud environment and provides an easy and cost-effective way to develop, ...
As you are likely aware by now, the dplyr 1.0.0 release is right around the corner. I am very excited about this huge milestone for dplyr. In this post, we’ll go over my favorite new features coming in the 1.0.0 release.
# Install development version of dplyr
remotes::install_github(
  "tidyverse/dplyr",
  ref = "23c166fa7cc247f0ee1a4ee5ac31cd19dc63868d"
)
Note: in the above call to install_github(), I ...
The spread of COVID-19 has hit the whole world, including mathematicians, statisticians, machine learning researchers, and people in related fields.
These experts want to help understand, for example, future trends (forecasts) of the coronavirus spread...
This post will look at how to fit an XGBoost model using the tidymodels framework rather than using the XGBoost package directly.
Tidymodels is a collection of packages that aims to standardise model creation by providing commands that can be applied across different R packages. For example, once the code ...
Hexagon tessellation using the great geogrid package. The départements are the second level of administrative government in France. They neither have the same area nor the same population and this heterogeneity provides a few challenges for a fair and accurate map representation (see the post on smoothing). However if ...
Corona has put us in an awkward situation, where we must rethink and revise our ways of doing things (teaching, working, babysitting, balancing work and life, or any other related field of your choosing). I also see this as an opportunity to exper...
May 28th (8:00pm UTC+2) will bring another fascinating webinar from the Why R? Foundation. We will have a joint talk by Bernd Bischl, Florian Pfisterer and Martin Binder about Pipelines and AutoML with mlr3.
See you on the Webinar!
Details
donate: why...