Survival analysis is used to predict the time until an event of interest occurs. In this post, I show how to use scikit-learn, xgboost, lightgbm, and conformal prediction for probabilistic survival analysis.
During a recent class a student asked whether bootstrap confidence intervals were more robust than confidence intervals estimated using the standard error (i.e. \(SE = \frac{s}{\sqrt{n}}\)). In order to answer this question I wrote a function to simulate taking a bunch of random samples from a population, ...
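As a rough illustration of the comparison this excerpt describes (not the author's simulation function, which the excerpt does not show), the sketch below computes both kinds of interval for a sample mean. The helper names `se_interval` and `bootstrap_interval`, the exponential population, the sample size of 50, and the 2000 resamples are all assumptions made for the example.

```python
import random
import statistics


def se_interval(sample, z=1.96):
    """Normal-approximation interval: mean +/- z * s / sqrt(n)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / n ** 0.5  # SE = s / sqrt(n)
    return mean - z * se, mean + z * se


def bootstrap_interval(sample, rng, n_boot=2000, level=0.95):
    """Percentile bootstrap interval for the mean (hypothetical helper)."""
    n = len(sample)
    # Resample with replacement, record each resample's mean, then sort.
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return means[lo_idx], means[hi_idx]


rng = random.Random(42)
# Assumed population for illustration: exponential with true mean 1.0 (skewed).
sample = [rng.expovariate(1.0) for _ in range(50)]

se_lo, se_hi = se_interval(sample)
bs_lo, bs_hi = bootstrap_interval(sample, rng)
print(f"SE-based 95% interval:  ({se_lo:.3f}, {se_hi:.3f})")
print(f"Bootstrap 95% interval: ({bs_lo:.3f}, {bs_hi:.3f})")
```

Repeating this over many samples and counting how often each interval covers the true mean is one way to compare the two approaches.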
A recent conversation with a colleague about a large stepped-wedge cluster randomized trial (SW-CRT) piqued my interest, because the primary outcome is time-to-event. This is not something I’ve seen before. A quick dive into the literature su...
With field experiments, studying the correlation between the observed traits may not be an easy task. For example, we can consider a genotype experiment, laid out in randomised complete blocks, with 27 wheat genotypes and three replicates, where sev...
This post summarizes an extended period of deep annoyance. I have tried to solve the problem it describes more than once before and not quite done it. This has, in fact, happened again. I have still not satisfactorily solved the problem. But this time I know why I can’t ...
Simpson’s paradox is when a trend that is present in various groups of data seems to disappear or even reverse when those groups are combined. One sees examples of this often in things like medical trials, and the phenomenon is generally due to ...
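A small numeric sketch can make the reversal concrete. The counts below are illustrative figures of the kind often quoted in textbook treatments of the paradox (two treatments, patients split by case severity); they are not taken from the post itself, and the names `counts` and `pooled_rate` are made up for the example.

```python
# Illustrative (successes, patients) counts per treatment and subgroup.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, total):
    """Success rate within a single group."""
    return successes / total


def pooled_rate(groups):
    """Success rate after combining all subgroups of one treatment."""
    successes = sum(s for s, _ in groups.values())
    total = sum(n for _, n in groups.values())
    return successes / total


for subgroup in ("small", "large"):
    a = rate(*counts["A"][subgroup])
    b = rate(*counts["B"][subgroup])
    print(f"{subgroup}: A={a:.3f}  B={b:.3f}")

print(f"pooled: A={pooled_rate(counts['A']):.3f}  B={pooled_rate(counts['B']):.3f}")
```

Treatment A has the higher rate within each subgroup, yet B has the higher rate once the subgroups are pooled, because A was applied mostly to the harder cases.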
Are you thinking about submitting a talk for ShinyConf 2025 but unsure where to start? Here are some insights from our ShinyConf Planning Committee on what makes a great proposal. The Call for Speakers is coming to a close (the deadline has been extended to February 9th), so here are tips to ...
In this post, I would like to draw attention to a very interesting data set collected by Guan, Palma and Wu as part of the replication package for their paper The rise and fall of paper money in Yuan China, 1260-1368. The paper describes inflation, mon...
We have developed a new version of the gDefrag package. The previous version has been retired from CRAN due to dependencies on outdated packages. This updated version is still under development and may contain limitations. Please exercise caution when using it. I have just uploaded a new version of the ...
At rOpenSci, our mission is to
foster a culture that values open and reproducible research using shared data and reusable software
and we achieve this through
creating social infrastructure through a welcoming and diverse community
We feel it is a...
First of all, let’s start with a definition of what we mean by monochrome (or monochromatic). Creating a monochrome chart essentially means only using different shades of one colour. In most cases, this means different shades of grey (or black an...
Publishing is an integral part of the data analysis process. Whether it’s in the form of
code, reports or technical documentation, at some point artifacts need to be shared. More often than
not, such artifacts are confidential and their access n...
Battling first child amnesia
I am a father of two sons; one is 4.5 years old, and the other is but a few months old. This may seem weird, but even though I went through everything with my first son… I have complete amnesia about what was normal, what ...
You can read the original post in its original format on Rtask website by ThinkR here: Rlinguo — Why Did We Build It?
Header image via ChatGPT. We recently released something unprecedented: an app that brings R to mobile. Yes, you read that right: R, the statistical programming language, is now ...
Most of our work in Epiverse TRACE involves either developing an R package from scratch or adopting and maintaining an existing R package. In the former case, decision-making during development is guided by internal policies documented in the Ep...
When working with statistical data, ensuring that certain assumptions are met is critical to the validity of your results. One such assumption is the homogeneity of variance, which refers to the idea that the variability within groups should be consistent across all groups being compared. But how do you test ...
I was just notified by CRAN that choroplethr is scheduled to be archived on February 12. The reason is that choroplethr depends on the acs package, and the acs package is being archived. Apparently when a package is archived from CRAN, all packages which use it are also archived. I am ...
New R en Español video on how to create functions.
This video is part of my R Desde Ceros playlist, which aims to teach the most basic aspects of using R in order to build a foundation for programming and data analysis.
In ...