Articles by Theory meets practice...

Pair Programming Statistical Analyses

September 1, 2017 | Theory meets practice...

Abstract Control calculation ping-pong is the process of iteratively improving a statistical analysis by comparing results from two independent analysis approaches until agreement. We use the daff package to simplify the comparison of the two results and illustrate its use by a case study with two statisticians ping-ponging an analysis ... [Read more...]

Confidence Intervals without Your Collaborator’s Tears

June 21, 2017 | Theory meets practice...

Abstract We provide an interpretation for the confidence interval for a binomial proportion hidden as the transcript of a hypothetical statistical consulting session. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. The markdown+Rknitr source code of this blog is available under a GNU General Public License (... [Read more...]

Estimating the Size of a Demonstration

May 3, 2017 | Theory meets practice...

Abstract Inspired by the recent March For Science we look into methods for the statistical estimation of the number of people participating in a demonstration organized as a march. In particular, we provide R code to reproduce the two on-the-spot counting method analysis of Yip et al. (2010) for the data ...
[Read more...]

On a First Name Basis with Statistics Sweden

March 24, 2017 | Theory meets practice...

Abstract Judging from recent R-Bloggers posts, it appears that many data scientists are concerned with scraping data from various media sources (Wikipedia, Twitter, etc.). However, one should be aware that well-structured, high-quality datasets are available through states' and countries' bureaus of statistics. Increasingly these are offered to ... [Read more...]

Did Mary and John go West?

March 5, 2017 | Theory meets practice...

Abstract As a final post in the baby-names-the-data-scientist's-way series, we use the US Social Security Administration 1910-2015 data to space-time visualize for each state the most popular baby name for girls and boys, respectively. The code partly uses the new simple features package (sf) in order to get some ... [Read more...]

US Babyname Collisions 1880-2014

February 28, 2017 | Theory meets practice...

Abstract We use US Social Security Administration data to compute the probability of a name clash in a class of year-YYYY born kids during the years 1880-2014. [Read more...]

Happy pbirthday class of 2016

February 12, 2017 | Theory meets practice...

Abstract Continuing the analysis of first names given to newborns in Berlin 2016, we solve the following problem: what is the probability that in a school class of size \(n\) of these kids there will be at least two kids having the same first name? The answer to the problem for ... [Read more...]
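
A minimal worked formula for this kind of question, under the simplifying assumption (not taken from the post) that all \(N\) available names are equally likely: the probability that at least two of \(n \le N\) kids share a name is

\[ P(\text{clash}) = 1 - \prod_{i=0}^{n-1} \frac{N-i}{N} = 1 - \frac{N!}{(N-n)!\,N^{n}}. \]

In the classical birthday setting, \(N = 365\) and \(n = 23\) give a probability of about 0.507. Real name frequencies are far from uniform, which only increases the clash probability, so the uniform formula serves as a lower bound.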

Naming Uncertainty by the Bootstrap

February 5, 2017 | Theory meets practice...

Abstract Data on the names of all newborn babies in Berlin 2016 are used to illustrate how a scientific treatment of chance could enhance rank statements in, e.g., onomastics investigations. For this purpose, we first identify different stages of the naming-your-baby process, which are influenced by chance. Second, we compute ... [Read more...]

suRprise! – Classifying Kinder Eggs by Boosting

December 22, 2016 | Theory meets practice...

Abstract Carrying the Danish tradition of Juleforsøg to the realm of statistics, we use R to classify the figure content of Kinder Eggs using boosted regression trees for the egg's weight and possible rattling noises. [Read more...]

4×3 R-Hackathoning – The Finisher’s Guide

December 11, 2016 | Theory meets practice...

Abstract We present experiences from organizing a small R hackathon aimed at advancing knowledge and documentation of the R package surveillance. The hackathon was piggybacked on the ESCAIDE2016 conference, which was attended by current and potential package users in the area of infectious disease epidemiology. The output of the hackathon is available ... [Read more...]

Better Confidence Intervals for Quantiles

October 22, 2016 | Theory meets practice...

Abstract We discuss the computation of confidence intervals for the median or any other quantile in R. In particular we are interested in the interpolated order statistic approach suggested by Hettmansperger and Sheather (1986) and Nyblom (1992). In order to ... [Read more...]

Cartograms with R

October 9, 2016 | Theory meets practice...

Abstract We show how to create cartograms with R by illustrating the population and age-distribution of the planning regions of Berlin by static plots and animations. [Read more...]

The Olympic Medal Table Visualized Gapminder Style

August 20, 2016 | Theory meets practice...

Abstract Following Hans Rosling's Gapminder animation style we visualize the total number of medals a country wins during each ...
[Read more...]

No Sleep During the Reproducibility Session

August 3, 2016 | Theory meets practice...

Abstract R code is provided for implementing a statistical method by Nishiura, Miyamatsu, and Mizumoto (2016) to assess when to ...
[Read more...]

Casting Call for MERS-CoV in Korea, 2015

July 18, 2016 | Theory meets practice...

Abstract We perform an adjustment for observed-but-not-yet-reported cases (aka nowcasting) for the epidemic curve of the Middle East respiratory ...
[Read more...]

Princes Disguised in Uniforms

June 18, 2016 | Theory meets practice...

Abstract We revisit the secretary problem as a mathematical fairy tale: Princes wooing a princess sequentially arrive each having ...
[Read more...]