### Why do I have a data science blog? 7 benefits of sharing your code

September 1, 2020

Contents: #1 Learn by writing · #2 Get feedback · #3 Personal note to remind my future self · #4 Contribute to the open source community · #5 Stay humble, stay curious · #6 Learn to be less perfectionist and to prioritize · #7 Build connections and professional relation...

### Graphics in R with ggplot2

August 20, 2020

Contents: Introduction · Data · Basic principles of {ggplot2} · Create plots with {ggplot2} · Scatter plot · Line plot · Combination of line and points · Histogram · Density plot · Combination of histogram and densities · Boxplot · Barplot · Further personalization · Labels · Axis ticks · Log transformations · Limits · Legend · Shape, color, size and transparency · Smooth and regression lines · Facets · Themes · Interactive ...

### Mortgage calculator in R Shiny

August 13, 2020

Contents: Introduction · Mortgage calculator · How to use the mortgage calculator? · Code of the app

Introduction: I recently moved out and bought my first apartment. Of course, I could not pay for it entirely with my own savings, so I had to borrow money from the...

### Outliers detection in R

August 10, 2020

Contents: Introduction · Descriptive statistics · Minimum and maximum · Histogram · Boxplot · Percentiles · Hampel filter · Statistical tests · Grubbs’s test · Dixon’s test · Rosner’s test · Additional remarks

Introduction: An outlier is a value or an observation that is distant from other observations, that is to say, a data point that differs significantly from ...

### Wilcoxon test in R: how to compare 2 groups under the non-normality assumption

June 6, 2020

Contents: Introduction · 2 different scenarios · Independent samples · Paired samples

Introduction: In a previous article, we showed how to compare two groups under different scenarios using the Student’s t-test. The Student’s t-test requires that the distributions follow a normal distribution. In this article, we show how to compare two groups when ...

### How to publish a Shiny app: example with shinyapps.io

May 28, 2020

Contents: Introduction · Prerequisite · Step-by-step guide · Additional notes

Introduction: The COVID-19 virus led many people to create interactive apps and dashboards. A reader recently asked me how to publish a Shiny app she had just created. Similarly to a pre...

### Correlation coefficient and correlation test in R

May 27, 2020

Contents: Introduction · Data · Correlation coefficient · Between two variables · Correlation matrix: correlations for all variables · Interpretation of a correlation coefficient · Visualizations · A scatterplot for 2 variables · Scatterplots for several pairs of variables · Another simple correlation matrix · Correlation test · For 2 variables · For several pairs of variables · Combination of correlation coefficients and correlation tests

Introduction: ...

### How to upload your R code on GitHub: example with an R script on MacOS

May 23, 2020

Contents: Introduction · Prerequisite · Step-by-step guide · Additional notes

Introduction: A few days ago, a colleague asked me how to upload some R code to GitHub in order to make it accessible to everyone. Due to the lockdown, I could not just go into his offi...

### COVID-19 in Belgium: is it over yet?

May 21, 2020

Contents: Introduction · New hospital admissions · New confirmed cases

Introduction: This is joint work with Prof. Niko Speybroeck and Angel Rosas-Aguirre. Belgium recently started to lift the lockdown measures it initially imposed to contain the spread of the...

### One-proportion and goodness of fit test (in R and by hand)

May 12, 2020

Contents: Introduction · In R · Data · One-proportion test · Assumption of prop.test() and binom.test() · Chi-square goodness of fit test · Does my distribution follow a given distribution? · Observed frequencies · Expected frequencies · Observed vs. expected frequencies · By hand · One-proportion test · Verification in R · Goodness of fit test · Verification in R

Introduction: In a ...

April 25, 2020

Contents: Introduction · Installation · Download all books at once · Create a table of Springer books · Download only specific books · By title · By author · By subject · Acknowledgments

Introduction: You have probably already seen that Springer released about 500 books for free following the COVID-19 pandemic. According to Springer, these textbooks will be available free ...

### COVID-19 in Belgium

March 30, 2020

Contents: Introduction · Top R resources on Coronavirus · Coronavirus dashboard for your own country · Motivations, limitations and structure of the article · Analysis of Coronavirus in Belgium · A classic epidemiological model: the SIR model · Fitting a SIR model to the Belgium data · Reproduction number \(R_0\) · Using our model to analyze the outbreak if ...

### How to create a simple Coronavirus dashboard specific to your country in R

March 22, 2020

Contents: Introduction · Top R resources on Coronavirus · Coronavirus dashboard: the case of Belgium · How to create your own Coronavirus dashboard · Additional notes · Data · Open source · Accuracy · Coronavirus dashboard: the case of Belgium

Introduction: The Novel COVID-19 Coronavirus is the hottest topic right now. Every day, the media and newspapers share the ...

### How to do a t-test or ANOVA for many variables at once in R and communicate the results in a better way

March 18, 2020

Contents: Introduction · Perform multiple tests at once · Concise and easily interpretable results · T-test · ANOVA · To go even further

Photo by Teemu Paananen

Introduction: As part of my teaching assistant position in a Belgian university, students often ask me for help with the statistical analyses for their master’s thesis. A ...

### Top 5 R resources on COVID-19 Coronavirus

March 11, 2020

Contents: R Shiny apps · Coronavirus tracker · COVID-19 outbreak · R packages · {nCov2019} · R code · Analyzing COVID-19 outbreak data with R · COVID-19 Data Analysis with {tidyverse} and {ggplot2} · Data

Photo by CDC

The Coronavirus is a serious concern around the...

### How to perform a one sample t-test by hand and in R: test on one mean

March 8, 2020

Contents: Introduction · Null and alternative hypothesis · Hypothesis testing · Two versions of the one sample t-test · How to compute the one sample t-test by hand? · Scenario 1: variance of the population is known · Scenario 2: variance of the population is unknown · How to compute the one sample t-test in R? · Scenario 1: variance of the ...

### The 9 concepts and formulas in probability that every data scientist should know

March 2, 2020

Contents: What is probability? · 1. A probability is always between 0 and 1 · 2. Compute a probability · 3. Complement of an event · 4. Union of two events · 5. Intersection of two events · 6. Independence of two events · 7. Conditional probability · Bayes’ theorem · Example · 8. Accuracy measures · False negatives · False positives · Sensitivity · Specificity · Positive predictive value · Negative predictive value · 9. Counting techniques · Multiplication ...

### Student’s t-test in R and by hand: how to compare two groups under different scenarios

February 27, 2020

Contents: Introduction · Null and alternative hypothesis · Hypothesis testing · Different versions of the Student’s t-test · How to compute Student’s t-test by hand? · Scenario 1: Independent samples with 2 known variances · Scenario 2: Independent samples with 2 equal but unknown variances · Scenario 3: Independent samples with 2 unequal and unknown variances · Scenario 4: Paired samples where the variance ...

### Correlogram in R: how to highlight the most correlated variables in a dataset

February 21, 2020

Contents: Introduction · Correlation matrix · Correlogram · Correlation test · Code

Photo by Pritesh Sudra

Introduction: Correlation, often computed as part of descriptive statistics, is a statistical tool used to study the relationship between two variables, ...