# Articles by R on Stats and R

### The 9 concepts and formulas in probability that every data scientist should know

March 2, 2020

What is probability? 1. A probability is always between 0 and 1 2. Compute a probability 3. Complement of an event 4. Union of two events 5. Intersection of two events 6. Independence of two events 7. Conditional probability Bayes’ theorem Example 8. Accuracy measures False negatives False positives Sensitivity Specificity Positive predictive value Negative predictive value 9. Counting techniques Multiplication ...

### Student’s t-test in R and by hand: how to compare two groups under different scenarios

February 27, 2020

Introduction Null and alternative hypothesis Hypothesis testing Different versions of the Student’s t-test How to compute Student’s t-test by hand? Scenario 1: Independent samples with 2 known variances Scenario 2: Independent samples with 2 equal but unknown variances Scenario 3: Independent samples with 2 unequal and unknown variances Scenario 4: Paired samples where the variance ...

### Correlogram in R: how to highlight the most correlated variables in a dataset

February 21, 2020

Introduction Correlation matrix Correlogram Correlation test Code {lares} package All possible correlations Correlation of one variable against all others Reference Introduction Correlation, often computed as part of descriptive statistics, is a statistical tool used to study the relationship between two variables, ...

### Getting started in R markdown

February 17, 2020

R Markdown: what, why and how? Before you start Components of a .Rmd file YAML header Code chunks Text Code inside text Highlight text like it is code Images Tables Additional notes and useful resources If you have spent some time writing code in R, you probably have heard of generating dynamic ...

### The complete guide to clustering analysis: k-means and hierarchical clustering by hand and in R

February 12, 2020

What is clustering analysis? Application 1: Computing distances Solution k-means clustering Application 2: k-means clustering Data kmeans() with 2 groups Quality of a k-means partition nstart for several initial centers and better stability kmeans() with 3 groups Optimal number of clusters Elbow method Silhouette method Gap statistic method NbClust() Visualizations Manual application and verification in R Solution by hand Solution in R Hierarchical clustering Application 3: hierarchical clustering Data Solution by ...

### An efficient way to install and load R packages

January 30, 2020

What is a R package and how to use it? Inefficient way to install and load R packages More efficient way Most efficient way {pacman} package {librarian} package What is a R package and how to use it? Unlike other programs, only fundamental functionalities come by default with R. You will thus often need to install some “...
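As a rough illustration of the install-then-load pattern this teaser describes, here is a minimal sketch in base R. The helper name `load_packages` is hypothetical and not taken from the article; the sketch assumes a CRAN mirror is configured whenever a package actually has to be installed.

```r
# Install any missing packages, then attach all of them.
# `load_packages` is a hypothetical helper name, not the article's code.
load_packages <- function(pkgs) {
  # Packages that are not yet installed on this machine
  to_install <- pkgs[!pkgs %in% rownames(installed.packages())]
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  # Attach every requested package; character.only lets us pass names as strings
  invisible(lapply(pkgs, library, character.only = TRUE))
}

# "stats" and "utils" ship with R, so nothing is downloaded here;
# swap in the packages you actually need.
load_packages(c("stats", "utils"))
```

The {pacman} package named in the listing provides `p_load()`, which covers much the same ground in a single call.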

### Do my data follow a normal distribution? A note on the most widely used distribution and how to test for normality in R

January 28, 2020

What is a normal distribution? Empirical rule Parameters Probabilities and standard normal distribution Areas under the normal distribution in R and by hand Ex. 1 In R By hand Ex. 2 In R By hand Ex. 3 In R By hand Ex. 4 In R By hand Ex. 5 Why is the normal distribution so ...

### Fisher’s exact test in R: independence test for a small sample

January 27, 2020

Introduction Hypotheses Example Data Observed frequencies Expected frequencies Fisher’s exact test in R Conclusion and interpretation References Introduction After presenting the Chi-square test of independence by hand and in R, this article focuses on the Fisher’s exact test. Independence tests are used to determine if there is a ...
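For readers who want a feel for the test before opening the article, the following is a minimal sketch of running Fisher's exact test on a small 2x2 contingency table with `fisher.test()` from base R. The counts and the labels `group` and `outcome` are made up for illustration and do not come from the article.

```r
# Hypothetical 2x2 table of observed counts (invented numbers, not the article's data)
dat <- matrix(
  c(3, 1,
    1, 3),
  nrow = 2, byrow = TRUE,
  dimnames = list(group = c("A", "B"), outcome = c("yes", "no"))
)

# Two-sided Fisher's exact test of independence between rows and columns
test <- fisher.test(dat)

# A small p-value would suggest the two variables are not independent
print(test$p.value)
```

With samples this small the expected frequencies fall below the usual threshold for the Chi-square approximation, which is the situation the listing's title refers to.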

### Chi-square test of independence in R

January 26, 2020

Introduction Data Chi-square test of independence in R Conclusion and interpretation Combination of plot and statistical test Introduction This article explains how to perform the Chi-square test of independence in R and how to interpret its results. To learn more about how the test works and how to do it by hand, I invite you to read ...

### How to create a timeline of your CV in R?

January 25, 2020

Introduction Minimal reproducible example How to personalize it Additional note Introduction In this article, I show how to create a timeline of your CV in R. A CV timeline illustrates key information about your education, work experiences and...  