# Fitting a Distribution to Data in R

This article was first published on **Steve's Data Tips and Tricks** and kindly contributed to R-bloggers.


# Introduction

The gamma distribution is a continuous probability distribution that is often used to model waiting times or other positively skewed data. It is a two-parameter distribution, where the shape parameter controls the skewness of the distribution and the scale parameter controls the spread of the distribution.

# Fitting a gamma distribution to a dataset in R

There are two main ways to fit a gamma distribution to a dataset in R:

- **Maximum likelihood estimation (MLE):** This method estimates the parameters of the gamma distribution that are most likely to have produced the observed data.
- **Method of moments:** This method estimates the parameters of the gamma distribution by equating the sample mean and variance to the theoretical mean and variance of the gamma distribution.
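To make the method of moments concrete, here is a minimal base R sketch. It relies on the standard gamma identities mean = shape × scale and variance = shape × scale². The helper name `gamma_mom()` and the simulated vector `x` are illustrative assumptions and are not part of the original post or of any package.

```r
# Method-of-moments estimates for a gamma distribution (base R only).
# gamma_mom() is a hypothetical helper written for this illustration.
# For Gamma(shape = k, scale = theta): mean = k * theta, variance = k * theta^2,
# so k = mean^2 / variance and theta = variance / mean.
gamma_mom <- function(x) {
  stopifnot(is.numeric(x), length(x) > 1, all(x > 0))
  m <- mean(x)
  v <- var(x)  # sample variance (n - 1 denominator)
  c(shape = m^2 / v, scale = v / m, rate = m / v)
}

set.seed(123)
x <- rgamma(500, shape = 1, scale = 0.3)  # simulated data, for illustration only
est <- gamma_mom(x)
est
```

By construction, `est[["shape"]] * est[["scale"]]` reproduces the sample mean, which is a quick sanity check on the estimates.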

**MLE** is the more common and generally more reliable method of fitting a gamma distribution to a dataset. To fit a gamma distribution to a dataset using MLE, we can use the `fitdist()` function from the `fitdistrplus` package.

```r
# Install the fitdistrplus package if necessary
# install.packages("fitdistrplus")

# Load the fitdistrplus package
library(fitdistrplus)
library(TidyDensity)

set.seed(123)
data <- tidy_gamma(.n = 500)$y

# Fit a gamma distribution to the data
fit <- fitdist(data, distr = "gamma", method = "mle")
```

The `fit` object contains the estimated parameters of the gamma distribution, as well as other information about the fit. We can access the estimated parameters using the `coef()` function. Note that the `tidy_gamma()` function from the TidyDensity package defaults to `.shape = 1` and `.scale = 0.3`. The rate is `1/.scale`, so the default rate is about 3.33.

```r
# Get the estimated parameters of the gamma distribution
coef(fit)
```

```
   shape     rate 
1.031833 3.594773 
```
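Since `fitdist()` reports a rate while `tidy_gamma()` is parameterised by a scale, it can help to convert between the two before comparing. The sketch below hard-codes the estimates printed above so that it runs without any extra packages; the names `est`, `scale_hat`, and `x` are illustrative assumptions.

```r
# Estimates copied from the coef(fit) output shown above
est <- c(shape = 1.031833, rate = 3.594773)

# scale is the reciprocal of rate (roughly 0.278 here)
scale_hat <- 1 / est[["rate"]]
scale_hat

# Both parameterisations describe the same density
x <- c(0.1, 0.5, 1)
all.equal(
  dgamma(x, shape = est[["shape"]], rate = est[["rate"]]),
  dgamma(x, shape = est[["shape"]], scale = scale_hat)
)
```

The estimated scale of roughly 0.278 can then be read against the 0.3 default used to simulate the data.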

Now let’s see how that compares to the built-in TidyDensity function:

```r
util_gamma_param_estimate(data)$parameter_tbl[1, c("shape", "scale", "shape_ratio")]
```

```
# A tibble: 1 × 3
  shape scale shape_ratio
  <dbl> <dbl>       <dbl>
1 0.983 0.292        3.36
```

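In the above, `shape_ratio` is the column to compare with the `rate`. From the printed values it appears to be `shape / scale` (0.983 / 0.292 ≈ 3.37 using the rounded values shown), which is close to, but not the same as, `1 / scale` (≈ 3.42); the two agree exactly only when the shape is 1.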

# Try on your own!

I encourage you to try fitting a gamma distribution to your own data. You can use the `fitdistrplus` package in R to fit a gamma distribution to any dataset. Once you have fitted a gamma distribution to your data, you can use the estimated parameters to generate random samples from the gamma distribution or to calculate the probability of observing a particular value.
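As a starting point, the following base R sketch shows one way the fitted parameters might be used. It plugs in the shape and rate estimates reported earlier in this post; the variable names, the seed, and the thresholds 0.5 and 1 are arbitrary choices made for illustration.

```r
# Estimates taken from the coef(fit) output earlier in the post
shape_hat <- 1.031833
rate_hat  <- 3.594773

# Generate random samples from the fitted gamma distribution
set.seed(42)
sims <- rgamma(1000, shape = shape_hat, rate = rate_hat)
summary(sims)

# Probability of observing a value at or below 0.5
p_le_half <- pgamma(0.5, shape = shape_hat, rate = rate_hat)

# Probability of observing a value above 1
p_gt_one <- pgamma(1, shape = shape_hat, rate = rate_hat, lower.tail = FALSE)

# 95th percentile of the fitted distribution
q95 <- qgamma(0.95, shape = shape_hat, rate = rate_hat)

c(p_le_half = p_le_half, p_gt_one = p_gt_one, q95 = q95)
```

With a fitted object in hand, the same calls would work with `coef(fit)[["shape"]]` and `coef(fit)[["rate"]]` in place of the hard-coded values.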
