A very common task in data processing is transforming numeric variables (continuous, discrete, etc.) into categorical ones by creating bins. For example, it is quite common to convert `age` to an `age group`. Let’s see how we can easily do that in R.

We will consider a random variable from the Poisson distribution with parameter λ=20:

```r
library(dplyr)

# Generate 1000 observations from the Poisson distribution
# with lambda equal to 20
df <- data.frame(MyContinuous = rpois(1000, 20))

# Get the histogram
hist(df$MyContinuous)
```

## Create specific Bins

Let’s say that you want to create the following bins:

• Bin 1: (-inf, 15]
• Bin 2: (15,25]
• Bin 3: (25, inf)

We can easily do that using the `cut` function, whose intervals are closed on the right by default, which matches the bins above. Let’s start:

```r
df <- df %>%
  mutate(MySpecificBins = cut(MyContinuous, breaks = c(-Inf, 15, 25, Inf)))
```

Let’s have a look at the counts of each bin.

```r
df %>% group_by(MySpecificBins) %>% count()
```

Notice that you can also define your own labels through the `labels` argument of the `cut` function.
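As a minimal sketch of custom labels, the snippet below reuses the same breaks and passes a `labels` vector to `cut`. The column name `MyLabeledBins`, the label names and the seed are assumptions made for illustration only; the data frame is regenerated so that the snippet runs on its own.

```r
library(dplyr)

set.seed(42)  # hypothetical seed, only to make the sketch reproducible
df <- data.frame(MyContinuous = rpois(1000, 20))

# One label per interval: (-Inf,15], (15,25], (25,Inf)
df <- df %>%
  mutate(MyLabeledBins = cut(MyContinuous,
                             breaks = c(-Inf, 15, 25, Inf),
                             labels = c("Low", "Medium", "High")))

# Counts per labeled bin
df %>% group_by(MyLabeledBins) %>% count()
```

The `labels` vector needs one entry per interval, that is, one fewer than the number of breaks.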

## Create Bins based on Quantiles

Let’s say that you want each bin to have the same number of observations, for example 4 bins that each hold 25% of the observations. We can easily do it as follows:

```r
numbers_of_bins <- 4

# Use the quantiles as breaks; unique() drops duplicated break points
df <- df %>%
  mutate(MyQuantileBins = cut(MyContinuous,
                              breaks = unique(quantile(MyContinuous,
                                                       probs = seq.int(0, 1, by = 1 / numbers_of_bins))),
                              include.lowest = TRUE))
```

We can check whether the `MyQuantileBins` contain roughly the same number of observations (ties in a discrete variable can make the counts differ slightly), and also look at their ranges:

```r
df %>% group_by(MyQuantileBins) %>% count()
```
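The `unique()` call around the quantiles matters when the data contain many ties. The following sketch uses a small hypothetical vector `x` (not part of the example above) in which several quantiles coincide: `cut` refuses duplicated breaks, while the deduplicated breaks work but yield fewer bins than requested.

```r
# Hypothetical, heavily tied data: 60 zeros followed by the values 1 to 40
x <- c(rep(0, 60), 1:40)

# The 0%, 25% and 50% quantiles are all 0, so the raw breaks contain duplicates
raw_breaks <- quantile(x, probs = seq(0, 1, by = 1 / 4))
print(raw_breaks)

# cut() stops with an error on duplicated breaks; tryCatch() captures the message
msg <- tryCatch(
  {
    cut(x, breaks = raw_breaks, include.lowest = TRUE)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# With unique() the call succeeds, but only two bins remain
bins <- cut(x, breaks = unique(raw_breaks), include.lowest = TRUE)
table(bins)
```

So with tied data the quantile approach can return fewer bins than `numbers_of_bins`, and the bins are not guaranteed to be equally populated.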

Notice that if you want to split your continuous variable into bins with an equal number of observations, you can also use the `ntile` function of the `dplyr` package, but it returns the bin number only and does not create bin labels based on the ranges.
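A minimal sketch of the `ntile` approach follows. The column name `MyNtileBins` and the seed are assumptions for illustration, and the data frame is regenerated so that the snippet runs on its own. Since `ntile` returns only the bin number, the ranges are recovered here with a summary.

```r
library(dplyr)

set.seed(42)  # hypothetical seed, only to make the sketch reproducible
df <- data.frame(MyContinuous = rpois(1000, 20))

# ntile() assigns each row a bin number from 1 to 4
df <- df %>% mutate(MyNtileBins = ntile(MyContinuous, 4))

# Count and observed range of each bin
df %>%
  group_by(MyNtileBins) %>%
  summarise(n = n(),
            min_value = min(MyContinuous),
            max_value = max(MyContinuous))
```

Because `ntile` splits on row ranks, tied values can land in adjacent bins, so the observed ranges of neighbouring bins may overlap at their boundaries.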