# How to Convert Continuous variables into Categorical by Creating Bins

This article was first published on **R – Predictive Hacks**, and kindly contributed to R-bloggers.


A very common task in data processing is the transformation of numeric variables (continuous, discrete, etc.) into categorical ones by creating bins. For example, it is quite common to convert **age** to an `age group`. Let’s see how we can easily do that in R.

We will consider a random variable from the Poisson distribution with parameter **λ=20**:

    library(dplyr)

    # Generate 1000 observations from the Poisson distribution
    # with lambda equal to 20
    df <- data.frame(MyContinuous = rpois(1000, 20))

    # Plot the histogram
    hist(df$MyContinuous)

## Create specific Bins

Let’s say that you want to create the following bins:

- **Bin 1:** (-Inf, 15]
- **Bin 2:** (15, 25]
- **Bin 3:** (25, Inf)

We can easily do that using the `cut` function. Let’s start:

    df <- df %>%
      mutate(MySpecificBins = cut(MyContinuous, breaks = c(-Inf, 15, 25, Inf)))
    head(df, 10)

Let’s have a look at the counts of each bin.

    df %>%
      group_by(MySpecificBins) %>%
      count()
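It helps to see which bin a value on a boundary falls into. The following is a minimal sketch on a hypothetical toy vector `x` (not part of the original example), showing that `cut` closes intervals on the right by default and that `right = FALSE` closes them on the left instead:

```r
# Hypothetical boundary values for the bins (-Inf, 15], (15, 25], (25, Inf)
x <- c(15, 16, 25, 26)

# Default right = TRUE: each interval includes its upper bound
right_closed <- cut(x, breaks = c(-Inf, 15, 25, Inf))
as.integer(right_closed)  # bin index per value: 1 2 2 3

# right = FALSE: each interval includes its lower bound instead
left_closed <- cut(x, breaks = c(-Inf, 15, 25, Inf), right = FALSE)
as.integer(left_closed)   # bin index per value: 2 2 3 3
```

With the default setting, 15 belongs to Bin 1 and 25 to Bin 2, which matches the bin definitions above.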

**Notice** that you can also define your own labels within the `cut` function, via its `labels` argument.
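As a rough illustration of custom labels, the sketch below passes a `labels` vector to `cut`. The label names (`"Low"`, `"Medium"`, `"High"`), the column name `MyLabeledBins`, and the seed are assumptions made for this example only:

```r
library(dplyr)

set.seed(1)  # hypothetical seed, only to make this sketch reproducible
df <- data.frame(MyContinuous = rpois(1000, 20))

# One label per bin, so three labels for four break points
df <- df %>%
  mutate(MyLabeledBins = cut(MyContinuous,
                             breaks = c(-Inf, 15, 25, Inf),
                             labels = c("Low", "Medium", "High")))

# Counts per labeled bin
df %>% count(MyLabeledBins)
```

The `labels` vector needs exactly one entry per interval, that is, one fewer than the number of breaks.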

## Create Bins based on Quantiles

Let’s say that you want each bin to have the same number of observations, for example 4 bins with 25% of the observations each. We can easily do it as follows:

    numbers_of_bins <- 4
    df <- df %>%
      mutate(MyQuantileBins = cut(MyContinuous,
                                  breaks = unique(quantile(MyContinuous,
                                                           probs = seq.int(0, 1, by = 1 / numbers_of_bins))),
                                  include.lowest = TRUE))
    head(df, 10)

We can check whether the `MyQuantileBins` contain the same number of observations, and also look at their ranges:

    df %>%
      group_by(MyQuantileBins) %>%
      count()
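The quantile-based call relies on `unique()` and `include.lowest = TRUE`, and a small sketch may clarify why. It uses a hypothetical toy vector with many ties (not the Poisson data above): duplicated quantiles make `cut` fail, and without `include.lowest = TRUE` the minimum value falls outside the first interval and becomes `NA`.

```r
# Hypothetical toy vector with many ties, so several quantiles coincide
x <- c(1, 1, 1, 1, 1, 2, 3, 4)

q <- quantile(x, probs = seq.int(0, 1, by = 1 / 4))
q  # the 0%, 25% and 50% quantiles are all equal to 1

# Duplicated breaks make cut() raise an error; capture its message
dup_error <- tryCatch(cut(x, breaks = q),
                      error = function(e) conditionMessage(e))

# Dropping duplicates fixes the error, at the cost of fewer bins
b <- unique(q)

without_lowest <- cut(x, breaks = b)                         # minimum is excluded
with_lowest    <- cut(x, breaks = b, include.lowest = TRUE)  # minimum is kept

sum(is.na(without_lowest))  # 5
sum(is.na(with_lowest))     # 0
```

Note that when ties collapse the breaks like this, you get fewer bins than requested and their counts are no longer equal, so it is worth checking the counts as done above.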

Notice that if you want to split your continuous variable into bins of equal size, you can also use the `ntile` function of the `dplyr` package, but it does not create bin labels based on the ranges.
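For comparison, here is a minimal sketch of the `ntile` approach. The column name `MyNtileBins` and the seed are assumptions for this example. Since `ntile` returns only a bucket number, the ranges are recovered here by summarising the minimum and maximum per bucket:

```r
library(dplyr)

set.seed(1)  # hypothetical seed, only to make this sketch reproducible
df <- data.frame(MyContinuous = rpois(1000, 20))

# ntile() assigns each row an integer bucket from 1 to 4
df <- df %>% mutate(MyNtileBins = ntile(MyContinuous, 4))

# Bucket sizes and the range of values that landed in each bucket
df %>%
  group_by(MyNtileBins) %>%
  summarise(n = n(),
            min = min(MyContinuous),
            max = max(MyContinuous))
```

With 1000 rows and 4 buckets, each bucket holds exactly 250 observations. The trade-off is that tied values can be split across adjacent buckets, so the ranges of neighbouring buckets may overlap.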
