The post droplevels in R with examples appeared first on Data Science Tutorials

To remove unused factor levels in R, use the droplevels() function.

This function comes in handy when we need to get rid of factor levels that are no longer in use as a result of subsetting a vector or a data frame.

The syntax for this function is as follows:

`droplevels(x)`

where x is the object, typically a factor or a data frame, from which unused factor levels should be removed.
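As a quick sketch of the call shape, the snippet below applies droplevels() to a factor and to a data frame. The object names `f` and `d` and their values are illustrative assumptions, not taken from the examples that follow.

```r
# Illustrative objects (hypothetical names, not from this article)
f <- factor(c("a", "b", "c"))[1:2]      # "c" is now an unused level
d <- data.frame(g = f, v = c(10, 20))   # the factor column keeps level "c"

levels(f)                  # still lists "c"
levels(droplevels(f))      # unused level removed from the factor
levels(droplevels(d)$g)    # unused level removed from the factor column
```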

This article shows you how to utilize this function in practice with a couple of examples.

## Example 1: Drop Unused Factor Levels in a Vector

Assume we have a factor with seven levels. Let’s say we create a new factor that keeps only five of the original seven values.

Define a factor with seven levels:

`data <- factor(c(1, 2, 3, 4, 5, 6, 7))`

Create the new data by removing the 4th and 5th elements of the original data:

`new <- data[-c(4, 5)]`

Now we can view the new data

```
new
[1] 1 2 3 6 7
Levels: 1 2 3 4 5 6 7
```

Although the new data contains only five values, all seven of the original factor levels are still present.

We may use the droplevels() function to remove these unneeded factor levels.

Remove any factor levels that are no longer in use:

`new <- droplevels(new)`

Let’s view the data

```
new
[1] 1 2 3 6 7
Levels: 1 2 3 6 7
```

There are now only five factor levels in the new data.
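For comparison, here is a minimal sketch of two other base R ways to reach the same set of levels: re-creating the factor with factor(), and subsetting with `drop = TRUE`. The variable names `via_droplevels`, `via_factor` and `via_drop` are illustrative and not part of the original example.

```r
# Rebuild the example data from above
data <- factor(c(1, 2, 3, 4, 5, 6, 7))
new <- data[-c(4, 5)]

# Three ways to end up without the unused levels 4 and 5
via_droplevels <- droplevels(new)
via_factor     <- factor(new)                    # re-creating the factor
via_drop       <- data[-c(4, 5), drop = TRUE]    # dropping while subsetting

nlevels(new)              # 7
nlevels(via_droplevels)   # 5
levels(via_factor)
levels(via_drop)
```

droplevels() has the advantage of stating the intent directly, which can make the code easier to read later.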

## Example 2: Drop Unused Factor Levels in a Data Frame

Assume we’re working with a data frame in which one of the variables is a five-level factor.

Let’s say we create a new data frame that excludes two of these factor levels.

Let’s create a data frame

```
df <- data.frame(region = factor(c('P1', 'P2', 'P3', 'P4', 'P5')),
                 sales = c(103, 106, 202, 257, 324))
df
  region sales
1     P1   103
2     P2   106
3     P3   202
4     P4   257
5     P5   324
```

Now we can define a new data frame

`newdf <- subset(df, sales < 225)`

View the new data frame:

```
newdf
  region sales
1     P1   103
2     P2   106
3     P3   202
```

Let’s check the levels of the region variable.

```
levels(newdf$region)
[1] "P1" "P2" "P3" "P4" "P5"
```

The original five factor levels are still present in the new data frame, even though the region column now contains only three distinct values.

If we tried to tabulate or plot this data, the unused levels P4 and P5 could still show up as empty categories.
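To make the issue concrete, here is a small sketch that rebuilds the data frame from this example and counts rows per region with table(). The variable name `counts` is an illustrative addition.

```r
# Rebuild the example data frame and the subset
df <- data.frame(region = factor(c("P1", "P2", "P3", "P4", "P5")),
                 sales = c(103, 106, 202, 257, 324))
newdf <- subset(df, sales < 225)

# table() counts every level, including the ones with no rows left
counts <- table(newdf$region)
counts   # P4 and P5 are still listed, each with a count of 0
```

A bar chart built from `counts` would therefore reserve space for two empty bars.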

The droplevels() function can be used to eliminate the unnecessary factor levels from the region variable:

Remove any unused factor levels.

`newdf$region <- droplevels(newdf$region)`

Let’s check the levels of the region variable again.

```
levels(newdf$region)
[1] "P1" "P2" "P3"
```
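droplevels() can also be applied to the whole data frame, in which case it cleans every factor column in one call. The sketch below rebuilds the example data and also shows the `except` argument, which takes the indices of columns to leave untouched. The names `cleaned` and `kept` are illustrative.

```r
# Rebuild the example data frame and the subset
df <- data.frame(region = factor(c("P1", "P2", "P3", "P4", "P5")),
                 sales = c(103, 106, 202, 257, 324))
newdf <- subset(df, sales < 225)

# Drop unused levels from all factor columns at once
cleaned <- droplevels(newdf)
levels(cleaned$region)

# Leave column 1 (region) as it is by listing it in `except`
kept <- droplevels(newdf, except = 1)
levels(kept$region)
```

Calling it on the data frame is convenient when several factor columns were affected by the same subsetting step.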

Hurray! Done for the day.
