How to Use the scale() Function in R

[This article was first published on Methods – finnstats, and kindly contributed to R-bloggers.]

Scaling is a technique for comparing data that are not measured on the same scale. Standardizing a dataset using its mean and standard deviation is one common form of scaling.

When working with vectors or columns in a data frame, scaling is frequently employed.

In R, you can use the scale() function to scale the values in a vector, matrix, or data frame.

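If you do not normalize vectors or columns that are measured on very different scales, methods that depend on distances or variances can give misleading results, because the variable with the largest values dominates.

scale() is a built-in (base R) function that, by default, centers and scales the columns of a numeric matrix.

If the center argument is a numeric vector with one value per column of x, scale() subtracts the corresponding value from each column. If center is TRUE, the column means are subtracted; if it is FALSE, no centering is done.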

The following is the fundamental syntax for this function:

scale(x, center = TRUE, scale = TRUE)


x: The object to scale (a numeric vector, matrix, or data frame).

center: Whether to subtract the mean when scaling. TRUE is the default value.

scale: Whether to divide by the standard deviation when scaling. TRUE is the default value.
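Besides TRUE and FALSE, center and scale also accept numeric vectors with one value per column. The following is a minimal sketch of that usage; the matrix m and the reference values are made up purely for illustration.

```r
# Hypothetical 3 x 2 matrix, used only for illustration
m <- matrix(c(10, 20, 30,
              100, 200, 300), ncol = 2)

# Subtract 20 from column 1 and 200 from column 2,
# then divide column 1 by 10 and column 2 by 100
m_custom <- scale(m, center = c(20, 200), scale = c(10, 100))

m_custom[, 1]  # -1 0 1
m_custom[, 2]  # -1 0 1
```

Supplying explicit values like this can be handy when new data must be scaled with the mean and standard deviation of an earlier dataset.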

This function uses the following formula to calculate scaled values.

x_scaled = (x - x̄) / s

x: Original x value

x̄: Sample mean

s: Sample SD
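The formula can be checked by hand. This sketch applies it directly to a small made-up vector v (an assumption for illustration) and compares the result with the output of scale().

```r
# Hypothetical vector, used only for illustration
v <- c(2, 4, 6, 8, 10)

# Apply the formula directly: subtract the sample mean, divide by the sample SD
z_manual <- (v - mean(v)) / sd(v)

# scale() returns a one-column matrix, so convert it to a plain vector first
z_scale <- as.numeric(scale(v))

all.equal(z_manual, z_scale)  # TRUE
```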

This is also known as data standardization, and it basically involves converting each original value into a z-score.

If the scale argument is a numeric vector, scale() divides each column by the corresponding value.

If scale is TRUE, each column is divided by its standard deviation when center is TRUE, and by its root mean square when center is FALSE.
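The root-mean-square case is easy to overlook. This sketch, using a made-up vector v, shows that with center = FALSE the divisor is sqrt(sum(v^2) / (n - 1)), the definition given in the scale() help page, rather than the standard deviation.

```r
# Hypothetical vector, used only for illustration
v <- c(1, 2, 3, 4, 5)

# No centering, so scale() divides by the root mean square instead of the SD
v_rms <- scale(v, center = FALSE, scale = TRUE)

# Root mean square as described in ?scale
rms <- sqrt(sum(v^2) / (length(v) - 1))

all.equal(as.numeric(v_rms), v / rms)  # TRUE
attr(v_rms, "scaled:scale")            # the divisor that scale() used
```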

The examples below demonstrate how to utilize this function in practice.

Example 1: Scale the Values in a Vector

Assume we have the following vector of values in R.

x <- c(11, 12, 13, 24, 25, 16, 17, 18, 19)

Look at the mean and standard deviation of the data.

mean(x)
[1] 17.22222

sd(x)
[1] 4.944132

The scale() function is used to scale the values in the vector in the following code.

Scale the x values.

x_scaled <- scale(x)

Let’s view the scaled values.

x_scaled
             [,1]
 [1,] -1.25850641
 [2,] -1.05624645
 [3,] -0.85398649
 [4,]  1.37087305
 [5,]  1.57313301
 [6,] -0.24720662
 [7,] -0.04494666
 [8,]  0.15731330
 [9,]  0.35957326
attr(,"scaled:center")
[1] 17.22222
attr(,"scaled:scale")
[1] 4.944132
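The result of scale() is a one-column matrix that stores the mean and standard deviation it used as attributes. The sketch below shows one way to read those attributes, get a plain vector, and undo the scaling; the helper names x_center, x_sd, x_vec, and x_back are chosen here for illustration.

```r
x <- c(11, 12, 13, 24, 25, 16, 17, 18, 19)
x_scaled <- scale(x)

# The mean and SD used for scaling are stored as attributes
x_center <- attr(x_scaled, "scaled:center")
x_sd     <- attr(x_scaled, "scaled:scale")

# Drop the matrix structure and attributes to get a plain numeric vector
x_vec <- as.numeric(x_scaled)

# Undo the scaling to recover the original values
x_back <- x_vec * x_sd + x_center
all.equal(x_back, x)  # TRUE
```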

Because centering subtracts the mean, every value below the mean becomes negative. Standardizing removes the effect of differing units or magnitudes when comparing vectors. It does not change the shape of the distribution, so it does not make the data any closer to a normal distribution.

This type of normalization is useful when comparing data recorded on multiple measures.

It’s worth noting that if we supply scale = FALSE, the function only centers the values and does not divide by the standard deviation:

Don’t divide by the standard deviation when scaling the x values.

x_scaled <- scale(x, scale = FALSE)

x_scaled
            [,1]
 [1,] -6.2222222
 [2,] -5.2222222
 [3,] -4.2222222
 [4,]  6.7777778
 [5,]  7.7777778
 [6,] -1.2222222
 [7,] -0.2222222
 [8,]  0.7777778
 [9,]  1.7777778
attr(,"scaled:center")
[1] 17.22222

Example 2: Scale the Column Values in a Data Frame

When we want to scale the values in several columns of a data frame so that each column has a mean of 0 and a standard deviation of 1, we usually use the scale() function.

As an example, consider the following data frame in R:

data <- data.frame(x = c(11, 12, 23, 24, 25, 66, 77, 18, 9),
                   y = c(60, 80, 90, 10, 5, 6, 700, 180, 190))

View the data frame.

data
   x   y
1 11  60
2 12  80
3 23  90
4 24  10
5 25   5
6 66   6
7 77 700
8 18 180
9  9 190

The y variable’s range of values is significantly larger than the x variable’s range of values.

The scale() function can be used to scale the values in both columns so that the scaled values of x and y have the same mean and standard deviation.

df_scaled <- scale(data)

The x and y columns now have the same mean of 0 and standard deviation of 1. Note that scale() returns a matrix rather than a data frame.
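One way to check that claim is to compute the column means and standard deviations of the scaled result. This is a minimal sketch; wrapping the result in as.data.frame() is a choice made here so that sapply() works column by column, not something the original example does.

```r
data <- data.frame(x = c(11, 12, 23, 24, 25, 66, 77, 18, 9),
                   y = c(60, 80, 90, 10, 5, 6, 700, 180, 190))

# scale() returns a matrix; convert back if a data frame is needed
df_scaled <- as.data.frame(scale(data))

# Column means should be 0 (up to floating-point error) and SDs should be 1
round(colMeans(df_scaled), 10)
sapply(df_scaled, sd)
```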

With the default settings, the scale() function calculates the vector’s mean and standard deviation, then “scales” each element by subtracting the mean and dividing by the standard deviation.

The scale() function is most useful when you have several variables measured on different scales. One variable, for example, may be of magnitude 100, whereas another is of magnitude 1000.

Its only purpose is to standardize the data. The values it generates are known by a variety of names, one of which is z-scores.
