Data Visualization with R: in this tutorial, we describe how to create a scatter plot in the R programming language.
“ggplot2” is a fantastic package for making visually appealing data displays. If it is not already installed, install it first.
Once it is installed, we can call the “library” function to load “ggplot2” into the session.
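The install and load steps described above can be sketched as follows. The `requireNamespace` check is an addition of this sketch (not part of the original steps); it simply skips the download when the package is already present.

```r
# Install ggplot2 only if it is missing (requires an internet connection).
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}

# Attach the package so that qplot(), ggplot(), geom_point(), etc. are available.
library(ggplot2)
```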
Do you have any experience with qplot?
qplot is useful for building a basic scatter plot. Let’s make a plot using the qplot function.
In R, we can use the built-in dataset “mtcars,” which records 11 variables for 32 automobiles from the 1973 to 1974 model years.
In this data set, “mpg” is miles per gallon and “wt” is weight. Each point in the plot represents the miles per gallon and weight of a single automobile.
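Before plotting, it can help to look at the columns involved. The short sketch below only uses base R; the choice of which columns to preview is ours. Note that, per the dataset's documentation, “wt” is given in units of 1000 lbs.

```r
# mtcars ships with base R (datasets package), so nothing needs to be loaded.
dim(mtcars)                             # number of rows (cars) and columns (variables)

# Preview the three columns used in this tutorial.
head(mtcars[, c("mpg", "wt", "cyl")])

# Quick numeric summary of fuel economy and weight.
summary(mtcars[, c("mpg", "wt")])
```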
We can now generate a scatter plot using the “qplot” function.
qplot(mpg, wt, data=mtcars)
However, if you require more customization, you should use the “ggplot” function from the same package.
The ggplot syntax differs slightly from qplot: you start with a call to “ggplot” and add layers with “+”. The “geom_point” function adds the layer that draws the points, and its “shape” argument controls how the dots look. Different values for the shape will result in different symbols.
For example, if you set the shape argument to 19, as in the code below, you’ll get solid circles.
ggplot(mtcars, aes(x=mpg, y=wt))+geom_point(shape=19)
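To see how the shape value changes the result, here is a small sketch that reuses one base plot with a few different shape codes. The variable names (`base_plot`, `p_solid`, and so on) and the `size = 3` setting are illustrative choices, not part of the original example.

```r
library(ggplot2)

# Shared base: same data and axes as the example above.
base_plot <- ggplot(mtcars, aes(x = mpg, y = wt))

p_solid    <- base_plot + geom_point(shape = 19)            # solid circles
p_open     <- base_plot + geom_point(shape = 1)             # open circles
p_triangle <- base_plot + geom_point(shape = 17, size = 3)  # filled triangles, drawn larger

# Print whichever variant you want to look at.
print(p_triangle)
```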
We can now add a third variable to our scatter plot, in addition to “miles per gallon” and “weight.” Each vehicle can be categorized by its number of cylinders, so we’ll use the shape of each point to indicate this.
To produce the scatter plot, we first create a factor from the cylinder variable and then pass it to the ggplot function.
The code looks like this:
cyl_factor<-factor(mtcars$cyl)
ggplot(mtcars, aes(x=mpg, y=wt, shape=cyl_factor))+geom_point()
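As an alternative sketch, the conversion to a factor can also be done inline inside “aes”, which avoids creating a separate variable. The `labs(shape = ...)` call and `size = 2` are additions here, used to give the legend a readable title and make the symbols easier to tell apart.

```r
library(ggplot2)

# factor(cyl) is evaluated against the mtcars columns, so no helper variable is needed.
p_shape <- ggplot(mtcars, aes(x = mpg, y = wt, shape = factor(cyl))) +
  geom_point(size = 2) +
  labs(shape = "Cylinders")  # otherwise the legend title would read "factor(cyl)"

print(p_shape)
```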
Color, like shape before it, can also be used to represent the third variable.
ggplot(mtcars, aes(x=mpg, y=wt, color=cyl))+geom_point()
You can see in the output that each point is now colored based on the number of cylinders.
A legend on the right maps the colors to values. Because “cyl” is stored as a number, ggplot2 treats it as a continuous variable and uses a color gradient, but it makes more sense to think of the cylinder variable as categorical.
So we’ll specify the color parameter once more, but this time we’ll use the “cyl_factor” variable.
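The difference between the numeric and categorical views of the cylinder count can be checked directly in base R. This is only an inspection sketch; it recreates `cyl_factor` so that it runs on its own.

```r
# cyl is stored as a plain number in mtcars.
class(mtcars$cyl)
sort(unique(mtcars$cyl))   # the distinct cylinder counts present in the data

# Converting to a factor turns those counts into discrete categories.
cyl_factor <- factor(mtcars$cyl)
levels(cyl_factor)
table(cyl_factor)          # how many cars fall into each category
```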
ggplot(mtcars, aes(x=mpg, y=wt, color=cyl_factor))+geom_point()
The result shows that the legend now assigns a distinct color to each cylinder count as a category.
Using the factor approach produces a much clearer and more visually appealing plot.
Let us now update the legend’s label.
ggplot(mtcars, aes(x=mpg, y=wt, color=cyl_factor))+geom_point()+labs(color="Cylinders")
Title & Axis Labels
Using the “xlab” and “ylab” functions, we may rename the axes. We can also define the plot’s title using “ggtitle.”
ggplot(mtcars, aes(x=mpg, y=wt, color=cyl_factor))+geom_point()+labs(color="Cylinders")+xlab("MPG")+ylab("wt")+ggtitle("ScatterPlot")
The resulting scatter plot shows the new title and axis labels.
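As a compact variant, the “labs” function can set the title, both axis labels, and the legend title in a single call. This is a sketch of the same plot as above, not a different result; the variable name `p_final` is ours.

```r
library(ggplot2)

cyl_factor <- factor(mtcars$cyl)

# One labs() call replaces the separate xlab(), ylab(), and ggtitle() calls.
p_final <- ggplot(mtcars, aes(x = mpg, y = wt, color = cyl_factor)) +
  geom_point() +
  labs(title = "ScatterPlot", x = "MPG", y = "wt", color = "Cylinders")

print(p_final)
```

Either style works; the separate functions can be easier to read when only one label changes.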