ggplot2 scatter plots : Quick start guide - R software and data visualization - Easy Guides - Wiki (2024)

  • Prepare the data
  • Basic scatter plots
  • Label points in the scatter plot
    • Add regression lines
    • Change the appearance of points and lines
  • Scatter plots with multiple groups
    • Change the point color/shape/size automatically
    • Add regression lines
    • Change the point color/shape/size manually
  • Add marginal rugs to a scatter plot
  • Scatter plots with the 2d density estimation
  • Scatter plots with ellipses
  • Scatter plots with rectangular bins
  • Scatter plot with marginal density distribution plot
  • Customized scatter plots
  • Infos

This article describes how to create a scatter plot using R software and the ggplot2 package. The function geom_point() is used.



Prepare the data

The mtcars data set is used in the examples below.

# Convert cyl column from a numeric to a factor variable
mtcars$cyl <- as.factor(mtcars$cyl)
head(mtcars)

##                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
## Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
## Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
## Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
## Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
## Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
## Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Basic scatter plots

Simple scatter plots are created using the R code below. The color, the size and the shape of points can be changed using the function geom_point() as follows :

geom_point(size, color, shape)
library(ggplot2)

# Basic scatter plot
ggplot(mtcars, aes(x=wt, y=mpg)) +
  geom_point()

# Change the point size and shape
ggplot(mtcars, aes(x=wt, y=mpg)) +
  geom_point(size=2, shape=23)


Note that the size of the points can also be mapped to the values of a continuous variable, as in the example below.

# Change the point size
ggplot(mtcars, aes(x=wt, y=mpg)) +
  geom_point(aes(size=qsec))


Read more on point shapes : ggplot2 point shapes

Label points in the scatter plot

The function geom_text() can be used to label the points :

ggplot(mtcars, aes(x=wt, y=mpg)) + geom_point() + geom_text(label=rownames(mtcars))
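With 32 cars, labels drawn directly on the points tend to overprint each other. A minimal sketch of one way to reduce this, assuming the row names are first copied into a column (the names cars, model and p_labels are illustrative and not part of the original article) :

```r
library(ggplot2)

# Keep the car names in a column so they can be mapped inside aes()
cars <- mtcars
cars$model <- rownames(cars)

p_labels <- ggplot(cars, aes(x = wt, y = mpg, label = model)) +
  geom_point() +
  # vjust shifts each label above its point; check_overlap = TRUE
  # skips labels that would overprint one already drawn
  geom_text(vjust = -0.8, size = 3, check_overlap = TRUE)

p_labels
```

The values of vjust and size are arbitrary starting points; adjust them to the size of the plotting device.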


Read more on text annotations : ggplot2 - add texts to a plot

Add regression lines

The functions below can be used to add regression lines to a scatter plot :

  • geom_smooth() and stat_smooth()
  • geom_abline()

geom_abline() is described in : ggplot2 add straight lines to a plot.

Only the function geom_smooth() is covered in this section.

A simplified format is :

geom_smooth(method="auto", se=TRUE, fullrange=FALSE, level=0.95)

  • method : smoothing method to be used. Possible values are lm, glm, gam, loess, rlm.
    • method = "loess": This is the default for a small number of observations (fewer than 1,000). It computes a smooth local regression. You can read more about loess using the R code ?loess.
    • method = "lm": It fits a linear model. Note that it is also possible to specify the formula, for example formula = y ~ poly(x, 3) for a degree 3 polynomial.
  • se : logical value. If TRUE, a confidence interval is displayed around the smooth.
  • fullrange : logical value. If TRUE, the fit spans the full range of the plot.
  • level : level of confidence interval to use. Default value is 0.95.
# Add the regression line
ggplot(mtcars, aes(x=wt, y=mpg)) +
  geom_point() +
  geom_smooth(method=lm)

# Remove the confidence interval
ggplot(mtcars, aes(x=wt, y=mpg)) +
  geom_point() +
  geom_smooth(method=lm, se=FALSE)

# Loess method
ggplot(mtcars, aes(x=wt, y=mpg)) +
  geom_point() +
  geom_smooth()
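The formula and level arguments described above are not used in the examples. As a hedged sketch (the object name p_poly and the 99% level are chosen here only for illustration), a degree 3 polynomial fit with a wider confidence band could look like this :

```r
library(ggplot2)

# Degree 3 polynomial fit with a 99% confidence band
p_poly <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ poly(x, 3), level = 0.99)

p_poly
```

Passing the formula explicitly also avoids the message that recent ggplot2 versions print when they pick a default formula.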


Change the appearance of points and lines

This section describes how to change :

  • the color and the shape of points
  • the line type and color of the regression line
  • the fill color of the confidence interval
# Change the point colors and shapes
# Change the line type and color
ggplot(mtcars, aes(x=wt, y=mpg)) +
  geom_point(shape=18, color="blue") +
  geom_smooth(method=lm, se=FALSE, linetype="dashed", color="darkred")

# Change the confidence interval fill color
ggplot(mtcars, aes(x=wt, y=mpg)) +
  geom_point(shape=18, color="blue") +
  geom_smooth(method=lm, linetype="dashed", color="darkred", fill="blue")


Note that a transparent color is used, by default, for the confidence band. This can be changed by using the argument alpha : geom_smooth(fill="blue", alpha=1)

Read more on point shapes : ggplot2 point shapes

Read more on line types : ggplot2 line types

Scatter plots with multiple groups

This section describes how to change point colors and shapes automatically and manually.

Change the point color/shape/size automatically

In the R code below, point shapes, colors and sizes are controlled by the levels of the factor variable cyl :

# Change point shapes by the levels of cyl
ggplot(mtcars, aes(x=wt, y=mpg, shape=cyl)) +
  geom_point()

# Change point shapes and colors
ggplot(mtcars, aes(x=wt, y=mpg, shape=cyl, color=cyl)) +
  geom_point()

# Change point shapes, colors and sizes
ggplot(mtcars, aes(x=wt, y=mpg, shape=cyl, color=cyl, size=cyl)) +
  geom_point()


Add regression lines

Regression lines can be added as follows :

# Add regression lines
ggplot(mtcars, aes(x=wt, y=mpg, color=cyl, shape=cyl)) +
  geom_point() +
  geom_smooth(method=lm)

# Remove confidence intervals
# Extend the regression lines
ggplot(mtcars, aes(x=wt, y=mpg, color=cyl, shape=cyl)) +
  geom_point() +
  geom_smooth(method=lm, se=FALSE, fullrange=TRUE)


Note that you can also change the line type of the regression lines by group, using the aesthetic linetype = cyl.
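A short sketch of the linetype aesthetic, assuming cyl has been converted to a factor as in the data preparation step (the object name p_lty is only illustrative) :

```r
library(ggplot2)

# cyl must be a factor for a discrete linetype scale
mtcars$cyl <- as.factor(mtcars$cyl)

p_lty <- ggplot(mtcars, aes(x = wt, y = mpg, color = cyl, shape = cyl)) +
  geom_point() +
  # One line type per level of cyl, set inside aes() of the smooth layer
  geom_smooth(aes(linetype = cyl), method = "lm", formula = y ~ x, se = FALSE)

p_lty
```

Here linetype is mapped only in the geom_smooth() layer, since points have no line type.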

The fill color of confidence bands can be changed as follows :

ggplot(mtcars, aes(x=wt, y=mpg, color=cyl, shape=cyl)) + geom_point() + geom_smooth(method=lm, aes(fill=cyl))


Change the point color/shape/size manually

The functions below are used :

  • scale_shape_manual() for point shapes
  • scale_color_manual() for point colors
  • scale_size_manual() for point sizes
# Change point shapes and colors manually
ggplot(mtcars, aes(x=wt, y=mpg, color=cyl, shape=cyl)) +
  geom_point() +
  geom_smooth(method=lm, se=FALSE, fullrange=TRUE) +
  scale_shape_manual(values=c(3, 16, 17)) +
  scale_color_manual(values=c('#999999','#E69F00', '#56B4E9')) +
  theme(legend.position="top")

# Change the point sizes manually
ggplot(mtcars, aes(x=wt, y=mpg, color=cyl, shape=cyl)) +
  geom_point(aes(size=cyl)) +
  geom_smooth(method=lm, se=FALSE, fullrange=TRUE) +
  scale_shape_manual(values=c(3, 16, 17)) +
  scale_color_manual(values=c('#999999','#E69F00', '#56B4E9')) +
  scale_size_manual(values=c(2,3,4)) +
  theme(legend.position="top")


It is also possible to change point and line colors using the functions :

  • scale_color_brewer() : to use color palettes from RColorBrewer package
  • scale_color_grey() : to use grey color palettes
p <- ggplot(mtcars, aes(x=wt, y=mpg, color=cyl, shape=cyl)) +
  geom_point() +
  geom_smooth(method=lm, se=FALSE, fullrange=TRUE) +
  theme_classic()

# Use brewer color palettes
p + scale_color_brewer(palette="Dark2")

# Use grey scale
p + scale_color_grey()


Read more on ggplot2 colors here : ggplot2 colors

Add marginal rugs to a scatter plot

The function geom_rug() can be used :

geom_rug(sides ="bl")

sides : a string that controls which sides of the plot the rugs appear on. It can contain any of the letters "trbl", for top, right, bottom, and left.

# Add marginal rugs
ggplot(mtcars, aes(x=wt, y=mpg)) +
  geom_point() +
  geom_rug()

# Change colors
ggplot(mtcars, aes(x=wt, y=mpg, color=cyl)) +
  geom_point() +
  geom_rug()

# Add marginal rugs using faithful data
ggplot(faithful, aes(x=eruptions, y=waiting)) +
  geom_point() +
  geom_rug()
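The examples above rely on the default sides="bl". A minimal sketch of the sides argument with other values (the object names p_rug_b and p_rug_tr and the alpha value are illustrative choices) :

```r
library(ggplot2)

base <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Bottom rug only: shows the distribution of wt alone
p_rug_b <- base + geom_rug(sides = "b", alpha = 0.5)

# Rugs on the top and right sides instead of bottom and left
p_rug_tr <- base + geom_rug(sides = "tr", alpha = 0.5)

p_rug_b
p_rug_tr
```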


Scatter plots with the 2d density estimation

The functions geom_density_2d() or stat_density_2d() can be used :

# Scatter plot with the 2d density estimation
sp <- ggplot(faithful, aes(x=eruptions, y=waiting)) +
  geom_point()
sp + geom_density_2d()

# Gradient color
# (in ggplot2 >= 3.4.0, after_stat(level) is preferred over ..level..)
sp + stat_density_2d(aes(fill = ..level..), geom="polygon")

# Change the gradient color
sp + stat_density_2d(aes(fill = ..level..), geom="polygon") +
  scale_fill_gradient(low="blue", high="red")


Read more on ggplot2 colors here : ggplot2 colors

Scatter plots with ellipses

The function stat_ellipse() can be used as follows :

# One ellipse around all points
ggplot(faithful, aes(waiting, eruptions)) +
  geom_point() +
  stat_ellipse()

# Ellipse by groups
p <- ggplot(faithful, aes(waiting, eruptions, color = eruptions > 3)) +
  geom_point()
p + stat_ellipse()

# Change the type of ellipses: possible values are "t", "norm", "euclid"
p + stat_ellipse(type = "norm")


Scatter plots with rectangular bins

The number of observations is counted in each bin and displayed using any of the functions below :

  • geom_bin2d() for adding a heatmap of 2d bin counts
  • stat_bin_2d() for counting the number of observations in rectangular bins
  • stat_summary_2d() to apply a summary function over 2D rectangular bins

The simplified formats of these functions are :

plot + geom_bin2d(...)
plot + stat_bin_2d(geom=NULL, bins=30)
plot + stat_summary_2d(geom = NULL, bins = 30, fun = mean)
  • geom : geometrical object to display the data
  • bins : Number of bins in both vertical and horizontal directions. The default value is 30
  • fun : function for summary

The diamonds data set from the ggplot2 package is used :

head(diamonds)
##   carat       cut color clarity depth table price    x    y    z
## 1  0.23     Ideal     E     SI2  61.5    55   326 3.95 3.98 2.43
## 2  0.21   Premium     E     SI1  59.8    61   326 3.89 3.84 2.31
## 3  0.23      Good     E     VS1  56.9    65   327 4.05 4.07 2.31
## 4  0.29   Premium     I     VS2  62.4    58   334 4.20 4.23 2.63
## 5  0.31      Good     J     SI2  63.3    58   335 4.34 4.35 2.75
## 6  0.24 Very Good     J    VVS2  62.8    57   336 3.94 3.96 2.48

# Plot
p <- ggplot(diamonds, aes(carat, price))
p + geom_bin2d()


Change the number of bins :

# Change the number of bins
p + geom_bin2d(bins=10)


Or specify the width of bins :

# Or specify the width of bins
p + geom_bin2d(binwidth=c(1, 1000))
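stat_summary_2d() is listed above but not demonstrated. Unlike geom_bin2d(), it needs a third aesthetic z to summarise within each bin. A hedged sketch, where the choice of depth as the summarised variable and the name p_sum are assumptions made for illustration :

```r
library(ggplot2)

# Mean depth (z) of the diamonds falling in each carat x price bin
p_sum <- ggplot(diamonds, aes(x = carat, y = price, z = depth)) +
  stat_summary_2d(fun = mean, bins = 30)

p_sum
```

The fill color of each tile then encodes the summary value rather than the number of observations.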


Scatter plot with marginal density distribution plot

Step 1/3. Create some data :

set.seed(1234)
x <- c(rnorm(500, mean = -1), rnorm(500, mean = 1.5))
y <- c(rnorm(500, mean = 1), rnorm(500, mean = 1.7))
group <- as.factor(rep(c(1,2), each=500))
df <- data.frame(x, y, group)
head(df)

##             x          y group
## 1 -2.20706575 -0.2053334     1
## 2 -0.72257076  1.3014667     1
## 3  0.08444118 -0.5391452     1
## 4 -3.34569770  1.6353707     1
## 5 -0.57087531  1.7029518     1
## 6 -0.49394411 -0.9058829     1

Step 2/3. Create the plots :

# Scatter plot of x and y variables
# Color by groups
scatterPlot <- ggplot(df, aes(x, y, color=group)) +
  geom_point() +
  scale_color_manual(values = c('#999999','#E69F00')) +
  theme(legend.position=c(0,1), legend.justification=c(0,1))
scatterPlot

# Marginal density plot of x (top panel)
xdensity <- ggplot(df, aes(x, fill=group)) +
  geom_density(alpha=.5) +
  scale_fill_manual(values = c('#999999','#E69F00')) +
  theme(legend.position = "none")
xdensity

# Marginal density plot of y (right panel)
ydensity <- ggplot(df, aes(y, fill=group)) +
  geom_density(alpha=.5) +
  scale_fill_manual(values = c('#999999','#E69F00')) +
  theme(legend.position = "none")
ydensity


Create a blank placeholder plot :

blankPlot <- ggplot() +
  geom_blank(aes(1,1)) +
  theme(
    plot.background = element_blank(),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    panel.border = element_blank(),
    panel.background = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank(),
    axis.text.x = element_blank(),
    axis.text.y = element_blank(),
    axis.ticks = element_blank()
  )

Step 3/3. Put the plots together:

To put multiple plots on the same page, the package gridExtra can be used. Install the package as follows :

install.packages("gridExtra")

Arrange ggplot2 with adapted height and width for each row and column :

library("gridExtra")
grid.arrange(xdensity, blankPlot, scatterPlot, ydensity,
             ncol=2, nrow=2, widths=c(4, 1.4), heights=c(1.4, 4))


Read more on how to arrange multiple ggplots in one page : ggplot2 - Easy way to mix multiple graphs on the same page

Customized scatter plots

# Basic scatter plot
ggplot(mtcars, aes(x=wt, y=mpg)) +
  geom_point() +
  geom_smooth(method=lm, color="black") +
  labs(title="Miles per gallon \n according to the weight",
       x="Weight (lb/1000)", y = "Miles/(US) gallon") +
  theme_classic()

# Change color/shape by groups
# Remove confidence bands
p <- ggplot(mtcars, aes(x=wt, y=mpg, color=cyl, shape=cyl)) +
  geom_point() +
  geom_smooth(method=lm, se=FALSE, fullrange=TRUE) +
  labs(title="Miles per gallon \n according to the weight",
       x="Weight (lb/1000)", y = "Miles/(US) gallon")
p + theme_classic()


Change colors manually :

# cyl is a factor, so each call below applies a discrete Brewer palette
# Brewer palette "Paired"
p + scale_color_brewer(palette="Paired") + theme_classic()

# Brewer palette "Dark2"
p + scale_color_brewer(palette="Dark2") + theme_minimal()

# Brewer palette "Accent"
p + scale_color_brewer(palette="Accent") + theme_minimal()


Read more on ggplot2 colors here : ggplot2 colors

Infos

This analysis has been performed using R software (ver. 3.2.4) and ggplot2 (ver. 2.1.0).


Author: Rev. Leonie Wyman
