How to make a scatterplot rectangular matrix: (y1, y2, ...) ~ (x1, x2, ...) in R: ggplot2, lattice o

Time:09-27

I'm looking for a way to construct a rectangular matrix of scatterplots of each of several y variables against each of several x variables, similar to the base pairs() function or GGally::ggpairs(), but showing only the y ~ x panels.

Ideally, this should support (a) coloring points by groups; (b) adding annotations such as linear regression lines.

Here is a quick example of the basic idea:

> data(Rohwer, package = "heplots")
> str(Rohwer)
'data.frame':   69 obs. of  10 variables:
 $ group: int  1 1 1 1 1 1 1 1 1 1 ...
 $ SES  : Factor w/ 2 levels "Hi","Lo": 2 2 2 2 2 2 2 2 2 2 ...
 $ SAT  : int  49 47 11 9 69 35 6 8 49 8 ...
 $ PPVT : int  48 76 40 52 63 82 71 68 74 70 ...
 $ Raven: int  8 13 13 9 15 14 21 8 11 15 ...
 $ n    : int  1 5 0 0 2 2 0 0 0 3 ...
 $ s    : int  2 14 10 2 7 15 1 0 0 2 ...
 $ ns   : int  6 14 21 5 11 21 20 10 7 21 ...
 $ na   : int  12 30 16 17 26 34 23 19 16 26 ...
 $ ss   : int  16 27 16 8 17 25 18 14 13 25 ...

I'll plot 3 ys against 3 xs here:

yvars <- c("SAT", "PPVT", "Raven" )
xvars <- c("n", "s", "ns", "na", "ss")
xvars <- c("n", "s", "ns")  # smaller example
gp <- "SES"

op <- par(mfrow = c(length(yvars), length(xvars)),
          mar = c(4, 4, 1, 1) + .1)
for(y in yvars) {
  for (x in xvars) {
    plot(Rohwer[, x], Rohwer[, y],
         xlab=x, ylab=y)
    abline(lm(Rohwer[, y] ~ Rohwer[, x]))
  }
}
par(op)


But of course, I'd like to eliminate all y labels except in column 1 and all x labels except in the last row, and also be able to enhance the contents of each plot further: different point symbols/colors for groups, separate regression lines, ...

I've looked at base R pairs(), GGally::ggpairs() and lattice::splom() but don't see any way to do what I want.

CodePudding user response:

One option to achieve your result using ggplot2 would be to use GGally::ggmatrix:

library(heplots)

data(Rohwer, package = "heplots")

yvars <- c("SAT", "PPVT", "Raven" )
xvars <- c("n", "s", "ns")
gp <- "SES"

grid <- expand.grid(x = xvars, y = yvars, stringsAsFactors = FALSE)

library(ggplot2)
library(GGally)

# Build one panel: y against x, colored by the grouping variable named in gp
plot_fun <- function(x, y) {
  ggplot(Rohwer, aes(.data[[x]], .data[[y]], color = .data[[gp]])) +
    geom_point() +
    geom_smooth(method = "lm", se = FALSE, formula = y ~ x)
}

p <- Map(plot_fun, grid$x, grid$y)

ggmatrix(p, nrow = 3, ncol = 3, xAxisLabels = xvars, yAxisLabels = yvars)

Or, as a second option, you could convert to long format and use faceting, which IMHO makes it easier to customize your plot further:


library(tidyr)

Rohwer_long <- Rohwer %>%
  pivot_longer(cols = all_of(xvars), names_to = "xvar", values_to = "x") %>%
  pivot_longer(cols = all_of(yvars), names_to = "yvar", values_to = "y")

ggplot(Rohwer_long, aes(x, y, color = SES)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, formula = y ~ x) +
  facet_grid(yvar ~ xvar, scales = "free")
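
If you would rather stay in base graphics, as in the loop from the question, the same idea can probably be pushed a bit further. Below is a rough sketch, not a polished solution: xy_matrix() is a hypothetical helper name I made up (it is not from any package), and it assumes the grouping column has only a handful of levels, each with enough points to fit a line. It keeps the y label only in column 1 and the x label only in the last row, colors points by group, and draws one regression line per group.

```r
# Hypothetical helper (not from any package): base-graphics matrix of y ~ x panels
xy_matrix <- function(data, xvars, yvars, group = NULL) {
  # Grouping factor; a single dummy level when no group column is given
  g <- if (is.null(group)) factor(rep("all", nrow(data))) else factor(data[[group]])
  pal <- seq_len(nlevels(g))          # one color index per group level
  ny <- length(yvars)
  nx <- length(xvars)

  op <- par(mfrow = c(ny, nx), mar = c(4, 4, 1, 1) + 0.1)
  on.exit(par(op))                    # restore graphics settings on exit

  for (i in seq_len(ny)) {
    for (j in seq_len(nx)) {
      x <- data[[xvars[j]]]
      y <- data[[yvars[i]]]
      # Axis labels only in the first column (y) and the last row (x)
      plot(x, y, col = pal[g], pch = 16,
           xlab = if (i == ny) xvars[j] else "",
           ylab = if (j == 1) yvars[i] else "")
      # One least-squares line per group, in that group's color
      for (k in pal) {
        idx <- as.integer(g) == k
        abline(lm(y[idx] ~ x[idx]), col = pal[k])
      }
    }
  }
  invisible(c(rows = ny, cols = nx))  # grid dimensions, returned invisibly
}
```

With the objects defined above, xy_matrix(Rohwer, xvars, yvars, group = gp) should give the 3 x 3 grid. The margins are left the same for every panel so that all panels keep the same size; shrinking the inner margins and moving the labels to the outer margin would be a further refinement.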
