R ridgeline plot with multiple PDFs overlaid at the same level

Time:10-25

How can I create a ridgeline plot where multiple densities are overlaid at the same ordinate and distinguished by color?

My real dataset already contains the PDF values. For a minimal reproducible example, I generate distributions and extract their PDFs in a helper function. The data frame has a model name (for grouping), x values paired with PDF ordinates, and an id field that separates the different ridgeline levels (i.e., the ridgeline y axis).

Make example dataframe

makedfs <- function(name, id, mu, sig) {
  # Draw log-normal samples and estimate their density
  vals <- exp(rnorm(1000, mean = mu, sd = sig))
  pdf <- density(vals)
  # One row per density grid point, tagged with model name and ridgeline level
  df <- data.frame(name = name, id = id, x = pdf$x, pdf = pdf$y)

  return(df)
}

df1 <- makedfs("model1", 0, log(1), 1)
df2 <- makedfs("model2", 0, log(0.5), 2)
df3 <- makedfs("model1", 1, log(0.2), 0.8)
df4 <- makedfs("model2", 1, log(1), 1)

df <- rbind(df1, df2, df3, df4)
head(df,5)

    name id          x          pdf
1 model1  0 -0.6541933 0.0003544569
2 model1  0 -0.5999428 0.0007800386
3 model1  0 -0.5456924 0.0016274229
4 model1  0 -0.4914420 0.0032231582
5 model1  0 -0.4371915 0.0060682580

A quick plot for the first two models looks like this:

plot(df1$x, df1$pdf, type ="l", col=1, xlim=c(-6,6), xlab = "x", ylab = "pdf")
lines(df2$x, df2$pdf, col=2)
legend("topleft", c("df1", "df2"), col = 1:2, lty = 1)


Ridgeline not working

I expected to see the above curves at y=0 on this ridgeline plot, but there is something wrong with the lines and fills for all PDF curves.

library(ggplot2)
library(ggridges)  # provides geom_ridgeline()

p <- ggplot(df, aes(x = x, y = id, height = pdf, group = name, fill = name)) +
  geom_ridgeline(alpha = 0.5, scale = 1) +
  scale_y_continuous(limits = c(0, 5)) +
  scale_x_continuous(limits = c(-6, 6))


How can I produce the expected ridgeline plot?

Answer:

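IMHO the issue is the grouping. With group = name, each model's curves at id = 0 and id = 1 are treated as one group, so the lines and fills run across both levels. Instead of grouping by name alone, you have to group by both name and id, e.g. using interaction: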

set.seed(123)

library(ggplot2)
library(ggridges)

ggplot(df, aes(x = x, y = id, height = pdf, group = interaction(name, id), fill = name)) +
  geom_ridgeline(alpha = 0.5, scale = .5) +
  scale_y_continuous(limits = c(0, 5)) +
  scale_x_continuous(limits = c(-6, 6))
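
If you want to see why the grouping matters without plotting anything, here is a small base-R check. It is only a sketch on made-up toy data with the same columns as df (the name toy and all the values in it are hypothetical, not taken from the question). It counts how many separate curves each grouping yields:

```r
# Toy stand-in for df: 2 models x 2 ridgeline levels, 3 points per curve.
# All values are invented for illustration.
toy <- data.frame(
  name = rep(c("model1", "model2"), times = 2, each = 3),
  id   = rep(c(0, 1), each = 6),
  x    = rep(c(-1, 0, 1), times = 4),
  pdf  = c(0.10, 0.40, 0.10, 0.20, 0.30, 0.20,
           0.05, 0.50, 0.05, 0.15, 0.35, 0.15)
)

# Grouping by name alone: one group per model, each spanning both levels.
groups_by_name <- nlevels(factor(toy$name))

# Grouping by name and id: one group per curve.
groups_by_both <- nlevels(interaction(toy$name, toy$id, drop = TRUE))

# Number of distinct ridgeline levels inside each name-only group.
levels_per_name <- tapply(toy$id, toy$name, function(v) length(unique(v)))

print(groups_by_name)
print(groups_by_both)
print(levels_per_name)
```

Grouping by name gives two groups that each contain points from two levels, whereas interaction(name, id) gives four groups, one per curve, which is what the ridgeline geom needs in order to draw each density separately.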
