Home > Mobile >  Is there a way to add a vertical line to a ridgeline plot that is not a quartile or a fixed % value
Is there a way to add a vertical line to a ridgeline plot that is not a quartile or a fixed % value

Time:08-24

I did an analysis on my fantasy baseball league, where I had every team play every other team's schedules, to determine their schedule variance and who maybe is luckier than other's based on the schedule. I am plotting the number of wins vs each schedule with a ridgeline plot. I would like to add a vertical line, that only is within a team's specific ridge (impacting no other ridge visually), that shows what their actual number of wins is. I have been able to add a point that shows their actual win total, or a vertical line that goes through the entire visual, or a line at the mean or other quartiles, but not what I am looking for.

dput(BoxPlotData)
structure(list(Team = c("Alex", "Alex", "Alex", "Alex", "Alex", 
"Alex", "Alex", "Alex", "Alex", "Alex", "Alex", "Alex", "Allan", 
"Allan", "Allan", "Allan", "Allan", "Allan", "Allan", "Allan", 
"Allan", "Allan", "Allan", "Allan", "Carter", "Carter", "Carter", 
"Carter", "Carter", "Carter", "Carter", "Carter", "Carter", "Carter", 
"Carter", "Carter", "Derek", "Derek", "Derek", "Derek", "Derek", 
"Derek", "Derek", "Derek", "Derek", "Derek", "Derek", "Derek", 
"Eddie", "Eddie", "Eddie", "Eddie", "Eddie", "Eddie", "Eddie", 
"Eddie", "Eddie", "Eddie", "Eddie", "Eddie", "Hammy", "Hammy", 
"Hammy", "Hammy", "Hammy", "Hammy", "Hammy", "Hammy", "Hammy", 
"Hammy", "Hammy", "Hammy", "Jared", "Jared", "Jared", "Jared", 
"Jared", "Jared", "Jared", "Jared", "Jared", "Jared", "Jared", 
"Jared", "Josh", "Josh", "Josh", "Josh", "Josh", "Josh", "Josh", 
"Josh", "Josh", "Josh", "Josh", "Josh", "Niska", "Niska", "Niska", 
"Niska", "Niska", "Niska", "Niska", "Niska", "Niska", "Niska", 
"Niska", "Niska", "RyanC", "RyanC", "RyanC", "RyanC", "RyanC", 
"RyanC", "RyanC", "RyanC", "RyanC", "RyanC", "RyanC", "RyanC", 
"RyanD", "RyanD", "RyanD", "RyanD", "RyanD", "RyanD", "RyanD", 
"RyanD", "RyanD", "RyanD", "RyanD", "RyanD", "Urby", "Urby", 
"Urby", "Urby", "Urby", "Urby", "Urby", "Urby", "Urby", "Urby", 
"Urby", "Urby"), WinTotal = c(9, 10, 8, 6, 5, 9, 11, 8, 11, 9, 
8, 8, 14, 13, 14, 12, 12, 15, 15, 13, 14, 12, 12, 16, 1, 1, 1, 
1, 2, 2, 4, 3, 0, 2, 5, 0, 8, 9, 7, 6, 6, 8, 11, 5, 10, 7, 4, 
5, 8, 5, 6, 9, 7, 6, 11, 8, 9, 4, 7, 7, 12, 8, 9, 8, 6, 8, 11, 
9, 8, 8, 7, 9, 10, 7, 7, 10, 6, 9, 9, 8, 9, 6, 5, 10, 8, 12, 
7, 11, 7, 10, 14, 9, 13, 9, 9, 12, 11, 14, 14, 11, 11, 11, 14, 
9, 15, 12, 10, 13, 13, 12, 10, 16, 13, 16, 14, 12, 12, 14, 12, 
15, 6, 4, 4, 3, 5, 5, 8, 3, 3, 2, 4, 7, 13, 13, 13, 13, 13, 14, 
16, 13, 14, 14, 11, 13), W.x = c(9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 
13L, 13L, 13L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 15L, 15L, 15L, 
15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 14L, 14L, 14L, 14L, 
14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 
13L, 13L, 13L, 13L), W.y = c(9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 
13L, 13L, 13L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 15L, 15L, 15L, 
15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 14L, 14L, 14L, 14L, 
14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 
13L, 13L, 13L, 13L)), row.names = c(NA, -144L), class = "data.frame")


ggplot(BoxPlotData, aes(x = WinTotal,y = Team, fill = Team)) 
  geom_density_ridges(scale=1) 
  #facet_wrap(~Team) 
  theme_ridges() 
  theme(legend.position = "none") 
  #geom_vline(aes(xintercept=W,linetype = Team), data = ActualWins)
  geom_point(shape=18,size = 4,data = BoxPlotData, aes(x=W,y=Team,fill="black"))

I have commented out a couple attempts to do display it while doing facet_wrap - unhelpful because the ridges stay where they were on the large visual; geom_vline - lines go through the entire set of data.

CodePudding user response:

It is actually possible to do this all with some data manipulation. Pre-calculate the densities and use geom_ridgeline:

BoxPlotData %>%
  group_by(Team) %>%
  summarize(dens = density(WinTotal, from = -2, to = 20, bw = 1, n = 441)$y,
            WinTotal = density(WinTotal, from = -2, to = 20, n = 441)$x,
            is.actual = WinTotal %in% W.x, .groups = "drop") %>%
  mutate(yval = as.numeric(as.factor(Team))) %>%
  ggplot(aes(x = WinTotal, y = Team, fill = Team))  
  geom_ridgeline(aes(height = dens), scale = 3)  
  theme_ridges()  
  theme(legend.position = "none")  
  geom_segment(data = . %>% filter(is.actual),
               aes(y = yval, yend = yval   3 * dens, xend = WinTotal))

enter image description here

CodePudding user response:

It is a bit tricky, but what you can do is collect the data behind your first plot without vertical lines using ggplot_build which gives you information about the density per team. The position per line is "W.x" per team and the height of the line can determined to get the max density per team of your geom_density_ridges call. Instead of using geom_vline, you could use geom_segment to create a line per team. Here is a reproducible example:

library(ggplot2)
library(ggridges)
library(dplyr)
# Create dataframe with only one value per team and give each team a number
ActualWins <- BoxPlotData %>% distinct(Team, W.x) %>% mutate(number = row_number())

# Create first the plot to save the data
p <- ggplot(BoxPlotData, aes(x = WinTotal,y = Team, fill = Team)) 
  geom_density_ridges(scale=1) 
  theme_ridges() 
  theme(legend.position = "none") 
  geom_point(shape=18,size = 4,data = BoxPlotData, aes(x=W.x,y=Team,fill="black")) 

# Collect data
q <- ggplot_build(p)$data[[1]]
#> Picking joint bandwidth of 0.803

# Select the highest points which are the most wins
density_lines <- q %>%
  group_by(group) %>% 
  filter(density == max(density)) %>% 
  ungroup()

# Join data with Actualwins
density_lines_complete <- left_join(density_lines, ActualWins, by = c("group" = "number"))

# Create Plot with point
ggplot(BoxPlotData, aes(x = WinTotal,y = Team, fill = Team)) 
  geom_density_ridges(scale=1) 
  theme_ridges() 
  theme(legend.position = "none") 
  geom_segment(data = density_lines_complete, aes(x = W.x, xend = W.x, y = ymin, yend = ymin density*scale*iscale), color = "red")  
  geom_point(shape=18,size = 4,data = BoxPlotData, aes(x=W.x,y=Team,fill="black"))
#> Picking joint bandwidth of 0.803


# Create plot without points
ggplot(BoxPlotData, aes(x = WinTotal,y = Team, fill = Team)) 
  geom_density_ridges(scale=1) 
  theme_ridges() 
  theme(legend.position = "none") 
  geom_segment(data = density_lines_complete, aes(x = W.x, xend = W.x, y = ymin, yend = ymin density*scale*iscale), color = "red") 
#> Picking joint bandwidth of 0.803

Created on 2022-08-23 with reprex v2.0.2

  • Related