Heatmap with ggbump plots - color coding number of paths taken

Time:01-06

Good morning. The aim of this plotting exercise is to make a [ggbump-style plot](https://github.com/davidsjoberg/ggbump), but with many people travelling along routes from the entry to the exit nodes. For example, imagine each node (a row in the plot) is a city, and we want to see how many people take each of the possible paths from the entry to the exit nodes. This would be a modification of the ggbump figure, with darker colours for the paths taken more often.

The following data give an example: imagine there are 10 possible cities (integers 1:10) and each person makes 4 stops along the way (columns 2:5). Column 1 is the user ID.

With 100 people (rows), the data would look as follows. I have used only 5 possible paths to reduce the computational complexity, thanks to a comment below by @tjebo.

set.seed(123)  # set seed to ensure reproducibility
df <- data.frame(matrix(ncol = 5, nrow = 100))
# assign column names to the data frame
colnames(df) <- c("User ID", "Stop 1", "Stop 2", "Stop 3", "Stop 4")

# create a list of vectors representing the 5 possible paths thanks to @tjebo's comment
paths <- list(c(1, 2, 3, 4), c(4, 4, 4, 3), c(3, 3, 2, 4), c(1, 3, 4, 2), c(1, 4, 2, 1))

# create a vector of frequencies for each path
path_frequencies <- c(50, 30, 10, 5, 5)

# loop through each row of the data frame
for (i in 1:nrow(df)) {
  # assign the user ID for the current row
  df[i, "User ID"] <- i
  # generate a random sample of integers from 1 to 5 based on the path frequencies
  sample <- sample(1:5, size = 1, replace = TRUE, prob = path_frequencies / sum(path_frequencies))
  # select the path corresponding to the chosen integer
  path <- paths[[sample]]
  # assign the path to the remaining columns of the current row
  df[i, 2:5] <- path
}
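The row-by-row loop above can also be written without a loop. The following is a minimal sketch, assuming the same `paths` and `path_frequencies`; the names `n_users`, `path_index`, `stops` and `df_vec` are introduced here for illustration only, and the draws are not guaranteed to match the loop version exactly.

```r
set.seed(123)

paths <- list(c(1, 2, 3, 4), c(4, 4, 4, 3), c(3, 3, 2, 4), c(1, 3, 4, 2), c(1, 4, 2, 1))
path_frequencies <- c(50, 30, 10, 5, 5)
n_users <- 100  # hypothetical name for the number of rows

# draw one path index per user in a single call
path_index <- sample(seq_along(paths), size = n_users, replace = TRUE,
                     prob = path_frequencies / sum(path_frequencies))

# stack the chosen paths into an n_users x 4 matrix
stops <- do.call(rbind, paths[path_index])

# assemble the data frame with the same column names as above
df_vec <- data.frame(seq_len(n_users), stops)
colnames(df_vec) <- c("User ID", "Stop 1", "Stop 2", "Stop 3", "Stop 4")
```

Sampling the indices once and binding the selected vectors avoids filling the data frame one row at a time, which matters more as the number of users grows.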

Then we would need to modify the following ggbump code so that aes() draws darker / thicker lines based on how many people took each route (similar to lavaan path plots). Depending on the solution, the data might need reshaping before plotting (group_by, then mutate).

# pseudocode, not runnable: the aes() arguments are placeholders
ggplot(df, aes(each stop, city (1:10), color = density function for path usage)) +
    geom_bump()

Perhaps one could write a separate function to count how many times each possible path occurs, then pass this count to ggplot to control the line opacity?
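The counting idea can be sketched in base R roughly as follows. This is only an illustration: `count_paths` is a hypothetical helper name, and the small `stops_demo` data frame is made up for the example.

```r
# Hypothetical helper: count how often each distinct path (row) occurs
count_paths <- function(stops) {
  # collapse each row's stops into a single key such as "1-2-3-4"
  key <- do.call(paste, c(stops, sep = "-"))
  # tabulate the keys; the count column is named n
  as.data.frame(table(path = key), responseName = "n", stringsAsFactors = FALSE)
}

# made-up example: two users take path 1-2-3-4, one takes 4-4-4-3
stops_demo <- data.frame(
  Stop1 = c(1, 1, 4),
  Stop2 = c(2, 2, 4),
  Stop3 = c(3, 3, 4),
  Stop4 = c(4, 4, 3)
)
path_counts <- count_paths(stops_demo)
```

The resulting `n` column is the kind of per-path count that could then be mapped to colour, line width or opacity.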

Kind regards, Conal

CodePudding user response:

Actually, I don't think you were far off. It's not that difficult: you need to count your unique combinations (see below for one very easy way), and then you can use this count as a continuous aesthetic. In the example below, both line width and color represent the count.

Other comments in the code.

library(tidyverse)
library(ggbump)

set.seed(123)  
df <- data.frame(matrix(ncol = 5, nrow = 100))

## I've changed the names to make coding easier
colnames(df) <- c("UserID", paste0("Stop", 1:4))
paths <- list(c(1, 2, 3, 4), c(4, 4, 4, 3), c(3, 3, 2, 4), c(1, 3, 4, 2), c(1, 4, 2, 1))
path_frequencies <- c(50, 30, 10, 5, 5)

for (i in 1:nrow(df)) {
  df[i, "UserID"] <- i
  sample <- sample(1:5, size = 1, replace = TRUE, prob = path_frequencies / sum(path_frequencies))
  path <- paths[[sample]]
  df[i, 2:5] <- path
}
## You need to reshape to long format, but first count each unique
## combination of stops (here the count n also serves as the path identifier)
## you can then use this count as a continuous aesthetic for line width, color, etc.
df_long <- 
  df %>% 
  select(-UserID) %>%
  count(across(starts_with("stop"))) %>%
  tidyr::pivot_longer(- n) 
  
## I am using n as both line width and color aesthetic
## I find color easier to understand/quantify than line width. 
## you also need to specify the group
ggplot(df_long, aes(name, value, color = n, group = n)) +
  ## use the count as width
  geom_bump(aes(linewidth = n)) +
  ## just a random palette which I like in this case; there are plenty of others
  scico::scale_colour_scico(palette = "acton", direction = -1)

Created on 2023-01-04 with reprex v2.0.2
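One caveat with `group = n`: if two different paths happened to have the same count, they would fall into the same group and be drawn as one line. A possible variation, sketched here on made-up toy data, gives each unique path its own identifier and groups by that instead. The names `df_demo`, `long_demo`, `path_id`, `stop` and `city` are assumptions introduced for this example, not part of the answer above.

```r
library(dplyr)
library(tidyr)
library(ggplot2)
library(ggbump)

# Hypothetical toy data: two of the three paths are each taken by exactly 2 users
df_demo <- data.frame(
  UserID = 1:5,
  Stop1 = c(1, 1, 4, 4, 3),
  Stop2 = c(2, 2, 4, 4, 3),
  Stop3 = c(3, 3, 4, 4, 2),
  Stop4 = c(4, 4, 3, 3, 4)
)

long_demo <- df_demo %>%
  select(-UserID) %>%
  # one row per unique path, with its count n
  count(across(starts_with("Stop"))) %>%
  # explicit identifier per path, independent of the count
  mutate(path_id = row_number()) %>%
  pivot_longer(starts_with("Stop"), names_to = "stop", values_to = "city")

# group by path_id so paths with equal counts stay separate lines
p <- ggplot(long_demo, aes(stop, city, colour = n, group = path_id)) +
  geom_bump(aes(linewidth = n))
```

Printing `p` draws the plot; colour and line width still encode the count, while the grouping no longer depends on the counts being unique.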
