How to give different colors to the lines in a Sankey plot to show different groups?

Time:12-05

I have a question about this Sankey plot in R. I want to give different colors to the lines that connect the source and target nodes, based on the variable group. Below is the code I found on an R resource site. It produces the plot, but all the connecting lines have the same color. How can I color the lines so that each group is shown in its own color?

Thank you! Best AD

# Libraries
library(tidyverse)
library(viridis)
library(patchwork)
library(hrbrthemes)
library(circlize)

# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/13_AdjacencyDirectedWeighted.csv", header=TRUE)
# Package
library(networkD3)

# I need a long format
data_long <- data %>%
  rownames_to_column %>%
  gather(key = 'key', value = 'value', -rowname) %>%
  filter(value > 0)
colnames(data_long) <- c("source", "target", "value")
data_long$target <- paste(data_long$target, " ", sep="")
data_long$group <- c(rep("A", 10), rep("B",7), rep("C", 8), rep("D", 10))

# From these flows we need to create a node data frame: it lists every entity involved in the flows
nodes <- data.frame(name=c(as.character(data_long$source), as.character(data_long$target)) %>% unique())

# With networkD3, connections must be given as 0-based node ids, not as the names used in the links data frame, so we need to reformat them.
data_long$IDsource=match(data_long$source, nodes$name)-1 
data_long$IDtarget=match(data_long$target, nodes$name)-1

# Make the Network
sankeyNetwork(Links = data_long, Nodes = nodes,
              Source = "IDsource", Target = "IDtarget",
              Value = "value", NodeID = "name", 
              sinksRight=FALSE, nodeWidth=40, fontSize=13, nodePadding=20)

CodePudding user response:

Following the example in the networkD3::sankeyNetwork documentation, you can add a grouping variable to the links data and pass its column name to the LinkGroup argument...

# Libraries
library(dplyr)
library(tidyr)
library(tibble)
library(networkD3)

# Load dataset from github
data <- read.table("https://raw.githubusercontent.com/holtzy/data_to_viz/master/Example_dataset/13_AdjacencyDirectedWeighted.csv", header=TRUE)

data_long <- 
  data %>%
  rownames_to_column() %>%
  gather(key = 'key', value = 'value', -rowname) %>%
  filter(value > 0)

colnames(data_long) <- c("source", "target", "value")
data_long$target <- paste(data_long$target, " ", sep="")
data_long$group <- c(rep("A", 10), rep("B",7), rep("C", 8), rep("D", 10))

# From these flows we need to create a node data frame: it lists every entity involved in the flows
nodes <- data.frame(name=c(as.character(data_long$source), as.character(data_long$target)) %>% unique())

# With networkD3, connections must be given as 0-based node ids, not as the names used in the links data frame, so we need to reformat them.
data_long$IDsource=match(data_long$source, nodes$name)-1 
data_long$IDtarget=match(data_long$target, nodes$name)-1


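# Colour links
# LinkGroup takes the name of a column in the Links data frame; links that
# share a value in that column are drawn in the same colour.
# data_long$source already holds each link's source node name, so it can be
# used directly. To colour by the group column instead, use LinkGroup = 'group'.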


# Make the Network
sankeyNetwork(Links = data_long,
              Nodes = nodes,
              Source = "IDsource",
              Target = "IDtarget",
              Value = "value", 
              NodeID = "name", 
              sinksRight=FALSE,
              nodeWidth=40,
              fontSize=13,
              nodePadding=20,
              LinkGroup = 'source')

Created on 2021-12-02 by the reprex package (v2.0.1)
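
To color the links by the group variable from the question and control which color each group gets, something like the following may work. This is a minimal sketch on made-up toy data, not the dataset above: the links, node names, hex colors, and the constant "node" group are all assumptions for illustration. It relies on the LinkGroup, NodeGroup, and colourScale arguments of sankeyNetwork, and on nodes and links sharing one color scale, which is why the nodes are given their own group with a neutral grey.

```r
library(networkD3)

# Toy links (hypothetical data): 0-based node indices plus one group label per link
links <- data.frame(
  IDsource = c(0, 0, 1, 1),
  IDtarget = c(2, 3, 2, 3),
  value    = c(5, 3, 2, 4),
  group    = c("A", "B", "C", "D")
)

# Nodes; a constant node group keeps node colors separate from link colors
nodes <- data.frame(
  name  = c("Africa", "Europe", "Africa ", "Europe "),
  group = "node"
)

# One color per link group, plus grey for the nodes (colors chosen arbitrarily)
colour_scale <- JS(
  'd3.scaleOrdinal()
     .domain(["A", "B", "C", "D", "node"])
     .range(["#1b9e77", "#d95f02", "#7570b3", "#e7298a", "#bbbbbb"])'
)

p <- sankeyNetwork(Links = links, Nodes = nodes,
                   Source = "IDsource", Target = "IDtarget",
                   Value = "value", NodeID = "name",
                   LinkGroup = "group", NodeGroup = "group",
                   colourScale = colour_scale)
p
```

With the data from the question, the same idea would amount to passing LinkGroup = 'group' in the sankeyNetwork call, since data_long already has a group column.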
