I am making a Sankey diagram with ggsankey.
Here is my dataset:
library(ggsankey)
library(tidyverse)

df <-
  mtcars %>%
  make_long(cyl, vs, am, gear, carb) %>%
  mutate(color = c(rep("red", 80), rep("blue", 80)))
You can obtain a Sankey diagram like this:
df %>%
  ggplot(aes(x = x,
             next_x = next_x,
             node = node,
             next_node = next_node,
             fill = factor(node),
             label = factor(node))) +
  geom_sankey() +
  geom_sankey(flow.alpha = .6,
              node.color = "gray30") +
  geom_sankey_label(size = 3, color = "white", fill = "gray40") +
  scale_fill_viridis_d() +
  theme_sankey(base_size = 18) +
  labs(x = NULL) +
  theme(legend.position = "none",
        plot.title = element_text(hjust = .5))
Now I want to color the flows between the labels by the color column of df. Is that possible? If not, do you know of any other way to do it in R?
I tried:
df %>%
  ggplot(aes(x = x,
             next_x = next_x,
             node = node,
             next_node = next_node,
             fill = factor(color),
             label = factor(node))) +
  geom_sankey() +
  geom_sankey(flow.alpha = .6,
              node.color = "gray30") +
  geom_sankey_label(size = 3, color = "white", fill = "gray40") +
  scale_fill_viridis_d() +
  theme_sankey(base_size = 18) +
  labs(x = NULL) +
  theme(legend.position = "none",
        plot.title = element_text(hjust = .5))
But the plot seems totally broken.
Answer:
In the end, ggalluvial seems better suited to my problem.
Here is the data formatting:
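library(ggalluvial)  # provides stat_alluvium() and geom_stratum() used below

df <-
  mtcars %>%
  select(cyl, vs, am, gear, carb) %>%
  # first half of the cars is tagged "red", second half "blue"
  mutate(color = c(rep("red", nrow(mtcars)/2), rep("blue", nrow(mtcars)/2)),
         id = seq_len(nrow(mtcars))) %>%
  # one row per car (id) and variable
  pivot_longer(cols = !c(color, id),
               names_to = "var",
               values_to = "state")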
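As a quick sanity check on the reshaping above, here is a minimal sketch that rebuilds the same long table and confirms that each car appears once per variable and keeps a single color. The names `long` and `per_car` are only illustrative and are not part of the original code.

```r
library(dplyr)
library(tidyr)

# Rebuild the long table from the answer (hypothetical name: long)
long <- mtcars %>%
  select(cyl, vs, am, gear, carb) %>%
  mutate(color = c(rep("red", nrow(mtcars) / 2), rep("blue", nrow(mtcars) / 2)),
         id = seq_len(nrow(mtcars))) %>%
  pivot_longer(cols = !c(color, id),
               names_to = "var",
               values_to = "state")

# For every car: how many variables it spans and how many colors it carries
per_car <- long %>%
  group_by(id) %>%
  summarise(n_vars = n_distinct(var),
            n_colors = n_distinct(color))

# Each alluvium (id) should cross all five variables with exactly one color
all(per_car$n_vars == 5)
all(per_car$n_colors == 1)
```

If either check returned FALSE, a fill mapped to `color` would not be constant along an alluvium.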
And here is the plot with the correct flow colors:
df %>%
  ggplot(aes(x = var,
             stratum = state,
             label = state,
             alluvium = id)) +
  stat_alluvium(aes(fill = color),
                width = 0,
                alpha = 1,
                geom = "flow") +
  geom_stratum(width = 0.2) +
  geom_text(stat = "stratum", size = 5, angle = 90) +
  theme_bw()
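One caveat: mapping `fill = color` treats "red" and "blue" as category labels, so ggplot2 fills the flows from its default discrete palette rather than with literal red and blue. If the column is meant to hold actual color names, a possible variant (a sketch, not tested against every ggalluvial version) adds `scale_fill_identity()` from ggplot2. The names `long` and `p` are illustrative.

```r
library(tidyverse)
library(ggalluvial)

# Same long table as in the answer (hypothetical name: long)
long <- mtcars %>%
  select(cyl, vs, am, gear, carb) %>%
  mutate(color = c(rep("red", nrow(mtcars) / 2), rep("blue", nrow(mtcars) / 2)),
         id = seq_len(nrow(mtcars))) %>%
  pivot_longer(cols = !c(color, id),
               names_to = "var",
               values_to = "state")

p <- ggplot(long, aes(x = var,
                      stratum = state,
                      label = state,
                      alluvium = id)) +
  stat_alluvium(aes(fill = color),
                width = 0,
                alpha = 1,
                geom = "flow") +
  geom_stratum(width = 0.2) +
  geom_text(stat = "stratum", size = 5, angle = 90) +
  # use the values in `color` directly as fill colors
  scale_fill_identity() +
  theme_bw()

p
```

`scale_fill_manual(values = c(red = "red", blue = "blue"))` would be an alternative if a legend is wanted.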