I have a dataframe of migration data from different countries to Mexico. I want to represent it in a Sankey diagram, per year. I am new to R and am having a difficult time producing it. Can someone please help me achieve this? The destination is always Mexico, so all the flows go to a single point.
df
Country 2013 2014 2015 2016 Dest
UK 1200 1200 1207 1400 Mexico
China 630 700 800 940 Mexico
Canada 1000 1000 950 920 Mexico
Brazil 820 670 550 230 Mexico
France 400 200 700 700 Mexico
Australia 440 350 340 780 Mexico
CodePudding user response:
Here is a different way to visualise your data: we use plotly to create a dynamic animation of the flow of migrants into Mexico over time.
# dplyr provides %>%, rename(), and mutate(); tidyr provides pivot_longer()
library(dplyr)
library(tidyr)
# Need to reshape data from wide to long and prepare data for plotly
data_long <- data %>%
    rename(source = Country, target = Dest) %>%
    pivot_longer(matches("\\d{4}"), names_to = "year") %>%
    pivot_longer(c(source, target), values_to = "country") %>%
    mutate(node_id = as.factor(country))
# Plotly Sankey diagrams need a link list that gives 0-based indices
# (instead of node names)
link_list <- data_long %>%
    select(-country) %>%
    mutate(node_id = as.integer(node_id) - 1) %>%
    pivot_wider(names_from = "name", values_from = "node_id") %>%
    arrange(year) %>%
    as.list()
# Now we're ready to plot
library(plotly)
plot_ly(
    type = "sankey",
    orientation = "h",
    node = list(
        label = levels(data_long$node_id),
        pad = 15,
        thickness = 20),
    link = link_list,
    frame = ~link_list$year) %>%
    animation_slider(currentvalue = list(prefix = "Year: "))
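If a static diagram for one year is preferred over the animation, something along these lines might work. This is only a sketch: `flows` and `sankey_for_year()` are hypothetical names introduced here (they are not part of plotly), and the sample data is rebuilt inline so the snippet runs on its own.

```r
library(plotly)

# Same sample data as in this answer, rebuilt so the sketch is self-contained.
flows <- read.table(text = "Country 2013 2014 2015 2016 Dest
UK 1200 1200 1207 1400 Mexico
China 630 700 800 940 Mexico
Canada 1000 1000 950 920 Mexico
Brazil 820 670 550 230 Mexico
France 400 200 700 700 Mexico
Australia 440 350 340 780 Mexico",
  header = TRUE, check.names = FALSE, stringsAsFactors = FALSE)

# Hypothetical helper: build one static Sankey diagram for a single year.
sankey_for_year <- function(flows, year) {
  # All node names (origins plus destination), in a fixed order.
  nodes <- sort(unique(c(flows$Country, flows$Dest)))
  plot_ly(
    type = "sankey",
    orientation = "h",
    node = list(label = nodes, pad = 15, thickness = 20),
    link = list(
      # plotly expects 0-based positions into `nodes`, not names.
      source = match(flows$Country, nodes) - 1,
      target = match(flows$Dest, nodes) - 1,
      value  = flows[[as.character(year)]]
    )
  ) %>%
    layout(title = paste("Migration to Mexico,", year))
}

# Build the diagram for 2016; print `p` to display it.
p <- sankey_for_year(flows, 2016)
```

Calling `sankey_for_year(flows, 2013)` and so on gives one diagram per year, which may be closer to what the question asks for.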
Sample data (define this before running the code above)
data <- read.table(text = "Country 2013 2014 2015 2016 Dest
UK 1200 1200 1207 1400 Mexico
China 630 700 800 940 Mexico
Canada 1000 1000 950 920 Mexico
Brazil 820 670 550 230 Mexico
France 400 200 700 700 Mexico
Australia 440 350 340 780 Mexico", header = TRUE, check.names = FALSE)
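To make the 0-based node indices less mysterious, here is a small base R sketch of how country names map to indices once they are turned into a factor. The names `countries` and `node_index` are made up for this illustration.

```r
# Node names from the sample data: six origin countries plus the destination.
countries <- c("UK", "China", "Canada", "Brazil", "France", "Australia", "Mexico")

# as.factor() sorts the distinct names into levels.
node_id <- as.factor(countries)

# Subtracting 1 from the level positions gives the 0-based indices plotly expects.
node_index <- data.frame(
  label = levels(node_id),
  index = seq_along(levels(node_id)) - 1,
  stringsAsFactors = FALSE
)
node_index
```

In an English or C locale the levels come out alphabetically, so Mexico ends up at index 5 and every link's target is 5.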