Consider this dataframe:
data <- data.frame(ID = rep(1, 6),
Loc = c("A","B","D","A","D","B"),
TimeDiff = c(NA, 4.5,2.2,2.1,3.4,7.2))
All rows share the same ID, with observations at multiple locations (Loc). The observations are arranged in the order in which they occurred, so the first observation was at Loc == A, the second was at Loc == B, and so on. TimeDiff is the time period between each pair of consecutive observations. I made the following plot to show the "path" between the Locs over time:
library(tidyverse)
data %>%
  mutate(RowNumber = row_number(), xend = lead(Loc), yend = lead(RowNumber)) %>%
  ggplot() +
  geom_segment(aes(x = Loc, y = RowNumber, xend = xend, yend = yend), arrow = arrow(), size = 2)
My main question: how can we weight the size of each arrow according to the variable TimeDiff, and how can we label each arrow with the respective value of TimeDiff? That is, the arrow connecting the first two observations, where Loc == A and Loc == B, should be thicker than the arrow that follows because there was a greater TimeDiff (4.5) between those two observations.
A side question: notice that the 3 levels of Loc are A, B, and D. Assume there is another level, C, that I want to include in the plot between B and D. How can this be added?
CodePudding user response:
Here is a possible solution with slightly modified data. The drawback is that nudge_x and nudge_y have to be hard-coded:
# modified data: NA replaced by 0 and the last value replaced by NA,
# as we only have 5 differences in 6 data points
data <- data.frame(ID = rep(1, 6),
                   Loc = c("A","B","D","A","D","B"),
                   TimeDiff = c(0, 4.5,2.2,2.1,3.4,NA))
library(tidyverse)
data %>%
  mutate(RowNumber = row_number(), xend = lead(Loc), yend = lead(RowNumber)) %>%
  ggplot() +
  geom_segment(aes(x = Loc, y = RowNumber, xend = xend, yend = yend),
               arrow = arrow(), size = data$TimeDiff) +
  geom_label(aes(x = Loc, y = RowNumber, xend = xend, yend = yend, label = data$TimeDiff),
             nudge_x = c(0.3, 0.5, -1, 1, -0.7),
             nudge_y = seq(0.2,6, 0.1))
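One way to avoid the hard-coded nudges might be to compute the midpoint of each arrow and place the label there. The following is only a sketch under a few assumptions: it uses the original data from the question, it treats each TimeDiff as belonging to the arrow that arrives at that row (hence lead(TimeDiff)), and it maps Loc to integer positions so that midpoints can be calculated. The names loc_levels, segs, diff_next, xmid and ymid are made up for illustration. It keeps the size aesthetic used elsewhere in this thread; newer ggplot2 releases prefer linewidth for lines.

```r
library(dplyr)
library(ggplot2)

data <- data.frame(ID = rep(1, 6),
                   Loc = c("A","B","D","A","D","B"),
                   TimeDiff = c(NA, 4.5, 2.2, 2.1, 3.4, 7.2))

# Assumption: locations are placed on the x axis in alphabetical order
loc_levels <- sort(unique(data$Loc))

segs <- data %>%
  mutate(RowNumber = row_number(),
         x = match(Loc, loc_levels),      # numeric x position of each Loc
         xend = lead(x),
         yend = lead(RowNumber),
         diff_next = lead(TimeDiff),      # time until the next observation
         xmid = (x + xend) / 2,           # midpoint of the arrow
         ymid = (RowNumber + yend) / 2) %>%
  filter(!is.na(xend))                    # the last row starts no arrow

p <- ggplot(segs) +
  geom_segment(aes(x = x, y = RowNumber, xend = xend, yend = yend,
                   size = diff_next),
               arrow = arrow()) +
  geom_label(aes(x = xmid, y = ymid, label = diff_next)) +
  scale_size_identity() +
  scale_x_continuous(breaks = seq_along(loc_levels), labels = loc_levels) +
  labs(x = "Loc")

print(p)
```

Because the labels sit at the computed midpoints, no per-label offsets are needed, although labels can still overlap thick arrows.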
CodePudding user response:
Well, not exactly a pretty figure, but hopefully this is enough to get you started. The key is to put size inside of the aes and link it to one of your variables using scale_size_identity().
So in this case, instead of having size = 2 for all segments, size is controlled by the values in TimeDiff, e.g. 4.5, 2.2, etc. Note I replaced the NA with 0 for the size call, and with "NA" for the label.
library(tidyverse)
dat <- data.frame(ID = rep(1, 6),
                  Loc = c("A","B","D","A","D","B"),
                  TimeDiff = c(NA, 4.5,2.2,2.1,3.4,7.2)) %>%
  mutate(RowNumber = row_number(),
         xend = lead(Loc),
         yend = lead(RowNumber))
dat %>%
  ggplot(aes(x = Loc, y = RowNumber, xend = xend, yend = yend)) +
  # size is mapped inside aes(); NA becomes 0 so the first segment has no width
  geom_segment(aes(size = replace_na(TimeDiff, 0)), arrow = arrow()) +
  # convert to character first so that the string "NA" is a valid replacement
  geom_label(aes(label = replace_na(as.character(TimeDiff), "NA"))) +
  scale_size_identity()
#> Warning: Removed 1 rows containing missing values (geom_segment).
Created on 2021-09-20 by the reprex package (v2.0.0)
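The side question about including a level C between B and D is not covered by the code above. One possible approach, sketched here as an assumption rather than a tested recommendation, is to convert Loc to a factor that lists C among its levels and to ask the discrete x scale not to drop unused levels via scale_x_discrete(drop = FALSE). The object names dat_c and p_c are made up for illustration.

```r
library(dplyr)
library(ggplot2)

dat_c <- data.frame(ID = rep(1, 6),
                    Loc = c("A","B","D","A","D","B"),
                    TimeDiff = c(NA, 4.5, 2.2, 2.1, 3.4, 7.2)) %>%
  mutate(Loc = factor(Loc, levels = c("A", "B", "C", "D")),  # C has no rows
         RowNumber = row_number(),
         xend = lead(Loc),
         yend = lead(RowNumber))

p_c <- ggplot(dat_c, aes(x = Loc, y = RowNumber, xend = xend, yend = yend)) +
  geom_segment(arrow = arrow()) +
  scale_x_discrete(drop = FALSE)   # keep the unused level C on the axis

print(p_c)
```

As in the earlier plots, the last row has no xend, so ggplot2 is expected to drop it with a warning.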