I like to reshape a time series data from long to wide format , with columns such as StartTime
and StopTime
. All variables measured during the same time interval (StartTime
, StopTime
) to be in the same line.
For example if this is my dataset
Id Time Status Col1
10 2012 4 2
11 2009 2 5
11 2010 2 5
12 2004 2 2
12 2009 2 3
12 2011 2 1
12 2018 2 3
17 2018 2 3
17 2020 2 1
Expecting a dataset like this
Id From To Status Col1
10 2012 2012 4 2
11 2009 2010 2 5
12 2004 2009 2 2
12 2009 2011 2 3
12 2011 2018 2 1
12 2018 2018 2 3
17 2018 2020 2 3
17 2020 2020 2 1
Thanks in advance for the help.
CodePudding user response:
An option would be to create a lead
column after grouping by 'Id'
library(dplyr)
df1 %>%
group_by(Id) %>% mutate(To = if(n() == 1) Time else
lead(Time, default = last(Time)), .before = Status) %>%
ungroup %>%
rename(From = Time) %>%
filter(!is.na(To))
CodePudding user response:
I do not understand why in id 12, there is no 2018-2018.
df %>%
group_by(Id)%>%
mutate(From = Time,To = lead(Time, def = last(Time)),.after=Id, Time = NULL)
# A tibble: 9 × 5
# Groups: Id [4]
Id From To Status Col1
<int> <int> <int> <int> <int>
1 10 2012 2012 4 2
2 11 2009 2010 2 5
3 11 2010 2010 2 5
4 12 2004 2009 2 2
5 12 2009 2011 2 3
6 12 2011 2018 2 1
7 12 2018 2018 2 3
8 17 2018 2020 2 3
9 17 2020 2020 2 1