These sample dataframes have data on the two segments of a piecewise regression line. In "df", segment_start is the beginning of the segment, and segment_end is the end of the segment. There is a data point for years 2000-2010. "value" is the slope of the segment. The first segment goes from 2000-2006 and the second from 2007-2010. What I'm trying to do is combine the rows for "obs" 1 into a single row, with all data points, 2000-2010, included, as shown in "df2". Is there a code to automate this process? Thank you for any ideas in advance.
#data I have:
df <- tibble("obs" = 1:1,
"segment"=c(1,2),
"segment_start"=c(2000,2006),
"segment_end"=c(2006, 2010),
"value"=c(0.5, 1.5))
df
#data I want:
df2 <- tibble("obs"=1,
"2000"=0.5,
"2001"=0.5,
"2002"=0.5,
"2003"=0.5,
"2004"=0.5,
"2005"=0.5,
"2006"=0.5,
"2007"=1.5,
"2008"=1.5,
"2009"=1.5,
"2010"=1.5)
df2
CodePudding user response:
Assuming you have non-overlapping regions such as
df <- tibble("obs" = 1:1,
"segment"=c(1,2),
"segment_start"=c(2000, 2007),
"segment_end"=c(2006, 2010),
"value"=c(0.5, 1.5))
Then you can use rowise
with mutate()
to create a sequence for each row. Then ou use tidyr::unnest_longer
to expand those into rows. Finally you use tidyr::pivot_wider
to turn those rows into columns
df %>%
rowwise() %>%
mutate(year = list(segment_start:segment_end)) %>%
unnest_longer(year) %>%
pivot_wider(obs, names_from=year, values_from=value)
It's just important to be consistent about the start/ends and whether or not each interval end value is open or closed.
CodePudding user response:
With the original dataframe, the expected output can be done with complete
fill
pivot_wider
.
library(tidyverse)
df %>%
complete(segment_start = full_seq(c(min(segment_start), max(segment_end)), 1)) %>%
fill(obs, segment, value) %>%
pivot_wider(-c(segment, segment_end), names_from = segment_start)
# A tibble: 1 x 12
obs `2000` `2001` `2002` `2003` `2004` `2005` `2006` `2007` `2008` `2009` `2010`
<int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 0.5 0.5 0.5 0.5 0.5 0.5 1.5 1.5 1.5 1.5 1.5