I'm probably missing something obvious, but I can't work out how to use pivot_wider to spread long-format columns wider
without losing some important columns that I don't want spread.
Toy data
library(tidyverse)

# Two subjects, three weekly measurements each
df <- tibble(id = factor(rep(1:2, each = 3)),
             gender = factor(rep(c("male", "female"), each = 3)),
             age = rep(c(45, 32), each = 3),
             time = factor(rep(paste0("week", 1:3), times = 2)),
             out1 = rnorm(6),
             out2 = factor(sample(letters[1:3], size = 6, replace = TRUE)))
df
# output
# A tibble: 6 x 6
  id    gender   age time     out1 out2
  <fct> <fct>  <dbl> <fct>   <dbl> <fct>
1 1     male      45 week1 -1.23   c
2 1     male      45 week2 -0.913  c
3 1     male      45 week3 -0.267  b
4 2     female    32 week1 -0.0944 b
5 2     female    32 week2 -0.147  b
6 2     female    32 week3 -0.513  c
So we have two time-varying columns that we want to spread, out1 and out2, and two time-invariant columns (i.e. where values are the same across all time points) that I don't want to spread, but do want to keep in the wider dataset. For spreading out1 and out2, the following works great:
df %>%
  pivot_wider(id_cols = id,
              names_from = time,
              values_from = c(out1, out2))
# output
# A tibble: 2 x 7
  id    out1_week1 out1_week2 out1_week3 out2_week1 out2_week2 out2_week3
  <fct>      <dbl>      <dbl>      <dbl> <fct>      <fct>      <fct>
1 1          0.839     1.02         1.08 a          a          a
2 2          0.420    -0.0687      -2.00 b          a          c
The spreading of out1 and out2 on time has worked, but I have lost the time-invariant variables gender and age. How do I keep these?
Any help appreciated.
Answer:
Include the time-invariant columns in id_cols as well. Here id:age selects id, gender, and age:
df %>%
  pivot_wider(id_cols = id:age,
              names_from = time,
              values_from = c(out1, out2))
Result
# A tibble: 2 × 9
  id    gender   age out1_week1 out1_week2 out1_week3 out2_week1 out2_week2 out2_week3
  <fct> <fct>  <dbl>      <dbl>      <dbl>      <dbl> <fct>      <fct>      <fct>
1 1     male      45     -0.476     -1.46      -0.822 a          c          c
2 2     female    32     -0.565      0.769     -1.04  c          b          c
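As a possible alternative, id_cols can be left out altogether. By default, pivot_wider treats every column not named in names_from or values_from as an identifier, so gender and age should be carried along without being listed. The sketch below rebuilds the toy data from the question; the set.seed() call and the name wide are my additions for reproducibility and illustration, not part of the original post.

```r
library(tidyr)
library(tibble)

set.seed(1)  # assumption: added only so the toy data is reproducible

df <- tibble(
  id     = factor(rep(1:2, each = 3)),
  gender = factor(rep(c("male", "female"), each = 3)),
  age    = rep(c(45, 32), each = 3),
  time   = factor(rep(paste0("week", 1:3), times = 2)),
  out1   = rnorm(6),
  out2   = factor(sample(letters[1:3], size = 6, replace = TRUE))
)

# id_cols is omitted, so it falls back to its default: all columns
# that are not used in names_from or values_from (id, gender, age).
wide <- pivot_wider(
  df,
  names_from  = time,
  values_from = c(out1, out2)
)

wide
```

This only works cleanly when the remaining columns really are constant within each id. If one of them varied over time, it would split a subject across several rows, in which case listing id_cols explicitly is the safer choice.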