pivot_wider() in tidyr without losing columns that are not spread

Time:05-04

I know I'm missing something obvious here, but I'm not sure how to spread long-form columns wider using pivot_wider() without losing the columns that I don't want spread.

Toy data

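library(tidyr)   # pivot_wider() and the %>% pipe
library(tibble)  # tibble()

# No seed is set, so out1 and out2 differ between runs
df <- tibble(id = factor(rep(1:2, 
                             each = 3)),
             gender = factor(rep(c("male", "female"), 
                                 each = 3)),
             age = rep(c(45, 32),
                       each = 3),
             time = factor(rep(paste0("week", 1:3), 
                               times = 2)),
             out1 = rnorm(6),
             out2 = factor(sample(letters[1:3],
                                  size = 6,
                                  replace = TRUE)))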

df 

# output

# A tibble: 6 x 6
  id    gender   age time     out1 out2 
  <fct> <fct>  <dbl> <fct>   <dbl> <fct>
1 1     male      45 week1 -1.23   c    
2 1     male      45 week2 -0.913  c    
3 1     male      45 week3 -0.267  b    
4 2     female    32 week1 -0.0944 b    
5 2     female    32 week2 -0.147  b    
6 2     female    32 week3 -0.513  c 

So we have two time-varying columns that I want to spread (out1 and out2) and two time-invariant columns, gender and age (i.e. their values are the same across all time points for a given id), that I don't want to spread but do want to keep in the wider dataset. For spreading out1 and out2, the following works great:

df %>%
  pivot_wider(id_cols = id,
              names_from = time,
              values_from = c(out1, out2)) 

# output
# A tibble: 2 x 7
  id    out1_week1 out1_week2 out1_week3 out2_week1 out2_week2 out2_week3
  <fct>      <dbl>      <dbl>      <dbl> <fct>      <fct>      <fct>     
1 1          0.839     1.02         1.08 a          a          a         
2 2          0.420    -0.0687      -2.00 b          a          c 

Spreading out1 and out2 by time has worked, but I have lost the time-invariant variables gender and age. How do I keep these?

Any help appreciated.

Answer:

df %>%
  pivot_wider(id_cols = id:age,
              names_from = time,
              values_from = c(out1, out2)) 

Result

# A tibble: 2 × 9
  id    gender   age out1_week1 out1_week2 out1_week3 out2_week1 out2_week2 out2_week3
  <fct> <fct>  <dbl>      <dbl>      <dbl>      <dbl> <fct>      <fct>      <fct>     
1 1     male      45     -0.476     -1.46      -0.822 a          c          c         
2 2     female    32     -0.565      0.769     -1.04  c          b          c  
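
Why this works: id_cols defaults to every column that is not named in names_from or values_from. Setting id_cols = id narrows that to id alone, so gender and age are dropped. The range id:age depends on the column order in df; as a rough sketch, the two variants below should give the same shape without relying on position. The toy values here are fixed, made-up numbers (not the random draws above), and the names wide_explicit and wide_default are only for illustration.

```r
library(tidyr)
library(tibble)

# Toy data with the same structure as in the question (values are arbitrary)
df <- tibble(id = factor(rep(1:2, each = 3)),
             gender = factor(rep(c("male", "female"), each = 3)),
             age = rep(c(45, 32), each = 3),
             time = factor(rep(paste0("week", 1:3), times = 2)),
             out1 = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6),
             out2 = factor(c("a", "b", "c", "c", "b", "a")))

# Option 1: name the identifying columns explicitly
wide_explicit <- pivot_wider(df,
                             id_cols = c(id, gender, age),
                             names_from = time,
                             values_from = c(out1, out2))

# Option 2: leave id_cols out, so every column not used in names_from
# or values_from is kept as an identifier
wide_default <- pivot_wider(df,
                            names_from = time,
                            values_from = c(out1, out2))
```

Option 2 is the shortest, but it only gives one row per id when the remaining columns really are constant within each id. If, say, age changed between weeks, each distinct combination of id, gender and age would get its own row.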