Home > database >  Reshaping from long to wide with multiple columns
Reshaping from long to wide with multiple columns

Time:03-19

I have a longitudinal survey dataset. I want to transform from long to wide so I can determine discrepancies from the first wave to the second wave. I am struggling to transform in R, as this was relatively straightforward coming from a Stata user.

I have a ID and time column with two different response columns. I want the transformed response columns to take the value of the time column in a wide format, but struggling with the code. I get a resulting dataframe with NAs.

df <- tibble(PIN = c(1001, 1001, 1002, 1002), time = c(1, 2, 1, 2), age = c(84, 86, 22, 24), height = c(58, 58, 60, 62))
df_wide <- reshape(df, 
                timevar=c("time"),
                idvar=c("PIN"),
                dir="wide")

Input:

id time age height
1001 1 84 58
1001 2 86 58
1002 1 22 60
1002 2 24 62

Desired Output:

id age_t1 age_t2 height_t1 height_t2
1001 84 86 58 58
1002 22 24 60 62

Actual Output:

id age.c(1,2) height.c(1,2)
1001 NA NA
1002 NA NA

CodePudding user response:

pivot_wider may be easier

library(dplyr)
library(stringr)
library(tidyr)
df %>% 
   mutate(time = str_c('t', time)) %>%
   pivot_wider(names_from = time, values_from = c(age, height))

-output

# A tibble: 2 × 5
    PIN age_t1 age_t2 height_t1 height_t2
  <dbl>  <dbl>  <dbl>     <dbl>     <dbl>
1  1001     84     86        58        58
2  1002     22     24        60        62

With reshape from base R, it may need a sequence column

out <- reshape(transform(df, rn = ave(seq_along(PIN), PIN,
   FUN = seq_along)), idvar = "PIN", 
    direction = "wide", timevar = "time", sep = "_")
out[!startsWith(names(out), 'rn_')]
   PIN age_1 height_1 age_2 height_2
1 1001    84       58    86       58
3 1002    22       60    24       62

CodePudding user response:

Another alternative,

data.table::dcast(as.data.table(df), PIN ~ time, value.var = c("age", "height"))
#      PIN age_1 age_2 height_1 height_2
#    <num> <num> <num>    <num>    <num>
# 1:  1001    84    86       58       58
# 2:  1002    22    24       60       62

In many ways, data.table::dcast is very similar to reshape2::dcast, but this is one place that they differ: the use of multiple value.vars is not supported in reshape2::dcast (as far as I can tell).

  •  Tags:  
  • r
  • Related