Home > database >  Reshaping from long to wide format in R, problem with variables re-naming
Reshaping from long to wide format in R, problem with variables re-naming

Time:04-26

I have a dataset in a long format that I would like to reshape to wide. I normally know how to do it but the problem is with the variable names. The variables in the long format look like this:

ID time WSAS_01
1 1 4
1 2 3
2 1 6
2 2 8

But after reshaping I would like the name of the variables to be like this, so that time 1 is _r1_ (and time 2 is _r2_) and it is in the middle of the name:

ID WSAS_r1_01 WSAS_r2_01
1 4 3
2 6 8

Does anyone know how to do it?

CodePudding user response:

Another way also with tidyr::pivot_wider, but just a little bit more succinct regarding the part of names_glue:

library(tidyr)

df <- read.table(text = "
ID  time    WSAS_01
1   1   4
1   2   3
2   1   6
2   2   8", header=T)

df %>% 
  pivot_wider(names_from=time, values_from=WSAS_01, names_glue="WSAS_r{time}_01")

#> # A tibble: 2 × 3
#>      ID WSAS_r1_01 WSAS_r2_01
#>   <int>      <int>      <int>
#> 1     1          4          3
#> 2     2          6          8

CodePudding user response:

With pivot_wider(), you can supply a glue specification that uses the names_from columns (and special .value) to create custom column names.

library(tidyr)
library(stringr)

df %>%
  pivot_wider(
    names_from = time,
    names_glue = "{str_replace(.value, '(?=_)', str_c('_r', time))}",
    values_from = WSAS_01)

# # A tibble: 2 × 3
#      ID WSAS_r1_01 WSAS_r2_01
#   <int>      <int>      <int>
# 1     1          4          3
# 2     2          6          8

In an extending case that values_from contains multiple values, this method also works:

df <- data.frame(
  ID = rep(1:2, each = 2),
  time = rep(1:2, 2),
  WSAS_01 = c(4, 3, 6, 8),
  WSAS_02 = c(1, 3, 5, 7)
)

df %>%
  pivot_wider(
    names_from = time,
    names_glue = "{str_replace(.value, '(?=_)', str_c('_r', time))}",
    values_from = starts_with("WSAS"))

# # A tibble: 2 × 5
#      ID WSAS_r1_01 WSAS_r2_01 WSAS_r1_02 WSAS_r2_02
#   <int>      <dbl>      <dbl>      <dbl>      <dbl>
# 1     1          4          3          1          3
# 2     2          6          8          5          7

CodePudding user response:

You can try this:

df %>%
  pivot_wider(
    id_cols=ID,
    names_from=time,
    values_from = WSAS_01,
    names_glue="{paste0(str_sub(.value,1,4),'_r', time,'_',str_sub(.value,6,7))}"
)

Output:

# A tibble: 2 × 3
     ID WSAS_r1_01 WSAS_r2_01
  <dbl>      <dbl>      <dbl>
1     1          4          3
2     2          6          8
  • Related