Home > Enterprise >  How to properly format my panel data in R?
How to properly format my panel data in R?

Time:11-07

I know that similar questions exist but I have tried everything from melt to reshape to Stacked, etc. and nothing is getting me even close.

My data currently looks like this:

ID Treatment Round_1_Decision1 Round_1_Decision2 Round_2_Decision1 Round_2_Decision2
1      2            1                0                0                1
2      1            0                0                1                1

I need it to instead look like this:

ID Treatment Round Decision1 Decision2
1      2       1       1         0
1      2       2       0         1
2      1       1       0         0
2      1       2       1         1

Any suggestions?

CodePudding user response:

We may use pivot_longer - main things to consider is the names_to and names_pattern. Here, we need 'Round' column that should generate the values as the suffix of 'Round' in column name and the other column as the value of the column (.value). In the names_pattern, capture the column names substring as a group with capture group ((...)) i.e. -\\w - should match the 'Round', followed by the _, then capture the digits((\\d )), then the underscore (_) followed by the next capture group ((.*)) that includes the rest of the characters for the .value part

library(tidyr)
pivot_longer(df1, cols = starts_with("Round"),
     names_to = c("Round", ".value"), names_pattern = "\\w _(\\d )_(.*)")

-output

# A tibble: 4 × 5
     ID Treatment Round Decision1 Decision2
  <int>     <int> <chr>     <int>     <int>
1     1         2 1             1         0
2     1         2 2             0         1
3     2         1 1             0         0
4     2         1 2             1         1

data

df1 <- structure(list(ID = 1:2, Treatment = 2:1, Round_1_Decision1 = 1:0, 
    Round_1_Decision2 = c(0L, 0L), 
Round_2_Decision1 = 0:1, Round_2_Decision2 = c(1L, 
    1L)), class = "data.frame", row.names = c(NA, -2L))

CodePudding user response:

Here is an alternative approach without using names_pattern argument

library(tidyverse)

df %>% 
  pivot_longer(
    cols=-c(ID, Treatment),
    names_to = "Round",
    values_to = "value"
  ) %>% 
  mutate(Decision = str_sub(Round, -9, -1),
         Round = parse_number(Round)
         ) %>% 
  pivot_wider(
    names_from = Decision,
    values_from = value
  )
     ID Treatment Round Decision1 Decision2
  <int>     <int> <dbl>     <int>     <int>
1     1         2     1         1         0
2     1         2     2         0         1
3     2         1     1         0         0
4     2         1     2         1         1
  • Related