pivot_longer for multiple sets having the same names_to

Time:11-02

I am trying to use pivot_longer with multiple sets of variables, and I'm having trouble getting the syntax right from the examples I have found.

My dummy dataset is:

library(dplyr)
library(tidyr)

ID =  c("id-1", "id-2", "id-3")
State = c("MD", "MD", "VA")
Time1Day= c( 1, 12, 30)
Time1Month = c( 1, 4, 5)
Time2Day = c( 9, 21, 13)
Time2Month = c( 12, 4, 5)
Time3Day = c( 7, 14, NA)
Time3Month = c( 1, 2, NA)


df <- data.frame(ID, State, Time1Day, Time1Month, Time2Day, Time2Month, Time3Day, Time3Month)

My desired outcome is:

    ID State  Time Day Month
1 id-1    MD Time1   1     1
2 id-1    MD Time2   9    12
3 id-1    MD Time3   7     1
4 id-2    MD Time1  12     4
5 id-2    MD Time2  21     4
6 id-2    MD Time3  14     2
7 id-3    VA Time1  30     5
8 id-3    VA Time2  13     5

I have looked here and here to try to get the syntax right, and tried the following two solutions, which I cannot get to work:

df.long <- df %>% 
  pivot_longer(cols = starts_with("Time"), names_to = c("Day", "Month"), names_sep="(?=[0-9])"), values_to = "Time", values_drop_na = TRUE)

df.long <- df %>% 
  pivot_longer(cols = ends_with("Day"), names_to = c("Time"), values_to = "Days", values_drop_na = TRUE) %>% 
  pivot_longer(cols = ends_with("Month"), names_to = c("Time"), values_to = "Months", values_drop_na = TRUE)

Any advice on what I am missing and how to fix it would be greatly appreciated.

CodePudding user response:

Edit: Added values_drop_na = TRUE thanks to TarJae's comment.

You could use

library(dplyr)
library(tidyr)

df %>% 
  pivot_longer(-c(ID, State), 
               names_to = c("Time", ".value"),
               names_pattern = "(Time\\d)(.*)",
               values_drop_na = TRUE)

This returns

# A tibble: 8 x 5
  ID    State Time    Day Month
  <chr> <chr> <chr> <dbl> <dbl>
1 id-1  MD    Time1     1     1
2 id-1  MD    Time2     9    12
3 id-1  MD    Time3     7     1
4 id-2  MD    Time1    12     4
5 id-2  MD    Time2    21     4
6 id-2  MD    Time3    14     2
7 id-3  VA    Time1    30     5
8 id-3  VA    Time2    13     5
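
To make the mechanics a little more explicit: each capture group in names_pattern is matched to one entry of names_to. The first group becomes the value of the new Time column, and the special ".value" entry tells pivot_longer to use the second group (Day or Month) as the name of an output column. The sketch below rebuilds the question's data and shows both steps. The `\\d+` in place of `\\d` is my own assumption, added so that labels such as Time10 would also match; the `parts` variable is only there for illustration.

```r
library(tidyr)

# Same toy data as in the question, built in one call.
df <- data.frame(
  ID = c("id-1", "id-2", "id-3"),
  State = c("MD", "MD", "VA"),
  Time1Day = c(1, 12, 30), Time1Month = c(1, 4, 5),
  Time2Day = c(9, 21, 13), Time2Month = c(12, 4, 5),
  Time3Day = c(7, 14, NA), Time3Month = c(1, 2, NA)
)

# Base R view of what the two capture groups pull out of one column name.
# Element 1 is the full match; elements 2 and 3 are the two groups.
pattern <- "(Time\\d+)(.*)"
parts <- regmatches("Time2Month", regexec(pattern, "Time2Month", perl = TRUE))[[1]]
print(parts[2:3])

# Group 1 fills the new "Time" column; group 2 (".value") names the
# output columns, so Day and Month stay as separate columns.
long <- pivot_longer(
  df,
  cols = -c(ID, State),
  names_to = c("Time", ".value"),
  names_pattern = pattern,
  values_drop_na = TRUE  # drops id-3 / Time3, where Day and Month are both NA
)
print(long)
```

With ".value" in names_to there is no values_to argument, because the value column names come from the column names themselves.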

CodePudding user response:

A data.table approach:

library(data.table)
# melt to long
DT <- melt(setDT(df), id.vars = c("ID", "State"), variable.factor = FALSE, na.rm = TRUE)
# split variable string
DT[, c("Time", "part2") := tstrsplit(variable, "(?<=[0-9])", perl=TRUE)]
# recast to wide
dcast(DT, ID + State + Time ~ part2, value.var = "value", drop = TRUE)
#      ID State  Time Day Month
# 1: id-1    MD Time1   1     1
# 2: id-1    MD Time2   9    12
# 3: id-1    MD Time3   7     1
# 4: id-2    MD Time1  12     4
# 5: id-2    MD Time2  21     4
# 6: id-2    MD Time3  14     2
# 7: id-3    VA Time1  30     5
# 8: id-3    VA Time2  13     5
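
As a possible variation on the melt/dcast round trip above, data.table's melt can also gather several sets of columns in a single call through patterns(). This is a sketch, not the answerer's method. It assumes the Day and Month columns appear in the same Time1, Time2, Time3 order, because the generated index column only records each column's position within its set. The names `DT2` and `long` are hypothetical.

```r
library(data.table)

# Same toy data as in the question, built directly as a data.table.
DT2 <- data.table(
  ID = c("id-1", "id-2", "id-3"),
  State = c("MD", "MD", "VA"),
  Time1Day = c(1, 12, 30), Time1Month = c(1, 4, 5),
  Time2Day = c(9, 21, 13), Time2Month = c(12, 4, 5),
  Time3Day = c(7, 14, NA), Time3Month = c(1, 2, NA)
)

# Melt the Day columns and the Month columns in parallel.
long <- melt(
  DT2,
  id.vars = c("ID", "State"),
  measure.vars = patterns("Day$", "Month$"),
  variable.name = "Time",
  value.name = c("Day", "Month")
)

# "Time" holds the position within each matched set (1, 2, 3),
# so rebuild the original label from it.
long[, Time := paste0("Time", Time)]

# Drop rows where both measurements are missing, then sort by ID and Time.
long <- long[!(is.na(Day) & is.na(Month))][order(ID, Time)]
print(long)
```

The explicit NA filter is used instead of na.rm = TRUE so that the row-dropping rule is visible in the code.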