Home > Back-end >  Use pivot_longer to seperate columns
Use pivot_longer to seperate columns

Time:09-26

I have a dataframe that looks like

id = c("1", "2", "3")
IN1999 = c(1, 1, 0)
IN2000 = c(1, 0, 1)
TEST1999 = c(10, 12, NA)
TEST2000 = c(15, NA, 11)

df <- data.frame(id, IN1999, IN2000, TEST1999, TEST2000)

I am trying to use pivot_longer to change it into this form:

  id year IN TEST
1  1 1999  1   10
2  1 2000  1   15
3  2 1999  1   12
4  2 2000  0   NA
5  3 1999  0   NA
6  3 2000  1   11

My current code looks like this

df %>%
  pivot_longer(col = !id, names_to = c(".value", "year"), 
               names_sep = 4)

but obviousely by setting names_sep = 4, r cuts IN1999 and IN2000 at the wrong place. How can I set the argument so that r can separate the column name from the last four digits?

CodePudding user response:

The names_sep-argument in pivot_longer also accepts regex expressions, that will allow you to split before the occurrence of four digits as in this example below:

library(tidyr)

df |>
  pivot_longer(col = !id, names_to = c(".value", "year"), 
               names_sep = "(?=\\d{4})")

Output:

# A tibble: 6 × 4
  id    year     IN  TEST
  <chr> <chr> <dbl> <dbl>
1 1     1999      1    10
2 1     2000      1    15
3 2     1999      1    12
4 2     2000      0    NA
5 3     1999      0    NA
6 3     2000      1    11
  • Related