I have a dataframe that looks like
id = c("1", "2", "3")
IN1999 = c(1, 1, 0)
IN2000 = c(1, 0, 1)
TEST1999 = c(10, 12, NA)
TEST2000 = c(15, NA, 11)
df <- data.frame(id, IN1999, IN2000, TEST1999, TEST2000)
I am trying to use pivot_longer to change it into this form:
id year IN TEST
1 1 1999 1 10
2 1 2000 1 15
3 2 1999 1 12
4 2 2000 0 NA
5 3 1999 0 NA
6 3 2000 1 11
My current code looks like this
df %>%
pivot_longer(col = !id, names_to = c(".value", "year"),
names_sep = 4)
but obviousely by setting names_sep = 4
, r cuts IN1999
and IN2000
at the wrong place. How can I set the argument so that r can separate the column name from the last four digits?
CodePudding user response:
The names_sep
-argument in pivot_longer
also accepts regex expressions, that will allow you to split before the occurrence of four digits as in this example below:
library(tidyr)
df |>
pivot_longer(col = !id, names_to = c(".value", "year"),
names_sep = "(?=\\d{4})")
Output:
# A tibble: 6 × 4
id year IN TEST
<chr> <chr> <dbl> <dbl>
1 1 1999 1 10
2 1 2000 1 15
3 2 1999 1 12
4 2 2000 0 NA
5 3 1999 0 NA
6 3 2000 1 11