I am having trouble with rearranging data from wide to long format using two ids from a single column.
The column names look as follows:
Y_2022 or N_2022.
I would like to split the column name in two, and then use a) Yes/No parameter as the first id, and b) the year as the second id.
Here's a reproducible code chunk:
d = structure(list(SCRIPT = c("BC", "BC", "RC", "RC"),
INSTITUTE = c("BCR","BCR", "CC", "CC"), CLASS = c("BANK", "CORPORATION","RETAIL", "WHOLESALE"),
Y_2022 = c(0.86, 3.32, 0.86, 3.35), Y_2023 = c(0.87, 0.86, 0.19, 0.25),
N_2022 = c(-0.86,0.7, 0.06, 0.06), N_2023 = c(0.86, 0.03, 0.86, 0.04)),
row.names = c(NA,-4L), class = c("tbl_df", "tbl", "data.frame"))
If it were without a Y/N variant, I would have done the following:
library(tidyr)
#names(d) = gsub(pattern = "Y_*", replacement = "", x = names(d))
d <- pivot_longer(d, cols=4:7, names_to = "Year", values_to = "EUR")
Currently, my dataset looks like this:
Script| Institute | Class | Y_2022 | Y_2023 | N_2022 | N_2023
BC | BCR | Retail | 0.86 | 0.86 | 0.86 | 0.86
RC | BCR | Retail | 0.86 | 0.86 | 0.86 | 0.86
BC | CC | Retail | 0.86 | 0.86 | 0.86 | 0.86
RC | CC | Retail | 0.86 | 0.86 | 0.86 | 0.86
And I need it to look like this:
Script | Institute | Class | Year | Yes-No | EUR
BC | BCR | Retail | 2022 | Y_ | 0.86
RC | BCR | Retail | 2023 | Y_ | 0.86
BC | BCR | Retail | 2022 | N_ | 0.86
RC | BCR | Retail | 2023 | N_ | 0.86
CodePudding user response:
You could use pivot_longer()
followed by separate()
pivot_longer(d,-(1:3),values_to = "EUR") %>%
separate(name,into=c("Yes-No", "Year"), sep="_")
Or, combine together as suggested by Onyanbu:
pivot_longer(d, -(1:3), names_to = c('Yes-No', 'Year'), names_sep = '_', values_to = 'EUR')
Output:
SCRIPT INSTITUTE CLASS `Yes-No` Year EUR
<chr> <chr> <chr> <chr> <chr> <dbl>
1 BC BCR BANK Y 2022 0.86
2 BC BCR BANK Y 2023 0.87
3 BC BCR BANK N 2022 -0.86
4 BC BCR BANK N 2023 0.86
5 BC BCR CORPORATION Y 2022 3.32
6 BC BCR CORPORATION Y 2023 0.86
7 BC BCR CORPORATION N 2022 0.7
8 BC BCR CORPORATION N 2023 0.03
9 RC CC RETAIL Y 2022 0.86
10 RC CC RETAIL Y 2023 0.19
11 RC CC RETAIL N 2022 0.06
12 RC CC RETAIL N 2023 0.86
13 RC CC WHOLESALE Y 2022 3.35
14 RC CC WHOLESALE Y 2023 0.25
15 RC CC WHOLESALE N 2022 0.06
16 RC CC WHOLESALE N 2023 0.04