I have the following data frame that I am trying to convert into a specific format. The new items with their quantity are added daily as columns. I want to detect all the items in R language and their quantity automatically and convert into to mentioned expected output.
My dataframe: (new items with their quantity added everyday as new columns)
id date a1_item a1_quantity a2_item a2_quantity
z1 2022-02-28 name1 10 name11 20
z1 2021-10-31 name2 20 name21 30
z2 2021-12-31 name3 10 name31 40
r3 2021-10-31 name4 40 name41 10
r4 2021-06-30 name5 30 name51 05
r5 2021-08-31 name6 10 name61 40
My expected output:
id date item quanity
z1 2022-02-28 name1 10
z1 2022-02-28 name11 20
z1 2021-10-31 name2 20
z1 2021-10-31 name21 30
z2 2021-12-31 name3 10
z2 2021-12-31 name31 40
r3 2021-10-31 name4 40
r3 2021-10-31 name41 10
r4 2021-06-30 name5 30
r4 2021-06-30 name51 05
r5 2021-08-31 name6 10
r5 2021-08-31 name61 40
CodePudding user response:
We can use pivot_longer
with names_pattern
to capture the substring as a group
library(dplyr)
library(tidyr)
df1 %>%
pivot_longer(cols = contains("_"), names_to = c(".value"),
names_pattern = ".*_(.*)")
-output
# A tibble: 12 × 4
id date item quantity
<chr> <chr> <chr> <int>
1 z1 2022-02-28 name1 10
2 z1 2022-02-28 name11 20
3 z1 2021-10-31 name2 20
4 z1 2021-10-31 name21 30
5 z2 2021-12-31 name3 10
6 z2 2021-12-31 name31 40
7 r3 2021-10-31 name4 40
8 r3 2021-10-31 name41 10
9 r4 2021-06-30 name5 30
10 r4 2021-06-30 name51 5
11 r5 2021-08-31 name6 10
12 r5 2021-08-31 name61 40
data
df1 <- structure(list(id = c("z1", "z1", "z2", "r3", "r4", "r5"),
date = c("2022-02-28",
"2021-10-31", "2021-12-31", "2021-10-31", "2021-06-30", "2021-08-31"
), a1_item = c("name1", "name2", "name3", "name4", "name5", "name6"
), a1_quantity = c(10L, 20L, 10L, 40L, 30L, 10L), a2_item = c("name11",
"name21", "name31", "name41", "name51", "name61"), a2_quantity = c(20L,
30L, 40L, 10L, 5L, 40L)), class = "data.frame", row.names = c(NA,
-6L))