Home > other >  R Language to convert the data frame into specific format
R Language to convert the data frame into specific format

Time:03-08

I have the following data frame that I am trying to convert into a specific format. The new items with their quantity are added daily as columns. I want to detect all the items in R language and their quantity automatically and convert into to mentioned expected output.

My dataframe: (new items with their quantity added everyday as new columns)

id    date        a1_item   a1_quantity   a2_item   a2_quantity
z1   2022-02-28   name1       10           name11     20
z1   2021-10-31   name2       20           name21     30
z2   2021-12-31   name3       10           name31     40
r3   2021-10-31   name4       40           name41     10
r4   2021-06-30   name5       30           name51     05
r5   2021-08-31   name6       10           name61     40

My expected output:

id    date         item   quanity
z1   2022-02-28   name1       10 
z1   2022-02-28   name11      20       
z1   2021-10-31   name2       20
z1   2021-10-31   name21      30         
z2   2021-12-31   name3       10
z2   2021-12-31   name31      40            
r3   2021-10-31   name4       40
r3   2021-10-31   name41      10            
r4   2021-06-30   name5       30
r4   2021-06-30   name51      05           
r5   2021-08-31   name6       10 
r5   2021-08-31   name61      40

       

CodePudding user response:

We can use pivot_longer with names_pattern to capture the substring as a group

library(dplyr)
library(tidyr)
df1 %>% 
  pivot_longer(cols = contains("_"), names_to = c(".value"), 
       names_pattern = ".*_(.*)")

-output

# A tibble: 12 × 4
   id    date       item   quantity
   <chr> <chr>      <chr>     <int>
 1 z1    2022-02-28 name1        10
 2 z1    2022-02-28 name11       20
 3 z1    2021-10-31 name2        20
 4 z1    2021-10-31 name21       30
 5 z2    2021-12-31 name3        10
 6 z2    2021-12-31 name31       40
 7 r3    2021-10-31 name4        40
 8 r3    2021-10-31 name41       10
 9 r4    2021-06-30 name5        30
10 r4    2021-06-30 name51        5
11 r5    2021-08-31 name6        10
12 r5    2021-08-31 name61       40

data

df1 <- structure(list(id = c("z1", "z1", "z2", "r3", "r4", "r5"), 
date = c("2022-02-28", 
"2021-10-31", "2021-12-31", "2021-10-31", "2021-06-30", "2021-08-31"
), a1_item = c("name1", "name2", "name3", "name4", "name5", "name6"
), a1_quantity = c(10L, 20L, 10L, 40L, 30L, 10L), a2_item = c("name11", 
"name21", "name31", "name41", "name51", "name61"), a2_quantity = c(20L, 
30L, 40L, 10L, 5L, 40L)), class = "data.frame", row.names = c(NA, 
-6L))
  • Related