I have a nested data in one long column. The data represents the different items in stock of various stores. Every item has 6 values of No, Item, Expiry Date, Manufacturer, Quantity, Unit Price, Discount and Total Price. The items are listed numerically 1,2,3 and then it starts for a new store 1,2 and then for the next 1,2,3 etc. My goal is to collect all the items of one store in one row and whenever there is a 1 start a new row
1
ABC RH --
Pack
Mar 2022
ABC D --
280
$5.20
$0
$ 1 456.0
2
ABC RH --
Pack
Mar 2022
ABC D --
280
$5.20
$0
$ 1 456.0
1
BCD
Amp
Apr 2024
XYZ
280
$2.00
$0
$ 2 555.0
2
BCD RH --
Amp
Mar 2024
ABC D --
280
$5.20
$0
$ 1 456.0
3
BCD RH --
Pack
Mar 2022
ABC D --
280
$5.20
$0
$ 1 456.0
1
1. HJK
Pack
Mar 2023
D --
80
$0.20
$0
$ 3 456.0
2
2. BCD
Pack
Mar 2022
ABC D --
280
$5.20
$0
$ 1 456.0
3
BCD RH --
Pack
Mar 2022
ABC D --
280
$5.20
$0
$ 1 456.0
4
BCD RH --
Pack
Mar 2022
ABC D --
280
$5.20
$0
$ 1 456.0
5
BCD RH --
Pack
Mar 2022
ABC D --
280
$5.20
$0
$ 1 456.0
6
BCD RH --
Pack
Mar 2022
ABC D --
280
$5.20
$0
$ 1 456.0
and here is what I'm trying to display it as
|No |Item | Size |Exp |Manuf |Qty |Unit |Dis |Total |No| Item|
|1 |ABC RH| Pack |Mar 2022| ABC D| 280 |$5.20|$0 |$ 1 456.0|2 | ABC |
|1 |BCD RH| Amp |Apr 2024| XYZ | 280 |$2.00|$0 |$ 2 555.0|2 | BCD |
|1 |1. HJK| Pack |Mar 2023| D -- | 80 |$0.20|$0 |$ 3 456.0|2.| BCD |
Thank you for your help
CodePudding user response:
You can try a tidyverse
library(tidyverse)
as.data.frame(df) %>%
mutate(V2 = cumsum(df == "1")) %>%
group_by(V2) %>%
mutate(V4 = 1:n()) %>%
pivot_wider(names_from = V4, values_from = df, names_prefix = "Item") %>%
ungroup() %>%
select(-V2)
# A tibble: 3 x 9
Item1 Item2 Item3 Item4 Item5 Item6 Item7 Item8 Item9
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 1 A B C 2 A B NA NA
2 1 A B C D NA NA NA NA
3 1 A 2 A B C 3 A B
CodePudding user response:
A possible solution using base R:
df=c('1','A','B','C','2','A','B','C','1','A','B','C','1','A','B','C','2','A','B','C','3','A','B','C')
df <- data.frame(V=df)
idx <- which(df$V == 1)
n <- length(idx)
leng <- idx[-1]-idx[-n]
leng <- c(leng,length(df$V)-idx[n] 1)
l <- split(df$V, rep(1:n, leng))
l <- lapply(l, `length<-`, max(lengths(l)))
as.data.frame(do.call(rbind,l))