Home > other >  how to unstack a long column with uneven number of rows in r and name the columns
how to unstack a long column with uneven number of rows in r and name the columns

Time:10-21

I have a nested data in one long column. The data represents the different items in stock of various stores. Every item has 6 values of No, Item, Expiry Date, Manufacturer, Quantity, Unit Price, Discount and Total Price. The items are listed numerically 1,2,3 and then it starts for a new store 1,2 and then for the next 1,2,3 etc. My goal is to collect all the items of one store in one row and whenever there is a 1 start a new row


1
ABC RH --
Pack
Mar 2022
ABC D --
280
$5.20 
$0 
$ 1 456.0
2
ABC RH --
Pack
Mar 2022
ABC D --
280
$5.20 
$0 
$ 1 456.0
1
BCD
Amp
Apr 2024
XYZ
280
$2.00 
$0 
$ 2 555.0
2
BCD RH --
Amp
Mar 2024
ABC D --
280
$5.20 
$0 
$ 1 456.0
3
BCD RH --
Pack
Mar 2022
ABC D --
280
$5.20 
$0 
$ 1 456.0
1
1. HJK
Pack
Mar 2023
D --
80
$0.20 
$0 
$ 3 456.0
2
2. BCD
Pack
Mar 2022
ABC D --
280
$5.20 
$0 
$ 1 456.0
3
BCD RH --
Pack
Mar 2022
ABC D --
280
$5.20 
$0 
$ 1 456.0
4
BCD RH --
Pack
Mar 2022
ABC D --
280
$5.20 
$0 
$ 1 456.0
5
BCD RH --
Pack
Mar 2022
ABC D --
280
$5.20 
$0 
$ 1 456.0
6
BCD RH --
Pack
Mar 2022
ABC D --
280
$5.20 
$0 
$ 1 456.0

and here is what I'm trying to display it as

 |No |Item  | Size |Exp     |Manuf |Qty  |Unit |Dis |Total    |No| Item|
 |1  |ABC RH| Pack |Mar 2022| ABC D| 280 |$5.20|$0  |$ 1 456.0|2 | ABC |
 |1  |BCD RH| Amp  |Apr 2024| XYZ  | 280 |$2.00|$0  |$ 2 555.0|2 | BCD |
 |1  |1. HJK| Pack |Mar 2023| D -- | 80  |$0.20|$0  |$ 3 456.0|2.| BCD |

Thank you for your help

CodePudding user response:

You can try a tidyverse

library(tidyverse)
as.data.frame(df) %>% 
  mutate(V2 = cumsum(df == "1")) %>% 
  group_by(V2) %>%
  mutate(V4 = 1:n()) %>% 
  pivot_wider(names_from = V4, values_from = df, names_prefix = "Item") %>% 
  ungroup() %>% 
  select(-V2)
  # A tibble: 3 x 9
  Item1 Item2 Item3 Item4 Item5 Item6 Item7 Item8 Item9
  <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 1     A     B     C     2     A     B     NA    NA   
2 1     A     B     C     D     NA    NA    NA    NA   
3 1     A     2     A     B     C     3     A     B   

CodePudding user response:

A possible solution using base R:

df=c('1','A','B','C','2','A','B','C','1','A','B','C','1','A','B','C','2','A','B','C','3','A','B','C')
df <- data.frame(V=df)

idx <- which(df$V == 1)
n <- length(idx)

leng <- idx[-1]-idx[-n]
leng <- c(leng,length(df$V)-idx[n] 1)

l <- split(df$V, rep(1:n, leng))
l <- lapply(l, `length<-`, max(lengths(l)))

as.data.frame(do.call(rbind,l))
  • Related