Home > Back-end >  How to create a missing data for broken time in data.table?
How to create a missing data for broken time in data.table?

Time:04-14

here is a sample of my data ;

df <- data.table(Date = c(as.Date('2021-02-09'),as.Date('2021-02-10'),as.Date('2021-02-12'),as.Date('2021-02-13'),
                          as.Date('2021-02-14'),as.Date('2021-02-05'),as.Date('2021-02-07'),as.Date('2021-02-08')),
                 Store = c('A','A','A','A','B','B','B','B'),
                 Product = c(1,1,1,1,2,2,2,2),
                 Quantity = c(3,4,5,6,7,8,9,10))

for each group (Store and Product columns represents a group) sometimes can have a break in the Date column.

I want to keep them as missing in Quantity value. Here is my desired output;

  Date       Store Product Quantity
  <date>     <chr>   <dbl>    <dbl>
  2021-02-09 A           1        3
  2021-02-10 A           1        4
  2021-02-11 A           1       NA
  2021-02-12 A           1        5
  2021-02-13 A           1        6
  2021-02-14 B           2        7
  2021-02-05 B           2        8
  2021-02-06 B           2       NA
  2021-02-07 B           2        9
  2021-02-08 A           2       10

note : I have a limited ram space and a huge table. So operations without reassigning (for example with := operator) would be way better.

Thanks in advance.

CodePudding user response:

A possible solution with J and seq.
Works for Store A, not sure I understood your question correctly for Store B as this creates more rows than your expected result due to Quantity registered on 2021-02-14

df[,.SD[J(Date=seq(min(Date),max(Date),by=1)),on=.(Date)],by=.(Store,Product)]

    Store Product       Date Quantity
    <char>   <num>     <Date>    <num>
 1:      A       1 2021-02-09        3
 2:      A       1 2021-02-10        4
 3:      A       1 2021-02-11       NA
 4:      A       1 2021-02-12        5
 5:      A       1 2021-02-13        6
 6:      B       2 2021-02-05        8
 7:      B       2 2021-02-06       NA
 8:      B       2 2021-02-07        9
 9:      B       2 2021-02-08       10
10:      B       2 2021-02-09       NA
11:      B       2 2021-02-10       NA
12:      B       2 2021-02-11       NA
13:      B       2 2021-02-12       NA
14:      B       2 2021-02-13       NA
15:      B       2 2021-02-14        7
  • Related