fill missing datetime pandas

Time:09-06

I have the following problem. I have this DataFrame with data at 10-minute intervals:

import pandas as pd

df_dict = {"value" : [1, 1, 2, 3], "datetime" : ["2022-09-05 07:20:00", "2022-09-05 07:30:00", "2022-09-05 07:20:00", "2022-09-05 07:20:00"],
           "expedice" : ["A", "A", "B", "C"] }

df = pd.DataFrame(df_dict)

I would like to fill in the missing datetimes, so that every expedice has a row for each 10-minute slot, with value 0 where there was no data:

df_dict = {"value" : [1, 1, 2, 0, 3, 0], "datetime" : ["2022-09-05 07:20:00", "2022-09-05 07:30:00", "2022-09-05 07:20:00", "2022-09-05 07:30:00", "2022-09-05 07:20:00", "2022-09-05 07:30:00"],
           "expedice" : ["A", "A", "B", "B", "C", "C"] }

df = pd.DataFrame(df_dict)

I tried:

df.datetime = pd.to_datetime(df.datetime)


df.set_index(
            ['datetime', 'expedice']
        ).unstack(
            fill_value=0
        ).asfreq(
            "10Min", fill_value=0
        ).stack().sort_index(level=1).reset_index()

But I got the error "TypeError: Cannot change data-type for object array." How can I fix this?

CodePudding user response:

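After converting datetime with pd.to_datetime (as in the question), use DataFrame.reindex with a MultiIndex built from every expedice value crossed with a DatetimeIndex that spans the minimal to the maximal datetime: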

df1 = df.set_index(['expedice', 'datetime'])


df1 = (df1.reindex(pd.MultiIndex.from_product([df1.index.levels[0], 
                                               pd.date_range(df1.index.levels[1].min(), 
                                                             df1.index.levels[1].max(), 
                                                             freq='10Min')], 
                                              names=df1.index.names), fill_value=0)
         .reset_index())
print(df1)
  expedice            datetime  value
0        A 2022-09-05 07:20:00      1
1        A 2022-09-05 07:30:00      1
2        B 2022-09-05 07:20:00      2
3        B 2022-09-05 07:30:00      0
4        C 2022-09-05 07:20:00      3
5        C 2022-09-05 07:30:00      0
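
For convenience, the reindex approach above can be wrapped into a small helper. This is only a sketch: the function name fill_missing_slots and its freq parameter are made up for illustration, and it assumes each (expedice, datetime) pair occurs at most once, since reindexing a non-unique index raises an error.

```python
import pandas as pd


def fill_missing_slots(df, freq="10min"):
    """Return a frame with one row per (expedice, datetime) slot.

    Hypothetical helper: slots absent from the input get value 0.
    Assumes (expedice, datetime) pairs are unique in the input.
    """
    out = df.copy()
    out["datetime"] = pd.to_datetime(out["datetime"])

    # Full grid: every expedice crossed with every timestamp
    # between the overall minimum and maximum datetime.
    full_index = pd.MultiIndex.from_product(
        [
            sorted(out["expedice"].unique()),
            pd.date_range(out["datetime"].min(), out["datetime"].max(), freq=freq),
        ],
        names=["expedice", "datetime"],
    )

    return (
        out.set_index(["expedice", "datetime"])
        .reindex(full_index, fill_value=0)  # rows missing from the data become 0
        .reset_index()
    )


df = pd.DataFrame(
    {
        "value": [1, 1, 2, 3],
        "datetime": [
            "2022-09-05 07:20:00",
            "2022-09-05 07:30:00",
            "2022-09-05 07:20:00",
            "2022-09-05 07:20:00",
        ],
        "expedice": ["A", "A", "B", "C"],
    }
)

filled = fill_missing_slots(df)
print(filled)
```

Note that the time range is taken from the whole frame, not per expedice, so every group ends up covering the same slots. If each group should only be filled between its own first and last timestamp, a different approach (for example a groupby per expedice) would be needed.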