Pandas Fill rows missing in date range

Time:05-24

I have a DataFrame that represents hotel reservations by date:

date        volume  city
2020-01-05      10    NY
2020-01-06      10    NY
2020-01-07      30    NY

The DataFrame is ultimately written to a database, so I need a complete range of dates from a given point in the past to a given point in the future. For example, for all of 2020 the DataFrame should look like this:

date        volume  city
2020-01-01       0    NY
2020-01-02       0    NY
2020-01-03       0    NY
2020-01-04       0    NY
2020-01-05      10    NY
2020-01-06      10    NY
2020-01-07      30    NY
...
2020-12-31       0    NY

It's important that every row added to fill the range has volume=0 and that the city is repeated throughout the dataset.

How can I efficiently convert my DataFrame to fill in the dates missing from the range?

CodePudding user response:

Use DataFrame.reindex with a date_range to add the missing dates, replace the missing volume values with 0, and set the city column to NY:

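import pandas as pd

# Parse the date column so it can be aligned with a DatetimeIndex
df['date'] = pd.to_datetime(df['date'])

# Every calendar day of 2020
r = pd.date_range('2020-01-01','2020-12-31')
# Reindex onto the full range, zero-fill the new volume rows, and repeat the city
df = df.set_index('date').reindex(r).fillna({'volume':0}).assign(city = 'NY')
print(df)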
            volume city
2020-01-01     0.0   NY
2020-01-02     0.0   NY
2020-01-03     0.0   NY
2020-01-04     0.0   NY
2020-01-05    10.0   NY
           ...  ...
2020-12-27     0.0   NY
2020-12-28     0.0   NY
2020-12-29     0.0   NY
2020-12-30     0.0   NY
2020-12-31     0.0   NY

[366 rows x 2 columns]
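
The volume column prints as 0.0 and 10.0 because reindex first inserts NaN for the new dates, which forces the column to float. If the database expects integers and a date column, something along these lines may help. This is only a sketch: the sample frame is rebuilt from the question, and the names full_range and filled are mine, not from the original code.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed, three NY reservation days).
df = pd.DataFrame({
    "date": ["2020-01-05", "2020-01-06", "2020-01-07"],
    "volume": [10, 10, 30],
    "city": ["NY", "NY", "NY"],
})
df["date"] = pd.to_datetime(df["date"])

# Every calendar day of 2020 (a leap year, so 366 days).
full_range = pd.date_range("2020-01-01", "2020-12-31")

filled = (
    df.set_index("date")
      .reindex(full_range)            # new dates get NaN in every column
      .fillna({"volume": 0})          # zero-fill only the volume column
      .assign(city="NY")              # repeat the single city on every row
      .astype({"volume": "int64"})    # safe now that no NaN is left
      .rename_axis("date")            # name the index so reset_index yields a 'date' column
      .reset_index()
)

print(filled.head())
```

The astype step only works after the NaN values are filled, so it has to come after fillna.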

If there can be multiple cities and each city needs the full date range, build the target index with MultiIndex.from_product:

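df['date'] = pd.to_datetime(df['date'])

r = pd.date_range('2020-01-01','2020-12-31')

# Every (date, city) combination for the cities present in the data
mux = pd.MultiIndex.from_product([r, df['city'].unique()], names=['date','city'])

# Missing combinations get volume 0; reset_index turns date and city back into columns
df = df.set_index(['date', 'city']).reindex(mux, fill_value=0).reset_index()
print(df)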
          date city  volume
0   2020-01-01   NY       0
1   2020-01-02   NY       0
2   2020-01-03   NY       0
3   2020-01-04   NY       0
4   2020-01-05   NY      10
..         ...  ...     ...
361 2020-12-27   NY       0
362 2020-12-28   NY       0
363 2020-12-29   NY       0
364 2020-12-30   NY       0
365 2020-12-31   NY       0

[366 rows x 3 columns]
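
The output above has only one city, so here is a rough, self-contained sketch of the same idea with two cities. The LA row is invented purely for illustration, and the names full_range and filled are mine. It assumes each (date, city) pair appears at most once in the input, since reindex cannot work on duplicate index labels.

```python
import pandas as pd

# Sample data: the NY rows from the question plus one made-up LA row.
df = pd.DataFrame({
    "date": ["2020-01-05", "2020-01-06", "2020-01-07", "2020-01-06"],
    "volume": [10, 10, 30, 5],
    "city": ["NY", "NY", "NY", "LA"],
})
df["date"] = pd.to_datetime(df["date"])

full_range = pd.date_range("2020-01-01", "2020-12-31")

# Cartesian product of all dates and all cities seen in the data.
mux = pd.MultiIndex.from_product(
    [full_range, df["city"].unique()], names=["date", "city"]
)

filled = (
    df.set_index(["date", "city"])
      .reindex(mux, fill_value=0)     # missing (date, city) pairs get volume 0
      .reset_index()
      .sort_values(["city", "date"], ignore_index=True)  # group rows by city
)

print(filled.groupby("city")["volume"].agg(["size", "sum"]))
```

Each city ends up with 366 rows, and the original volumes are preserved on the dates that already existed.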