Home > Enterprise >  Create monthly resampled Pandas DataFrame based on dates
Create monthly resampled Pandas DataFrame based on dates

Time:11-02

I am struggling trying convert a dataframe containing multiple datetime columns into a version with a daterange.

A smaller example would look somewhat like this:

So from:

   key  date_upgrade1 date_upgrade2
0  key1 2020-07-31 2020-08-31
1  key2 2020-05-31 NaT

to:

    month       key upgrade1 upgrade2
0   2020-06-01  key1    0   0
1   2020-07-01  key1    0   0
2   2020-08-01  key1    1   0
3   2020-09-01  key1    1   1
4   2020-06-01  key2    1   0
5   2020-07-01  key2    1   0
6   2020-08-01  key2    1   0
7   2020-09-01  key2    1   0

where upgrade1 and upgrade 2 are "0" if the upgrade has not yet been done and "1" if the upgrade was already performed.

Actually, I am quite lost at the moment. I managed to create the empty date_range dataframe. But I am not sure how to duplicate it for multiple keys and change all values after the upgrade date to 1.

Any help is appreciated.

CodePudding user response:

Answered the question myself in the meantime:

df = [
        df1.assign(month=t)
        for t in pd.date_range(first_date, now, freq='MS')
    ]

df = pd.concat(df)
df.reset_index(drop=True, inplace=True)

df["upgrade1"] = df.month.gt(df.date_upgrade1).astype('int')
df["upgrade2"] = df.month.gt(df.date_upgrade2).astype('int')

with df1 defined as the upper DataFrame.

  • Related