I have a DataFrame like:
date col1 col2
0 2022-10-07 04:00:00 x x1
1 2022-10-08 04:00:00 y x2
I need to update a row (given as a dictionary) at a specific date if that date exists; if it does not exist, I need to insert the row right after the closest earlier date.
For example, given the new date 2022-10-07 05:00:00 (one hour later) and the dict {'col1': 'z'}, I would like to get:
date col1 col2
0 2022-10-07 04:00:00 x x1
1 2022-10-07 05:00:00 z x1
2 2022-10-08 04:00:00 y x2
Currently I am doing this:
def write(date, dic):
    m = df['date'] == date
    if m.any():
        df.loc[m, list(dic.keys())] = list(dic.values())
    else:
        df.loc[len(df)] = {**dic, 'date': date}
This means that if the date is not found, the row is simply appended to the end of df, but I want it inserted right after the previous date.
(Note also that when I insert or update I only have col1, so the col2 value should be copied from the previous row, for example with ffill.)
CodePudding user response:
You can set the date as index and update like:
df1 = df.set_index('date')
df1.loc[new_date, list(dic.keys())] = list(dic.values())
df = df1.sort_index().reset_index().ffill()
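This inserts a new row if the date does not exist, and updates the existing row if it does. Sorting the index puts a new row in date order, and ffill copies its missing col2 value from the previous row.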
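For reference, here is a minimal self-contained sketch of this approach applied to the sample data from the question. The helper name upsert_row is hypothetical (it is not from the question or from pandas). The sketch assumes the date column holds datetimes, so the incoming date is converted with pd.Timestamp before it is used as an index label.

```python
import pandas as pd


def upsert_row(df, date, dic):
    """Update the row at `date` if it exists, otherwise insert it in date order.

    Hypothetical helper; assumes df has a datetime column named 'date'.
    """
    date = pd.Timestamp(date)  # match the dtype of the datetime index
    out = df.set_index("date")
    # .loc assignment updates an existing label or enlarges the frame with a new one
    out.loc[date, list(dic.keys())] = list(dic.values())
    # Sorting places a new row after the closest earlier date;
    # ffill then copies missing values down from the previous row.
    return out.sort_index().reset_index().ffill()


df = pd.DataFrame({
    "date": pd.to_datetime(["2022-10-07 04:00:00", "2022-10-08 04:00:00"]),
    "col1": ["x", "y"],
    "col2": ["x1", "x2"],
})

result = upsert_row(df, "2022-10-07 05:00:00", {"col1": "z"})
print(result)
```

One caveat: ffill() runs over the whole frame, so it would also fill any NaN values that were already in df before the insert. If that is not wanted, the fill would need to be restricted to the newly inserted row.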