I have the following dataframe:
import pandas as pd
from datetime import datetime
df = pd.DataFrame({'Start':['2022-12-07 06:24:48','2022-12-08 14:37:16','2022-12-09 08:00:59','2022-12-12 09:31:23','2022-12-12 11:11:17'],
'End':['2022-12-07 10:07:00','2022-12-08 17:51:21', 'NaN', 'NaN', 'NaN']})
df['Start'] = pd.to_datetime(df['Start'])
df['End'] = pd.to_datetime(df['End'])
How can I replace the NaN values based on the following conditions:
If the row's Start date is today (in this example today = 2022-12-12), I want to replace the NaN with a datetime.now() timestamp.
Otherwise, if the Start date is not today, I want to replace the NaN with the date from df["Start"] of the same row, with the time set to 23:59:59.
I tried fillna, combining the date of df["Start"] with a new time, but I get an error because datetime.combine expects a single date and df["Start"] is a Series. With a mask I can select the rows, but I don't want to do this in a loop.
df['End'].fillna(datetime.combine(df['Start'].dt.date,datetime.max.time()))
mask_td = (df["Start"].dt.date == datetime.today().date()) & (df["End"].isnull())
mask_ntd = (df["Start"].dt.date != datetime.today().date()) & (df["End"].isnull())
df["End"].loc[mask_td]
CodePudding user response:
You can use numpy.select:
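import numpy as np
today = datetime.today().date()
# Conditions are checked in order; the first True one selects the matching choice
condlist = [df['End'].isnull() & (df['Start'].dt.date == today),
            df['End'].isnull() & (df['Start'].dt.date != today)]
# Missing End today -> current timestamp; missing End on another day -> Start date at 23:59:59
choicelist = [datetime.now(), df['Start'].dt.strftime('%Y-%m-%d 23:59:59')]
# Rows matching neither condition keep their existing End value (as a string for now)
default = df['End'].astype(str)
df['End'] = np.select(condlist, choicelist, default)
# Parse the mixed datetime/string results back into a datetime column
df['End'] = df['End'].astype('datetime64[s]')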
Output:
                Start                 End
0 2022-12-07 06:24:48 2022-12-07 10:07:00
1 2022-12-08 14:37:16 2022-12-08 17:51:21
2 2022-12-09 08:00:59 2022-12-09 23:59:59
3 2022-12-12 09:31:23 2022-12-12 14:29:21
4 2022-12-12 11:11:17 2022-12-12 14:29:21
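As an alternative, here is a minimal pandas-only sketch that avoids the round trip through strings. It relies on fillna accepting a Series, which is the vectorized form of what the question attempted with datetime.combine. The names now, end_of_day, is_today and fill_values are made up for this illustration, and now is pinned to a fixed timestamp only so the result is reproducible (in real code you would probably use pd.Timestamp.now() instead).

```python
import pandas as pd

df = pd.DataFrame({
    'Start': pd.to_datetime(['2022-12-07 06:24:48', '2022-12-08 14:37:16',
                             '2022-12-09 08:00:59', '2022-12-12 09:31:23',
                             '2022-12-12 11:11:17']),
    'End': pd.to_datetime(['2022-12-07 10:07:00', '2022-12-08 17:51:21',
                           None, None, None]),
})

# Assumption: a fixed "now" for reproducibility; swap in pd.Timestamp.now() for real use.
now = pd.Timestamp('2022-12-12 14:29:21')

# Vectorized stand-in for datetime.combine: midnight of Start plus 23:59:59.
end_of_day = df['Start'].dt.normalize() + pd.Timedelta(hours=23, minutes=59, seconds=59)

# Rows whose Start falls on today's date should get `now` instead of end-of-day.
is_today = df['Start'].dt.normalize() == now.normalize()
fill_values = end_of_day.mask(is_today, now)

# fillna aligns the Series on the index, so only missing End values are replaced.
df['End'] = df['End'].fillna(fill_values)
print(df)
```

Because fill_values is computed for every row but only applied where End is missing, no explicit isnull() masks or loops are needed.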