I have the these two following pandas dataframe :
df1 = {'Date' : ['07-10-2019',
'07-10-2019',
'07-10-2019',
'08-10-2019',
'08-10-2019',
'08-10-2019']}
df1 = {'Time' :['07-10-2019 10:47:00',
'07-10-2019 10:52:00',
'07-10-2019 10:59:00',
'08-10-2019 10:47:00',
'08-10-2019 10:52:00',
'08-10-2019 10:59:00',
]}
and I am trying to create d['Time Taken']
by using:
df1['Time Taken']=((pd.to_datetime(df1['Time'])-pd.to_datetime(df1['Date'])).dt.total_seconds())/60
and clearly as expected, I am getting resultant df1['Time Taken']
as:
df1 = {'Time Taken': [647.0,
652.0,
659.0,
647.0,
652.0,
659.0,
]}
However, I would like to see df['Time Taken]
start with NaN
or 0
. How should I write it this line code to make it look like:
df1 = {'Time Taken': [NaN,
5.0,
7.0,
NaN,
5.0,
7.0,
]}
Please, help me with this! Many thanks!
CodePudding user response:
Use DataFrameGroupBy.diff
with Series.dt.total_seconds
and divide 60
:
df1 = pd.DataFrame({'Date' : ['07-10-2019',
'07-10-2019',
'07-10-2019',
'08-10-2019',
'08-10-2019',
'08-10-2019'],
'Time' :['07-10-2019 10:47:00',
'07-10-2019 10:52:00',
'07-10-2019 10:59:00',
'08-10-2019 10:47:00',
'08-10-2019 10:52:00',
'08-10-2019 10:59:00',
]})
df1['Time']= pd.to_datetime(df1['Time'])
df1['Time Taken'] = df1.groupby('Date')['Time'].diff().dt.total_seconds() / 60
print(df1)
Date Time Time Taken
0 07-10-2019 2019-07-10 10:47:00 NaN
1 07-10-2019 2019-07-10 10:52:00 5.0
2 07-10-2019 2019-07-10 10:59:00 7.0
3 08-10-2019 2019-08-10 10:47:00 NaN
4 08-10-2019 2019-08-10 10:52:00 5.0
5 08-10-2019 2019-08-10 10:59:00 7.0