I have the "Datetime" column in Excel and also in a .hdf Dataframe. How can I calculate the time difference (in hours, min or sec) between the first and the last row? here is how my data look like; please remember that my data has few thousands rows. Therefore I cannot write a code and manually add these dates: (P.S. I am very new to python, this is my very first code)
please see the table below to see how it looks like: as you can see, my date and time are in one column:
Datetime Header: Machine_started
2021-02-02 14:33:09 Data 1
2021-02-02 14:33:09 Data 1
2021-02-02 14:33:11 Data 1
2021-02-02 14:41:36 Data 1
CodePudding user response:
I created a demo dataframe:
import pandas as pd
import numpy as np
data = {"Datetime": ['2021-02-02 14:33:09', '2021-02-02 14:33:09', '2021-02-02 14:33:11', '2021-02-02 14:41:36'],
"Header": ['Data', 'Data','Data','Data'],
"1_2_eBeam_started": [1,1,1,1]}
df = pd.DataFrame(data)
# creating dataframe
df['Datetime'].dtype
# dtype is object
# convert it to datetime
df['Datetime']=pd.to_datetime(df['Datetime'])
df['Datetime'].iloc[0] # this is first row
df['Datetime'].iloc[-1] # this is last row
# difference in seconds:
(df['Datetime'].iloc[-1] - df['Datetime'].iloc[0])/np.timedelta64(1,'s')
#output 507.0
# You can also get the difference in minutes, hours, etc. by rplacing 's' by 'm' or 'h' in np.timedelta64(1,'s')