Calculate time difference between first and last row in Excel column or .hdf file-CodePudding

I have the "Datetime" column in Excel and also in a .hdf Dataframe. How can I calculate the time difference (in hours, min or sec) between the first and the last row? here is how my data look like; please remember that my data has few thousands rows. Therefore I cannot write a code and manually add these dates: (P.S. I am very new to python, this is my very first code)

please see the table below to see how it looks like: as you can see, my date and time are in one column:

     Datetime         Header:           Machine_started
2021-02-02 14:33:09   Data              1
2021-02-02 14:33:09   Data              1
2021-02-02 14:33:11   Data              1
2021-02-02 14:41:36   Data              1

CodePudding user response：

I created a demo dataframe:

import pandas as pd
import numpy as np

data = {"Datetime": ['2021-02-02 14:33:09', '2021-02-02 14:33:09', '2021-02-02 14:33:11', '2021-02-02 14:41:36'],
         "Header": ['Data', 'Data','Data','Data'], 
        "1_2_eBeam_started": [1,1,1,1]}

df = pd.DataFrame(data)
# creating dataframe

df['Datetime'].dtype
# dtype is object 
# convert it to datetime

df['Datetime']=pd.to_datetime(df['Datetime'])

df['Datetime'].iloc[0]   # this is first row

df['Datetime'].iloc[-1]   # this is last row

# difference in seconds:

(df['Datetime'].iloc[-1] - df['Datetime'].iloc[0])/np.timedelta64(1,'s')

#output 507.0

# You can also get the difference in minutes, hours, etc. by rplacing 's' by 'm' or 'h' in np.timedelta64(1,'s')