I am working on data frame. One of my column is in datetime format. Data head will look like this
0 2022-01-18 15:26:35
1 2022-01-18 15:26:36
2 2022-01-18 15:26:37
3 2022-01-18 15:26:38
4 2022-01-18 15:26:39
Name: filename, dtype: datetime64[ns]
I have a huge data and there is break at some point, meaning, I may not have data at night time and I dont want that time to be added in the total time. I want the total time in hours, I tried this
totaltime = df["filename"].sum()
totaltime
And I got an error, 'DatetimeArray' with dtype datetime64[ns] does not support reduction 'sum'. Can anyone help me please ?
CodePudding user response:
IIUC, you want the number of hours between the max and min date:
t = pd.to_datetime(df['filename'])
delta = (t.max()-t.min()).total_seconds()/3600
output: 0.0011
CodePudding user response:
Use numpy.ptp:
df = pd.read_clipboard(header=None, names=['filename'], sep='\s\s ')`
times = pd.to_datetime(df['filename'])
np.ptp(times)
Output (pd.Timedelta)
Timedelta('0 days 00:00:04')
CodePudding user response:
You can try below code:
df[['colname']] = df[['colname']].apply(pd.to_datetime)
result = (df['colname'].max() - df['colname'].min()).dt.hour
CodePudding user response:
Total time difference between the max and min date.
Your Data:
from datetime import datetime
import time
temp = """
2022-01-18 15:26:35
2022-01-18 15:26:36
2022-01-18 15:26:37
2022-01-18 15:26:38
2022-01-18 15:26:39
"""
temp2 = io.StringIO(temp)
df = pd.read_csv(temp2, sep="\s",header=None)
df["filename"] = df[0] " " df[1]
df = df.drop(columns=[0,1])
# Convert column to datetime datatype
df["filename"] = pd.to_datetime(df["filename"], infer_datetime_format=True)
Parse them into strptime, so that you can subtract the dates and get the range
between them.
maxTime = datetime.strptime(str(df.filename.max()), "%Y-%m-%d %H:%M:%S")
minTime = datetime.strptime(str(df.filename.min()), "%Y-%m-%d %H:%M:%S")
print(f"Range: ",maxTime-minTime)
# Range: 0:00:04