I store datetimes in a pandas dataframe which look like dd/mm/yyyy hh:mm:ss
I want to drop all rows where values in column x
(datetime) are within 24 hours of one another.
On a 1 by 1 basis, I was previously doing this, which doesn't seem to work within the drop function:
df.drop(df[(df['d2'] - df['d1']).seconds / 3600 < 24].index)
>> AttributeError: 'Series' object has no attribute 'seconds'
CodePudding user response:
This should work
df.loc[ (df.d2 - df.d1) >= datetime.timedelta(days=1) ]
CodePudding user response:
the answer is very easy
import pandas as pd
df = pd.read_csv("test.csv")
df["d1"] = pd.to_datetime(df["d1"])
df["d2"] = pd.to_datetime(df["d2"])
now if you tried to subtract columns from each other
df["first"] - df["second"]
output will be in days and hence and as what @kaan suggested
df.loc[(df["d2"] - df["d1"]) >= pd.Timedelta(days=1)]