Pandas dataframe drop multiple rows based on datetime difference

Time:08-20

I store datetimes in a pandas DataFrame in the format dd/mm/yyyy hh:mm:ss.

I want to drop all rows where the datetime values in two columns (d1 and d2) are within 24 hours of one another.

I was previously doing this check one row at a time, but the same expression doesn't work inside the drop function:

df.drop(df[(df['d2'] - df['d1']).seconds / 3600 < 24].index)
>> AttributeError: 'Series' object has no attribute 'seconds'

CodePudding user response:

This should work. It keeps only the rows whose datetimes are at least 24 hours apart:

import datetime
df.loc[(df.d2 - df.d1) >= datetime.timedelta(days=1)]
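
For what it's worth, the AttributeError in the question arises because .seconds is an attribute of a single Timedelta, not of a Series. On a Series the equivalent lives under the .dt accessor. Below is a minimal sketch of the drop-based version, assuming the columns are already datetime64; the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column names d1/d2 follow the question.
df = pd.DataFrame({
    "d1": pd.to_datetime(["01/08/2022 10:00:00", "01/08/2022 10:00:00"], dayfirst=True),
    "d2": pd.to_datetime(["01/08/2022 20:00:00", "03/08/2022 09:00:00"], dayfirst=True),
})

# Subtracting two datetime columns gives a Series of timedeltas.
diff = df["d2"] - df["d1"]

# .dt.total_seconds() includes whole days, unlike .dt.seconds,
# which only holds the seconds component (0 to 86399).
hours = diff.dt.total_seconds() / 3600

# Drop the rows where the two datetimes are less than 24 hours apart.
result = df.drop(df[hours < 24].index)
```

Note that .dt.seconds would give the wrong answer here, because it ignores the days part of each timedelta.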

CodePudding user response:

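The answer is straightforward. First, parse both columns as datetimes:

import pandas as pd
df = pd.read_csv("test.csv")
# The dates are dd/mm/yyyy, so tell pandas the day comes first;
# otherwise 01/08/2022 would be read as January 8.
df["d1"] = pd.to_datetime(df["d1"], dayfirst=True)
df["d2"] = pd.to_datetime(df["d2"], dayfirst=True)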

Now, if you subtract the columns from each other

df["d2"] - df["d1"]

the output is a Series of timedeltas (dtype timedelta64[ns]), so, as @kaan suggested, you can compare it directly with a Timedelta:

df.loc[(df["d2"] - df["d1"]) >= pd.Timedelta(days=1)]
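
To try this end to end without a file, here is a small self-contained sketch. The CSV contents are invented for illustration, and two choices are assumptions rather than part of the question: an explicit format string instead of inferred parsing, and .abs() so that "within 24 hours" is treated as applying in both directions (d2 earlier than d1 as well as later).

```python
import io
import pandas as pd

# Hypothetical stand-in for test.csv, using the dd/mm/yyyy hh:mm:ss layout.
csv_text = """d1,d2
01/08/2022 10:00:00,01/08/2022 20:00:00
01/08/2022 10:00:00,03/08/2022 09:00:00
05/08/2022 12:00:00,04/08/2022 18:00:00
"""
df = pd.read_csv(io.StringIO(csv_text))

# An explicit format avoids any day/month ambiguity.
fmt = "%d/%m/%Y %H:%M:%S"
df["d1"] = pd.to_datetime(df["d1"], format=fmt)
df["d2"] = pd.to_datetime(df["d2"], format=fmt)

# Absolute gap, so a negative difference (d2 before d1) is handled too.
gap = (df["d2"] - df["d1"]).abs()

# Keep only the rows whose datetimes are at least 24 hours apart.
kept = df.loc[gap >= pd.Timedelta(days=1)]
```

Without .abs(), the third row would also be dropped here, but only because a negative timedelta is always smaller than one day; taking the absolute value makes the intent explicit.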