Filter datetime column by every 15 minute interval

Time:12-30

The problem: I have a dataframe with a datetime column (stored as Python datetimes) that holds one reading per timestamp, for example 2020-01-03T00:00:00.000Z, 2020-01-03T00:05:00.000Z and so on, until 2020-01-03T00:23:55.000Z, for different dates.

I want to filter the entire dataframe on this column and keep only the readings at the 0th, 15th, 30th and 45th minute of each hour.

I saw another question that did something similar with pd.date_range(start, freq='15T', periods=len(df)), but it does not solve the same problem. Thank you.

CodePudding user response:

What about grouping your data into 15-minute intervals and applying whatever aggregation or transform you need on top?

from datetime import datetime

import pandas as pd

df = pd.DataFrame(
    {
        "dt": [
            datetime(2022, 12, 29, 15, 23, 0),
            datetime(2022, 12, 29, 15, 38, 0),
            datetime(2022, 12, 29, 15, 43, 0),
            datetime(2022, 12, 29, 16, 11, 0),
        ],
        "dat": [1, 2, 3, 4],
    }
)

groups = df.groupby(
    pd.Grouper(key="dt", freq="15min", origin="start_day", label="left")
)

groups.first()

gives

                     dat
dt                      
2022-12-29 15:15:00  1.0
2022-12-29 15:30:00  2.0
2022-12-29 15:45:00  NaN
2022-12-29 16:00:00  4.0
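
Note that the grouped result is labelled with the start of each bin, not with the original timestamps, and that empty bins show up as NaN. If the goal is to keep only rows that already sit exactly on a 15-minute mark, one possible variant (a swapped-in technique, not the grouping above) is to compare each timestamp with its own floor. A minimal sketch, assuming hypothetical sample data and the same column names "dt" and "dat":

```python
import pandas as pd

# Hypothetical sample data, not the frame from the answer above.
df = pd.DataFrame(
    {
        "dt": pd.to_datetime(
            [
                "2022-12-29 15:00:00",
                "2022-12-29 15:05:00",
                "2022-12-29 15:15:00",
                "2022-12-29 15:23:00",
                "2022-12-29 15:30:00",
            ]
        ),
        "dat": [1, 2, 3, 4, 5],
    }
)

# A timestamp sits on a 15-minute boundary exactly when flooring leaves it unchanged.
on_boundary = df["dt"] == df["dt"].dt.floor("15min")
kept = df[on_boundary]
print(kept)
```

This keeps the original rows and timestamps untouched, and it also rejects readings with non-zero seconds such as 15:15:30.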

CodePudding user response:

I was able to do this in an easy and elegant way.

Let us assume the dataframe is called df and the column in question is called 'datetime'. Here is the solution:

# No extra import is needed: .dt is the pandas datetime accessor, not the datetime module.
# Assumes df['datetime'] already has a datetime dtype.
df['minute'] = df['datetime'].dt.minute
df = df[df['minute'].isin([0, 15, 30, 45])]
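
For completeness, here is how the same idea might look end to end when the column starts out as ISO 8601 strings like the ones quoted in the question. This is only a sketch: the sample rows and the "reading" column are made up for illustration, and it uses a boolean mask instead of a helper column.

```python
import pandas as pd

# Hypothetical input: timestamps stored as ISO 8601 strings, as quoted in the question.
df = pd.DataFrame(
    {
        "datetime": [
            "2020-01-03T00:00:00.000Z",
            "2020-01-03T00:05:00.000Z",
            "2020-01-03T00:15:00.000Z",
            "2020-01-03T00:40:00.000Z",
            "2020-01-03T00:45:00.000Z",
        ],
        "reading": [10, 11, 12, 13, 14],
    }
)

# Convert to timezone-aware datetimes so the .dt accessor is available.
df["datetime"] = pd.to_datetime(df["datetime"], utc=True)

# Boolean mask instead of a helper column.
mask = df["datetime"].dt.minute.isin([0, 15, 30, 45])
filtered = df[mask]
print(filtered)
```

The minute check alone would also keep a reading at, say, 00:15:30. If that matters, the mask can be combined with df["datetime"].dt.second == 0.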

CodePudding user response:

You can resample the date index at 15-minute intervals and backfill the data.

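import pandas as pd

start_date = "12/1/2022"
end_date = "12/3/2022"
# Daily timestamps; pd.date_range already returns datetime64[ns] values.
df = pd.DataFrame(pd.date_range(start_date, end_date, freq="D"), columns=["Date"])
df = df.set_index("Date")
# Upsample to a 15-minute grid; bfill fills each new row from the next existing one.
# ("15T" is the older alias for "15min".)
df = df.asfreq("15min", method="bfill")
for item in df.index:
    print(item)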
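
The snippet above upsamples daily dates. Applied to data like the question's, where readings already arrive every 5 minutes, asfreq would instead pick out the rows that fall on the 15-minute grid. A rough sketch, assuming a hypothetical "reading" column and a regular index whose first timestamp is on a quarter-hour mark:

```python
import pandas as pd

# Hypothetical data: one reading every 5 minutes across a single day.
index = pd.date_range("2020-01-03", periods=288, freq="5min", name="datetime")
df = pd.DataFrame({"reading": range(288)}, index=index)

# asfreq reindexes onto a regular 15-minute grid starting at the first timestamp.
# Rows already on the grid are kept as they are; no fill method is needed here.
sampled = df.asfreq("15min")
print(sampled.head())
```

Two caveats: the grid starts at the first timestamp in the index, so if that one is not on a quarter-hour mark the result will be offset, and any grid point with no matching reading appears as a NaN row unless a fill method is passed.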