The problem:
I have a dataframe with a datetime column (stored in Python's datetime format) that contains readings at, for example, 2020-01-03T00:00:00.000Z, 2020-01-03T00:05:00.000Z and so on, until 2020-01-03T00:23:55.000Z, for different dates.
I want to filter the entire dataframe based on this column but only keep readings at every 0th, 15th, 30th, 45th minute.
I saw another question which did something similar with pd.date_range(start, freq='15T', periods=len(df)), but the question is not the same. Thank you.
CodePudding user response:
What about grouping your data into intervals and applying whatever aggregation/transform you need on top?
from datetime import datetime

import pandas as pd

df = pd.DataFrame(
    {
        "dt": [
            datetime(2022, 12, 29, 15, 23, 0),
            datetime(2022, 12, 29, 15, 38, 0),
            datetime(2022, 12, 29, 15, 43, 0),
            datetime(2022, 12, 29, 16, 11, 0),
        ],
        "dat": [1, 2, 3, 4],
    }
)

# Bucket the rows into 15-minute bins anchored at midnight, labelled by the bin's left edge
groups = df.groupby(
    pd.Grouper(key="dt", freq="15min", origin="start_day", label="left")
)
# Take the first row of each bin (empty bins give NaN)
groups.first()
gives

                     dat
dt
2022-12-29 15:15:00  1.0
2022-12-29 15:30:00  2.0
2022-12-29 15:45:00  NaN
2022-12-29 16:00:00  4.0
CodePudding user response:
I was able to do this in an easy and elegant way.
Let us assume the dataframe is called df and the column in question is called 'datetime'. Here is the solution:
# No extra import is needed: .dt is the pandas datetime accessor, not the datetime module
df['minute'] = df['datetime'].dt.minute
# Keep only the rows whose minute is 0, 15, 30 or 45
df = df[df['minute'].isin([0, 15, 30, 45])]
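One thing to watch with the minute check above: it keeps any row whose minute is 0, 15, 30 or 45, even if the seconds are not zero (for example 00:15:30). If that matters, a possible variant is to compare each timestamp with itself floored to 15 minutes. This is only a sketch on made-up sample data; the 'reading' column and the names on_quarter_hour and filtered are mine, not from the question.

```python
import pandas as pd

# Hypothetical sample data: one reading every 5 minutes over two days (UTC)
df = pd.DataFrame(
    {"datetime": pd.date_range("2020-01-03", periods=576, freq="5min", tz="UTC")}
)
df["reading"] = range(len(df))

# A timestamp sits exactly on a quarter hour when flooring it to 15 minutes
# leaves it unchanged (this also rejects non-zero seconds)
on_quarter_hour = df["datetime"].dt.floor("15min") == df["datetime"]
filtered = df[on_quarter_hour]
```

This does not add a helper column to df, and it works the same way across multiple dates.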
CodePudding user response:
You can resample the date index at 15-minute intervals and backfill the data.
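import pandas as pd

start_date = "12/1/2022"
end_date = "12/3/2022"

# Build a daily date column and use it as the index
df = pd.DataFrame(pd.date_range(start_date, end_date, freq="D"), columns=["Date"])
df['Date'] = df['Date'].astype('datetime64[ns]')
df.set_index('Date', inplace=True)

# Re-index at 15-minute steps ('15min' is the current spelling of the older '15T' alias)
df = df.asfreq('15min', method='bfill')
for item in df.index:
    print(item)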
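Applied to data like the question's, the same asfreq idea might look like the sketch below. It assumes the timestamps are the index, are unique and sorted, and start on a quarter hour; the sample data and the names idx and every_15 are made up for illustration. Note that asfreq builds a full 15-minute grid between the first and last timestamp, so any slot with no reading (for example a missing day) shows up as a NaN row unless you pass a fill method.

```python
import pandas as pd

# Hypothetical sample data: one reading every 5 minutes for one day
idx = pd.date_range("2020-01-03", periods=288, freq="5min")
df = pd.DataFrame({"reading": range(288)}, index=idx)

# Re-index onto a 15-minute grid; rows already on the grid are kept as they are
every_15 = df.asfreq("15min")
```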