Home > front end >  How do I filter by a certain date and hour using Pandas dataframe in python
How do I filter by a certain date and hour using Pandas dataframe in python

Time:10-19

I am trying to find in a csv of a price chart the price values at a given period of time. I have converted the Datetime column into datetime data with the pd.to_datetime function, however I can not seem to find a method that allows me to filter the rows by separate dates hours and minutes. A typical row looks something like this.

Datetime        2021-10-15 19:55:00-04:00
Open                                40.15
High                                 40.2
Low                                 40.14
Close                               40.15
Volume                                  0
Dividends                               0
Stock Splits                            0
Name: 939, dtype: object
Empty DataFrame
Columns: [Datetime, Open, High, Low, Close, Volume, Dividends, Stock Splits]
Index: []

So far here is my code

import pandas as pd

data = pd.read_csv("Data\\09-16-21 AMC-5min", parse_dates=["Datetime"])

data["Datetime"] = pd.to_datetime(data['Datetime'])

newData = data[(data.Datetime.day == data.Datetime.day.max()) & data.Datetime.hour == 9 & data.Datetime.minute == 30]

print(newData)

in this example I am trying to find the data point on 9:30 of the most recent day. When I try to run this I get the following error

Traceback (most recent call last):
  File "C:\Users\Zach\PycharmProjects\Algotrading\Test.py", line 7, in <module>
    newData = data[(data.Datetime.day == data.Datetime.day.max()) & data.Datetime.hour == 9 & data.Datetime.minute == 30]
  File "C:\Users\Zach\PycharmProjects\Algotrading\venv\lib\site-packages\pandas\core\generic.py", line 5487, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'Series' object has no attribute 'day'

I'm not sure how to access the separate values for day, hour, and minutes. Any advice would be appreciated

CodePudding user response:

You need .dt accessor with () for second and third condition:

newData = data[(data.Datetime.dt.day == data.Datetime.dt.day.max()) & 
               (data.Datetime.dt.hour == 9) & 
               (data.Datetime.dt.minute == 30)]

For converting to days only once:

s = data.Datetime.dt.day
newData = data[(s == s.max()) & 
               (data.Datetime.dt.hour == 9) & 
               (data.Datetime.dt.minute == 30)]
  • Related