Home > Enterprise >  How to correctly find which rows from a df have a time value that matches with a time interval on Py
How to correctly find which rows from a df have a time value that matches with a time interval on Py

Time:03-13

I have the following df, from which every cell in it (except index ones) are string types :

    Time    Currency    Volatility expected         Event
0   02:00     GBP   Low Volatility Expected         Construction Output (MoM) (Jan)
1   02:00     GBP   Low Volatility Expected         U.K. Construction Output (YoY) (Jan)
2   02:00     GBP   High Volatility Expected        GDP (YoY)
3   02:00     GBP   High Volatility Expected        GDP (MoM)
4   02:00     GBP   Low Volatility Expected         Index of Services
5   02:00     GBP   Low Volatility Expected         Industrial Production (YoY) (Jan)
6   02:00     GBP   Moderate Volatility Expected    Industrial Production (MoM) (Jan)
7   02:00     GBP   High Volatility Expected        Manufacturing Production (MoM) (Jan)
8   02:00     GBP   Low Volatility Expected         Manufacturing Production (YoY) (Jan)
9   02:00     GBP   High Volatility Expected        Monthly GDP 3M/3M Change
10  02:00     GBP   Moderate Volatility Expected    Trade Balance (Jan)
11  02:00     GBP   Moderate Volatility Expected    Trade Balance Non-EU (Jan)
12  02:00     EUR   Moderate Volatility Expected    German CPI (MoM) (Feb)
13  02:00     EUR   Low Volatility Expected         German CPI (YoY) (Feb)
14  02:00     EUR   Low Volatility Expected         German HICP (MoM) (Feb)
15  02:00     EUR   Low Volatility Expected         German HICP (YoY) (Feb)
16  03:00     EUR   Moderate Volatility Expected    Spanish CPI (YoY) (Feb)
17  03:00     EUR   Low Volatility Expected         Spanish CPI (MoM) (Feb)
18  03:00     EUR   Moderate Volatility Expected    Spanish HICP (YoY) (Feb)
19  03:00     EUR   Low Volatility Expected         Spanish HICP (MoM) (Feb)
20  03:00     CNY   Low Volatility Expected         Chinese Total Social Financing (Feb)
21  03:01     CNY   Low Volatility Expected         M2 Money Stock (YoY) (Feb)
22  03:01     CNY   Moderate Volatility Expected    New Loans (Feb)
23  03:01     CNY   Low Volatility Expected         Outstanding Loan Growth (YoY) (Feb)
24  04:30     GBP   Low Volatility Expected         Inflation Expectations
25  05:00     EUR   High Volatility Expected        EU Leaders Summit  
26  05:10     EUR   Low Volatility Expected         Italian 15-Year BTP Auction
27  05:10     EUR   Low Volatility Expected         Italian 3-Year BTP Auction
28  05:10     EUR   Low Volatility Expected         Italian 7-Year BTP Auction
29  06:00     EUR   Low Volatility Expected         Spanish Consumer Confidence
30  06:30     INR   Low Volatility Expected         Bank Loan Growth
31  06:30     INR   Low Volatility Expected         Deposit Growth
32  06:30     INR   Low Volatility Expected         FX Reserves, USD
33  07:00     INR   Low Volatility Expected         Cumulative Industrial Production (Jan)
34  07:00     INR   Low Volatility Expected         Industrial Production (YoY) (Jan)
35  07:00     INR   Low Volatility Expected         Manufacturing Output (MoM) (Jan)
36  07:00     BRL   Moderate Volatility Expected    CPI (YoY) (Feb)
37  07:00     BRL   Moderate Volatility Expected    CPI (MoM) (Feb)
38  08:06     BRL   Moderate Volatility Expected    Brazilian IPCA Inflation Index SA (MoM) (Feb)
39  08:30     CAD   Low Volatility Expected         Capacity Utilization Rate (Q4)
40  08:30     CAD   High Volatility Expected        Employment Change (Feb)
41  08:30     CAD   Low Volatility Expected         Full Employment Change (Feb)
42  08:30     CAD   Low Volatility Expected         Part Time Employment Change (Feb)
43  08:30     CAD   Low Volatility Expected         Participation Rate (Feb)
44  08:30     CAD   Moderate Volatility Expected    Unemployment Rate (Feb)
45  10:00     USD   Low Volatility Expected         Michigan 5-Year Inflation Expectations (Mar)  

From that df, I'm only interested in those rows (if there are any) that matches with this time interval (24 hours):

  • Starting at 04:00
  • Finishing at 08:59

As every single cell in the column Time contains string values, I created the following function to convert any of those values into datetime.time objects:

import datetime

def convert_string_to_time(str):
    if len(str) < 5 and len(str) > 3:
        return datetime.time(hour=int(str[0]), minute=int(str[2:4]))
    elif len(str) == 5:
        return datetime.time(hour=int(str[0:2]), minute=int(str[3:5]))
    else:
        return 'not a valid string time'

Example of use:

time1 = '04:35'

timestamp1 = convert_string_to_time(time1)

print(type(timestamp1))

print(timestamp1)

Output:

<class 'datetime.time'>

04:35:00

But now I'm stuck at the part where I need to use the function above to create the following output and save it in a sub_df:

    Time    Currency    Volatility expected        Event
24  04:30     GBP   Low Volatility Expected        Inflation Expectations
25  05:00     EUR   High Volatility Expected       EU Leaders Summit  
26  05:10     EUR   Low Volatility Expected        Italian 15-Year BTP Auction
27  05:10     EUR   Low Volatility Expected        Italian 3-Year BTP Auction
28  05:10     EUR   Low Volatility Expected        Italian 7-Year BTP Auction
29  06:00     EUR   Low Volatility Expected        Spanish Consumer Confidence
30  06:30     INR   Low Volatility Expected        Bank Loan Growth
31  06:30     INR   Low Volatility Expected        Deposit Growth
32  06:30     INR   Low Volatility Expected        FX Reserves, USD
33  07:00     INR   Low Volatility Expected        Cumulative Industrial Production (Jan)
34  07:00     INR   Low Volatility Expected        Industrial Production (YoY) (Jan)
35  07:00     INR   Low Volatility Expected        Manufacturing Output (MoM) (Jan)
36  07:00     BRL   Moderate Volatility Expected   CPI (YoY) (Feb)
37  07:00     BRL   Moderate Volatility Expected   CPI (MoM) (Feb)
38  08:06     BRL   Moderate Volatility Expected   Brazilian IPCA Inflation Index SA (MoM)(Feb)
39  08:30     CAD   Low Volatility Expected        Capacity Utilization Rate (Q4)
40  08:30     CAD   High Volatility Expected       Employment Change (Feb)
41  08:30     CAD   Low Volatility Expected        Full Employment Change (Feb)
42  08:30     CAD   Low Volatility Expected        Part Time Employment Change (Feb)
43  08:30     CAD   Low Volatility Expected        Participation Rate (Feb)
44  08:30     CAD   Moderate Volatility Expected   Unemployment Rate (Feb)

I don't know how to iterate vertically over the Time column to apply the convert_string_to_time(str) function to get only those rows that match with the desired time interval and store them in a new df called sub_df, may I get some assistance here?

CodePudding user response:

It's actually simpler than you think. Just use pd.to_datetime to convert the times to datetime objects, and then use pd.Series.between

mask = pd.to_datetime(df['Time']).between('4:30', '8:59')
filtered = df[mask]

Output:

>>> filtered
     Time Currency           Volatility expected                                          Event
24  04:30      GBP       Low Volatility Expected                         Inflation Expectations
25  05:00      EUR      High Volatility Expected                            EU Leaders Summit  
26  05:10      EUR       Low Volatility Expected                    Italian 15-Year BTP Auction
27  05:10      EUR       Low Volatility Expected                     Italian 3-Year BTP Auction
28  05:10      EUR       Low Volatility Expected                     Italian 7-Year BTP Auction
29  06:00      EUR       Low Volatility Expected                    Spanish Consumer Confidence
30  06:30      INR       Low Volatility Expected                               Bank Loan Growth
31  06:30      INR       Low Volatility Expected                                 Deposit Growth
32  06:30      INR       Low Volatility Expected                               FX Reserves, USD
33  07:00      INR       Low Volatility Expected         Cumulative Industrial Production (Jan)
34  07:00      INR       Low Volatility Expected              Industrial Production (YoY) (Jan)
35  07:00      INR       Low Volatility Expected               Manufacturing Output (MoM) (Jan)
36  07:00      BRL  Moderate Volatility Expected                                CPI (YoY) (Feb)
37  07:00      BRL  Moderate Volatility Expected                                CPI (MoM) (Feb)
38  08:06      BRL  Moderate Volatility Expected  Brazilian IPCA Inflation Index SA (MoM) (Feb)
39  08:30      CAD       Low Volatility Expected                 Capacity Utilization Rate (Q4)
40  08:30      CAD      High Volatility Expected                        Employment Change (Feb)
41  08:30      CAD       Low Volatility Expected                   Full Employment Change (Feb)
42  08:30      CAD       Low Volatility Expected              Part Time Employment Change (Feb)
43  08:30      CAD       Low Volatility Expected                       Participation Rate (Feb)
44  08:30      CAD  Moderate Volatility Expected                        Unemployment Rate (Feb)
  • Related