I am working with a CSV file that contains 13 years' worth of data at 5-minute intervals.
I am trying to slice sections of this dataset into specific time periods.
Example:
time_period = (df['time'] >= '01:00:00') & (df['time']<='5:00:00')
time_period_df = df.loc[time_period]
I expected the output to include only the times between 1 and 5, but I am getting all 24 hours in the output.
I would like the output to contain only the times between 1:00:00 and 5:00:00, inclusive.
CodePudding user response:
It looks like you are using the comparison operators >= and <= to specify the time range for your time period dataframe. Because the values in your time column are strings, these operators compare them character by character rather than as times. Your upper bound '5:00:00' is not zero-padded, so any string that starts with '0', '1' or '2' (for example '13:00:00') sorts before it, and every time from 01:00:00 onward passes the filter. One way around this, assuming the column holds 'HH:MM:SS' strings, is to use the str.slice() method to extract the two-digit hour portion of the time strings and compare it against zero-padded bounds.
Here is an example of how you could do this:
# First, extract the hour portion of the time strings
df['hour'] = df['time'].str.slice(0, 2)
# Next, create a boolean mask using the comparison operators on the 'hour' column
time_period = (df['hour'] >= '01') & (df['hour'] <= '05')
# Finally, use this boolean mask to create your time period dataframe
time_period_df = df.loc[time_period]
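This should give you a dataframe that includes only the rows whose hour is 01 through 05. Note that this keeps the whole 05 hour, so with 5-minute data it also includes 05:05:00 through 05:55:00. If you need to stop at exactly 05:00:00, compare the full time string against the zero-padded bounds '01:00:00' and '05:00:00' instead.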
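To check the behaviour on a small case, here is a minimal sketch that reproduces the original mask and contrasts it with zero-padded bounds on the full string. The sample data is made up for illustration; only the column name time comes from the question.

```python
import pandas as pd

# Hypothetical sample: hourly 'HH:MM:SS' strings for one day, plus 05:05:00.
times = [f"{h:02d}:00:00" for h in range(24)] + ["05:05:00"]
df = pd.DataFrame({"time": times})

# Original mask: '5:00:00' is not zero-padded, so every string starting
# with '0', '1' or '2' sorts before it and passes the upper bound.
original = df.loc[(df["time"] >= "01:00:00") & (df["time"] <= "5:00:00")]

# Zero-padded bounds: string order now matches chronological order.
fixed = df.loc[(df["time"] >= "01:00:00") & (df["time"] <= "05:00:00")]

print(len(original))           # everything except 00:00:00
print(fixed["time"].tolist())  # 01:00:00 through 05:00:00 only
```

With zero-padded strings on both sides, no extra hour column is needed, and the upper bound is exact.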
Note that this solution assumes that the time strings in your time column are in the format 'HH:MM:SS'. If the time strings are in a different format, you will need to adjust the str.slice() call accordingly.
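If the strings are not zero-padded (for example '1:00:00'), one alternative to adjusting the slice is to parse them into durations since midnight and compare those. The sketch below uses pd.to_timedelta and Series.between on a made-up frame; it assumes every value is a valid H:MM:SS or HH:MM:SS string.

```python
import pandas as pd

# Hypothetical sample with non-padded hours.
df = pd.DataFrame(
    {"time": ["0:55:00", "1:00:00", "4:30:00", "5:00:00", "5:05:00", "13:00:00"]}
)

# Parse each string as time elapsed since midnight.
elapsed = pd.to_timedelta(df["time"])

# between() is inclusive on both ends by default.
mask = elapsed.between(pd.Timedelta(hours=1), pd.Timedelta(hours=5))
time_period_df = df.loc[mask]

print(time_period_df["time"].tolist())
```

Comparing parsed values rather than text avoids any dependence on padding. If the dataframe is indexed by a DatetimeIndex, DataFrame.between_time may be another option worth trying.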