I've converted the time column with type 'str' to datetime64:
try:
    heart_rate_seconds['time'] = pd.to_datetime(heart_rate_seconds['time'], format = "%m/%d/%Y %I:%M:%S %p")
except:
    heart_rate_seconds['time'] = pd.to_datetime(heart_rate_seconds['time'], format = "%Y/%m/%d %H:%M:%S")
How do I filter the dataframe so that it only contains rows where the time column has minutes and seconds equal to zero?
This is what I've tried:
type(heart_rate_seconds['time'][0])
This displays pandas._libs.tslibs.timestamps.Timestamp
test = heart_rate_seconds['time'][1].second
test
This works
heart_rate_hourly = heart_rate_seconds.loc[heart_rate_seconds['time'].dt.second() == 0
and heart_rate_seconds['time'].dt.minute() == 0]
This gets an error: TypeError: 'Series' object is not callable
heart_rate_hourly = heart_rate_seconds[heart_rate_seconds['time'].dt.strftime("%M") == 0
and heart_rate_seconds['time'].dt.strftime("%S") == 0]
This gets an error: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
on_the_hour = []
for time in heart_rate_seconds['time']:
    if time.second == 0 and time.minute == 0:
        on_the_hour.append(time)
on_the_hour[:5]
This works
on_the_hour = []
for row in heart_rate_seconds:
    time = row[1]
    if time.second == 0 and time.minute == 0:
        on_the_hour.append(time)
on_the_hour[:5]
This doesn't work - AttributeError: 'str' object has no attribute 'second'
Thanks
CodePudding user response:
You can use the dt accessor to create boolean masks for certain properties. For example:
import pandas as pd
df = pd.DataFrame({'time': ['01/03/2021 03:30:00 AM', '2021/02/04 13:00:00'],
                   'values': [0, 1]})
# note: pandas < 2.0 infers the format of each element automatically;
# on pandas >= 2.0, pass format="mixed" for mixed-format strings like these
df['time'] = pd.to_datetime(df['time'])
# a mask where min and sec are zero:
m = (df['time'].dt.minute == 0) & (df['time'].dt.second == 0)
print(df[m])
#                  time  values
# 1 2021-02-04 13:00:00       1
Note that you do not have to assign the conditionals to a variable to be able to use them, and of course you can also use them within loc.
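As a rough sketch of the loc variant applied to the question's dataframe: the sample rows and the value column below are made up for illustration, and only the heart_rate_seconds name and its time column come from the question. It also shows why the two attempts in the question failed. dt.second and dt.minute are attributes, so calling them with () raises the TypeError, and Python's `and` cannot combine two Series, which is what triggers the "truth value of a Series is ambiguous" ValueError. Use `&` instead, with each comparison in parentheses.

```python
import pandas as pd

# Hypothetical sample data standing in for the question's heart_rate_seconds
heart_rate_seconds = pd.DataFrame({
    "time": pd.to_datetime(
        ["2021-02-04 12:59:55", "2021-02-04 13:00:00",
         "2021-02-04 13:00:05", "2021-02-04 14:00:00"],
        format="%Y-%m-%d %H:%M:%S",
    ),
    "value": [71, 72, 73, 74],  # made-up readings
})

t = heart_rate_seconds["time"]

# .dt.minute and .dt.second are attributes (no parentheses);
# combine the element-wise comparisons with & and wrap each in parentheses
heart_rate_hourly = heart_rate_seconds.loc[(t.dt.minute == 0) & (t.dt.second == 0)]

print(heart_rate_hourly)
```

The strftime attempt would also need changing even with `&`, because strftime returns strings such as "00", which never compare equal to the integer 0.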