Firstly, this is my first post, so my apologies if it is formatted poorly.
So I have this dataframe which I have attached a picture of. It contains UFO sightings and I want to return the rows where the if the city and state are the same and then also if the dates are the same. I am trying to find sightings that occurred on the same day in the same city and state. Please let me know if more info is required. Thank you in advance!
CodePudding user response:
Try this.
# Create a column converting date_time to just date
df['date'] = pd.to_datetime(df['occurred_date_time']).dt.normalize()
# groupby and count times where date, city and state
# then create boolean series where count is greater than 1
m = df.groupby(['date','city','state']).count() > 1
# boolean filter the dataframe rows with that series, m.
df[m]
CodePudding user response:
Alternative, boolean indexing to keep the duplicated
rows:
df['date'] = pd.to_datetime(df['occurred_date_time']).dt.normalize()
df2 = df[df.duplicated(['date','city','state'], keep=False)]
If you don't want the new column:
df2 = df[df.assign(date=pd.to_datetime(df['occurred_date_time'])
.dt.normalize())
.duplicated(['date','city','state'], keep=False)]