Home > Software design >  Pandas Dataframe return rows where state, city, and date occur multiple times
Pandas Dataframe return rows where state, city, and date occur multiple times

Time:05-22

Firstly, this is my first post, so my apologies if it is formatted poorly.

So I have this dataframe which I have attached a picture of. It contains UFO sightings and I want to return the rows where the if the city and state are the same and then also if the dates are the same. I am trying to find sightings that occurred on the same day in the same city and state. Please let me know if more info is required. Thank you in advance!

enter image description here

CodePudding user response:

Try this.

# Create a column converting date_time to just date
df['date'] = pd.to_datetime(df['occurred_date_time']).dt.normalize()

# groupby and count times where date, city and state 
# then create boolean series where count is greater than 1
m = df.groupby(['date','city','state']).count() > 1

# boolean filter the dataframe rows with that series, m.
df[m]

CodePudding user response:

Alternative, boolean indexing to keep the duplicated rows:

df['date'] = pd.to_datetime(df['occurred_date_time']).dt.normalize()

df2 = df[df.duplicated(['date','city','state'], keep=False)]

If you don't want the new column:

df2 = df[df.assign(date=pd.to_datetime(df['occurred_date_time'])
                          .dt.normalize())
           .duplicated(['date','city','state'], keep=False)]
  • Related