I have merged two dataframes, and I want to combine the rows that have duplicate values in the location column, combining their values in the performances column while keeping the latitude and longitude values. How could I do this?
CodePudding user response:
It depends on what exactly you want to do with the performances column. You said you want to combine the values? I will assume you meant "sum up" the values.
If you have a df with the columns location, lat, lng and performances, you will want to group by location, lat and lng, and sum up the performances:
df.groupby(['location', 'lat', 'lng']).performances.sum().reset_index()
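As a minimal sketch of how that plays out, here is the same call on a small made-up frame. The sample values are hypothetical (the real merged dataframe is not shown in the question); only the column names follow the line above.

```python
import pandas as pd

# Hypothetical sample data standing in for the merged dataframe.
df = pd.DataFrame({
    "location": ["London", "London", "Paris"],
    "lat": [51.5074, 51.5074, 48.8566],
    "lng": [-0.1278, -0.1278, 2.3522],
    "performances": [3, 2, 4],
})

# Rows that share the same location, lat and lng collapse into one row,
# and their performances values are added together.
combined = df.groupby(["location", "lat", "lng"]).performances.sum().reset_index()
print(combined)
```

Note that this only merges rows whose lat and lng are exactly equal. If the coordinates for the same location differ slightly between the two original dataframes, those rows will stay separate.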
CodePudding user response:
Group by the columns of interest (location, latitude, longitude) and take the sum:
df.groupby(['location', 'latitude', 'longitude']).sum()
Follow that up with .reset_index() if you don't want to keep the grouping columns in the index:
df.groupby(['location', 'latitude', 'longitude']).sum().reset_index()
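If it helps, pandas also lets you pass as_index=False to groupby, which keeps the grouping columns as regular columns without a separate .reset_index() call. The sketch below compares the two on made-up data; the sample values are hypothetical, and only the column names come from the answer above.

```python
import pandas as pd

# Hypothetical sample data standing in for the merged dataframe.
df = pd.DataFrame({
    "location": ["London", "London", "Paris"],
    "latitude": [51.5074, 51.5074, 48.8566],
    "longitude": [-0.1278, -0.1278, 2.3522],
    "performances": [3, 2, 4],
})

keys = ["location", "latitude", "longitude"]

# Variant from the answer: group, sum, then move the keys out of the index.
via_reset = df.groupby(keys).sum().reset_index()

# Alternative: ask groupby not to put the keys in the index at all.
via_flag = df.groupby(keys, as_index=False).sum()

print(via_reset)
print(via_flag)
```

Both variants sum every non-key column, so if the merged dataframe has other columns that should not be summed, select performances explicitly before calling .sum().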