I'm trying to merge the values of a specific column when rows refer to the same source. For example, "l.instagram.com" and "instagram.com" are actually the same source, so I would like to merge the activeUsers of l.instagram.com into instagram.com.
Given:
sessionSource dateRange activeUsers
0 snapchat.com previous 1
1 snapchat.com current 1
2 l.instagram.com previous 71
3 l.instagram.com current 23
4 instagram.com previous 5
5 instagram.com current 0
Each sessionSource has one row for the "current" period and one for the "previous" period. I want to add the activeUsers of l.instagram.com to those of instagram.com, period by period, since they come from the same source.
The desired result would look like this:
sessionSource dateRange activeUsers
0 snapchat.com previous 1
1 snapchat.com current 1
4 instagram.com previous 76
5 instagram.com current 23
I have tried a few answers, but I couldn't get that result.
Thank you for your help.
CodePudding user response:
Replace the value 'l.instagram.com' with 'instagram.com':
df['sessionSource'] = df['sessionSource'].replace('l.instagram.com', 'instagram.com')
Then group by the columns 'sessionSource' and 'dateRange' and sum 'activeUsers':
sum_df = df.groupby(['sessionSource', 'dateRange']).agg({'activeUsers': 'sum'})
sum_df = sum_df.reset_index()
sum_df
This sum_df will give what you want. Hope this helps.
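For reference, here is a minimal self-contained sketch of the two steps above, using the sample data from the question. The DataFrame construction is an assumption about how the data is held, and passing sort=False and as_index=False to groupby is a choice made here (not part of the original answer) to keep the rows in their original order and skip the separate reset_index call.

```python
import pandas as pd

# Sample data copied from the question (assumed to live in a DataFrame like this).
df = pd.DataFrame({
    "sessionSource": [
        "snapchat.com", "snapchat.com",
        "l.instagram.com", "l.instagram.com",
        "instagram.com", "instagram.com",
    ],
    "dateRange": ["previous", "current"] * 3,
    "activeUsers": [1, 1, 71, 23, 5, 0],
})

# Step 1: map the alias onto the canonical source name.
df["sessionSource"] = df["sessionSource"].replace("l.instagram.com", "instagram.com")

# Step 2: sum activeUsers per (source, period).
# sort=False keeps groups in order of first appearance;
# as_index=False returns the group keys as regular columns.
sum_df = df.groupby(["sessionSource", "dateRange"], as_index=False, sort=False)["activeUsers"].sum()

print(sum_df)
```

Note that the result gets a fresh 0..3 index rather than the original labels 4 and 5 shown in the desired output. If there are several aliases, replace also accepts a dict mapping each alias to its canonical name.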