Merge rows based on 2 field match

Time:11-26

I'm trying to merge the values of a specific column when rows refer to the same source. For example, "l.instagram.com" and "instagram.com" are actually the same source, so I would like to add the activeUsers of l.instagram.com to instagram.com.

Given:

     sessionSource dateRange  activeUsers  
0     snapchat.com  previous            1
1     snapchat.com   current            1
2  l.instagram.com  previous           71
3  l.instagram.com   current           23
4    instagram.com  previous            5
5    instagram.com   current            0

Each sessionSource has one row for the "current" period and one for the "previous" period. I want to add the activeUsers of l.instagram.com to those of instagram.com for each period, since they come from the same source.

The desired result would look like this:

     sessionSource dateRange  activeUsers  
0     snapchat.com  previous            1
1     snapchat.com   current            1
4    instagram.com  previous           76
5    instagram.com   current           23

I have tried a few answers, but I couldn't get that result.

Thank you for your help.

CodePudding user response:

Replace the value 'l.instagram.com' with 'instagram.com':

df['sessionSource'] = df['sessionSource'].replace('l.instagram.com', 'instagram.com')
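If more than one variant has to be folded into a canonical source, replace also accepts a dict. The sketch below is only an illustration: the `aliases` table and the `sources` Series are made-up names, and it assumes you want whole-value matches (which is what replace does by default), not substring matches.

```python
import pandas as pd

# Small stand-in for the sessionSource column from the question.
sources = pd.Series(["snapchat.com", "l.instagram.com", "instagram.com"])

# Hypothetical alias table: key = variant to fold in, value = canonical source.
aliases = {"l.instagram.com": "instagram.com"}

# With a dict, replace swaps each value that exactly equals a key;
# values that are not listed are left untouched.
normalized = sources.replace(aliases)
print(normalized.tolist())
```

Only entries listed in the dict change, so unrelated sources such as snapchat.com pass through as they are.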

And then group by the columns 'sessionSource' and 'dateRange' and sum 'activeUsers':

sum_df = df.groupby(['sessionSource', 'dateRange']).agg({'activeUsers': 'sum'})
sum_df = sum_df.reset_index()
sum_df

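sum_df now holds the merged totals. Note that groupby sorts the group keys by default, so the rows come out in alphabetical order rather than the original order, and the index is renumbered from 0. Pass sort=False to groupby if you want to keep the order of first appearance. Hope this helps.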
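For reference, here is a self-contained sketch of the whole approach, rebuilding the frame from the question by hand. It assumes the column is spelled 'dateRange' as in the question's table, and it uses sort=False and as_index=False (my additions, not part of the answer above) to keep the original row order and get a flat result without reset_index.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "sessionSource": ["snapchat.com", "snapchat.com",
                      "l.instagram.com", "l.instagram.com",
                      "instagram.com", "instagram.com"],
    "dateRange": ["previous", "current"] * 3,
    "activeUsers": [1, 1, 71, 23, 5, 0],
})

# Step 1: fold the variant into the canonical source name.
df["sessionSource"] = df["sessionSource"].replace("l.instagram.com", "instagram.com")

# Step 2: sum activeUsers per (source, period).
# sort=False keeps groups in order of first appearance;
# as_index=False returns the keys as ordinary columns.
sum_df = df.groupby(["sessionSource", "dateRange"],
                    as_index=False, sort=False)["activeUsers"].sum()
print(sum_df)
```

The instagram.com rows end up with 76 (71 + 5) for previous and 23 (23 + 0) for current, matching the desired result, although the index runs 0 to 3 instead of keeping the original labels 4 and 5.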
