Counting weight of unique combinations within groups

I have the following dataframe:

    Group  from  to
     1      2    1
     1      1    2  
     1      3    2 
     1      3    1 
     2      1    4 
     2      3    1
     2      1    2
     2      3    1

I want to create a 4th column that counts the occurrences of each unique (from, to) combination within each group, and to drop any repeated combination within each group (keeping only one row per combination).

Expected output:

    Group  from  to weight
     1      2    1     1
     1      1    2     1
     1      3    2     1
     1      3    1     1
     2      1    4     1
     2      3    1     2
     2      1    2     1

In the expected output, the second (from 3, to 1) row in group 2 was dropped because it is a duplicate, and the remaining row gets a weight of 2.

CodePudding user response：

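In your case, you just need groupby with size. Note that groupby sorts the keys by default, so the rows come back in sorted order rather than input order:

# group on every column and count the rows in each group
out = df.groupby(df.columns.tolist()).size().to_frame(name='weight').reset_index()
print(out)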
   Group  from  to  weight
0      1     1   2       1
1      1     2   1       1
2      1     3   1       1
3      1     3   2       1
4      2     1   2       1
5      2     1   4       1
6      2     3   1       2
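
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the frame is named `df` and rebuilds it from the rows shown in the question; nothing else is added to the approach.

```python
import pandas as pd

# Rebuild the frame from the question (the variable name `df` is an assumption).
df = pd.DataFrame({
    "Group": [1, 1, 1, 1, 2, 2, 2, 2],
    "from":  [2, 1, 3, 3, 1, 3, 1, 3],
    "to":    [1, 2, 2, 1, 4, 1, 2, 1],
})

# Group on every column and count the rows in each group.
# groupby sorts the keys by default, so the result is in sorted key order.
out = (
    df.groupby(df.columns.tolist())
      .size()
      .to_frame(name="weight")
      .reset_index()
)
print(out)
```

The duplicated (Group 2, from 3, to 1) pair collapses into one row with weight 2; every other combination gets weight 1.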

CodePudding user response：

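You can group by the three columns with .groupby() and count the rows in each group with GroupBy.size(). Passing sort=False keeps the groups in order of first appearance, which matches the expected output: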

df_out = df.groupby(['Group', 'from', 'to'], sort=False).size().reset_index(name='weight')

Result:

print(df_out)

   Group  from  to  weight
0      1     2   1       1
1      1     1   2       1
2      1     3   2       1
3      1     3   1       1
4      2     1   4       1
5      2     3   1       2
6      2     1   2       1
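
If you would rather follow the wording of the question literally (add a column, then drop the repeats), one possible alternative is to combine transform('size') with drop_duplicates(). This is only a sketch under the assumption that the frame is named `df` and holds the rows from the question; `keys`, `weighted` and `df_alt` are illustrative names.

```python
import pandas as pd

# Rebuild the frame from the question (the variable name `df` is an assumption).
df = pd.DataFrame({
    "Group": [1, 1, 1, 1, 2, 2, 2, 2],
    "from":  [2, 1, 3, 3, 1, 3, 1, 3],
    "to":    [1, 2, 2, 1, 4, 1, 2, 1],
})

keys = ["Group", "from", "to"]

# Step 1: attach the size of each (Group, from, to) group to every row.
weighted = df.assign(weight=df.groupby(keys)["to"].transform("size"))

# Step 2: keep only the first occurrence of each combination.
df_alt = weighted.drop_duplicates(subset=keys).reset_index(drop=True)

# Reference result from grouping with sort=False, for comparison.
df_out = df.groupby(keys, sort=False).size().reset_index(name="weight")

print(df_alt)
print(df_out)
```

Both frames should hold the same rows in the same order. The groupby/size version is shorter; the transform version may be handy when the frame has extra columns that should survive on the first row of each combination.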