Home > Enterprise >  Check if unordered pair of two columns values is the same and reduce (directional sum) using groupby
Check if unordered pair of two columns values is the same and reduce (directional sum) using groupby

Time:10-18

I have a dataframe df:

Source Dest Value
A B 10
A B 7
B A 6
A C 12
A C 5
D E 5
E D 6
F G 1

I want to get the 'directional sum' of the values for the (source, dest) tuple. (i.e. combine all the X->Y and Y->X entries and take the weighted sum for the value column with weights equal to 1 and -1 respectively)

df_out:

Source Dest Value
A B 11
A C 17
D E -1
F G 1

CodePudding user response:

We need np.sort then can trim the value with the diff

df1 = df.copy()
l = ['Source','Dest']
df1[l] = np.sort(df1[l].values,axis=1)
df1.loc[df1[l].ne(df[l]).all(1),'Value'] *= (-1)
df1 = df1.groupby(l)['Value'].sum().reset_index()
df1
Out[83]: 
  Source Dest  Value
0      A    B     11
1      A    C     17
2      D    E     -1
3      F    G      1

CodePudding user response:

You could groupby the sorted values, apply a -1 factor if the direction changed, and sum:

group = df[['Source', 'Dest']].apply(sorted, axis=1, result_type='expand')

(df.assign(Value=df['Value'].where(group[0].eq(df['Source']), -df['Value']))
   .groupby([group[0], group[1]])
   ['Value'].sum()
   .reset_index()
   .rename(columns={0: 'Source', 1:'Value'})
)

output:

  Source  Value  Value
0     A      B      11
1     A      C      17
2     D      E      -1
3     F      G       1
  • Related