Check if unordered pair of two columns values is the same and reduce (directional sum) using groupby-CodePudding

I have a dataframe df:

Source	Dest	Value
A	B	10
A	B	7
B	A	6
A	C	12
A	C	5
D	E	5
E	D	6
F	G	1

I want to get the 'directional sum' of the values for the (source, dest) tuple. (i.e. combine all the X->Y and Y->X entries and take the weighted sum for the value column with weights equal to 1 and -1 respectively)

df_out:

Source	Dest	Value
A	B	11
A	C	17
D	E	-1
F	G	1

CodePudding user response：

We need np.sort then can trim the value with the diff

df1 = df.copy()
l = ['Source','Dest']
df1[l] = np.sort(df1[l].values,axis=1)
df1.loc[df1[l].ne(df[l]).all(1),'Value'] *= (-1)
df1 = df1.groupby(l)['Value'].sum().reset_index()
df1
Out[83]: 
  Source Dest  Value
0      A    B     11
1      A    C     17
2      D    E     -1
3      F    G      1

CodePudding user response：

You could groupby the sorted values, apply a -1 factor if the direction changed, and sum:

group = df[['Source', 'Dest']].apply(sorted, axis=1, result_type='expand')

(df.assign(Value=df['Value'].where(group[0].eq(df['Source']), -df['Value']))
   .groupby([group[0], group[1]])
   ['Value'].sum()
   .reset_index()
   .rename(columns={0: 'Source', 1:'Value'})
)

output:

  Source  Value  Value
0     A      B      11
1     A      C      17
2     D      E      -1
3     F      G       1