I have a pandas dataframe
like:
Reporter Partner Element Value
'A' 'B' 'import' a
'A' 'C' 'import' 0
'B' 'A' 'export' c
'B' 'C' 'export' d
'C' 'A' 'export' e
I want to compare those rows that one's reporter is the other's partner (one's element is 'import' and the other's is 'export') and combine the two rows to one row by applying a function to them. e.g. :
def combine(a, b):
if a['value']==0:
return b
if b['value']==0:
return a
if a['value'] == b['value']*(1.12):
return a
return b
How can I find those rows that function should be applied in pandas
?
the desired output is something like:
Reporter Partner Element Value
'A' 'B' 'import' a c
'A' 'C' 'import' e
'C' 'B' 'import' d
CodePudding user response:
I suppose you can make a joint index of reporter-partner:
df['Index'] = df[['Reporter', 'Partner']].apply(lambda x: '_'.join(sorted(x)), axis=1)
We make a sorted index where A and B, or B and A have the same value.
Then you can apply groupby
function and make further computations.
# | Reporter | Partner | Element | Value | Index |
---|---|---|---|---|---|
0 | A | B | import | a | A_B |
1 | A | C | import | 0 | A_C |
2 | B | A | export | c | A_B |
3 | B | B | export | d | B_B |
Feel free to ask more and I would be happy to help!
CodePudding user response:
I’m not sure I fully understand your question but you can use this sort of notation b[a['value']==0]
Which would result in a column of b
values which meet the condition a['value']==0