Home > OS >  Sum different rows in a data frame based on multiple conditions
Sum different rows in a data frame based on multiple conditions

Time:08-17

I have the following data frame:

dataset = {
     'String': ["ABCDEF","HIJABC","ABCHIJ","DEFABC"],
    'Bool':[True,True,False,False],
    'Number':[10,20,40,50]}

df = pd.DataFrame(dataset)



      String   Bool  Number
0  ABCDEF   True      10
1  HIJABC   True      20
2  ABCHIJ  False      40
3  DEFABC  False      50

I would like to sum the rows of the column Number where Bool is False to the rows where Bool is True:

The rows can be matched and summed together if the reverse of String of one row is not equal to the String of the row.

  • In this case ABCHIJ where bool is False is not equal to the reverse of ABCDEF so I sum the numbers: 10 40.

  • HIJABC where Bool is True is summed to DEFABC where Bool is False the outcome is 70

Expected output:

    String   Bool  Number
0  ABCDEF   True      50
1  HIJABC   True      70
2  ABCHIJ  False      40
3  DEFABC  False      50

I hope my explanation was good enough, is there a way to achieve the above outcome ?

CodePudding user response:

One way is like this:

df_true = df[df['Bool'] == True]
df_false = df[df['Bool'] == False]

for i in df_false['String']:
   idx = df_true[df_true['String'] != (i[3:]   i[:3]) ].index[0]
   current_num = df.loc[df.index == idx, 'Number'].values[0]
   added_num = df[df['String'] == i]['Number'].values[0]
   df.loc[df.index == idx, 'Number'] =  current_num   added_num

I hope it helps

  • Related