Comparing two Dataframes and creating a third one where certain contions are met-CodePudding

I am trying to compare two different dataframes that have same column names and indexes(not numerical) and I need to obtain a third df with the biggest value for the row with same column name.

example:

df1=

| | col_1 | col2 | col-3 |

| rft_12312 | 4 | 7 | 4 |

| rft_321321 | 3 | 4 | 1 |

df2=

| | col_1 | col2 | col-3 |

| rft_12312 | 7 | 3 | 4 |

| rft_321321 | 3 | 7 | 6 |

required result:

| | col_1 | col2 | col-3 |

| rft_12312 | 7 (because df2.value in this [row :column] >df1.value) | 7 | 4 |

| rft_321321 | 3(when they are equal doesn't matter from which column is the value) | 7 | 6 |

I've already tried pd.update with filter_func defined as:

def filtration_function(val1,val2):
    if val1 >= val2:
        return val1
    else:
        return val2

but is not working. I need the check for each column with same name. 


also **pd.compare** but does not allow me to pick the right values. 




Thank you in advance :)

CodePudding user response：

I think one possibility would be to use "combine". This method generates an element-wise comparsion between the two dataframes and returns the maximum value of each element.

Example:

import pandas as pd

def filtration_function(val1, val2):
    return max(val1, val2)

result = df1.combine(df2, filtration_function)

I think method "where" can work to:

import pandas as pd

result = df1.where(df1 >= df2, df2)