I have two similar dataframes like this:
Dataframe 1:
ID classification
1 MISS
2 MISS
3 CORRECT
4 MISS
5 CORRECT
Dataframe 2:
ID classification
1 CORRECT
2 CORRECT
3 MISS
4 MISS
5 CORRECT
I would like to get the index numbers of every row where the value in the classification column differs between dataframe 1 and dataframe 2. The dataframes are of similar length and the remaining columns are also equal to each other.
CodePudding user response:
Because both DataFrames have the same number of rows and the same index, you can compare the classification columns for inequality with Series.ne and filter the result with boolean indexing:
# ID is the index
df1.index[df1['classification'].ne(df2['classification'])]
Or, if ID is a column:
df1.loc[df1['classification'].ne(df2['classification']), 'ID']
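As a quick check, here is a minimal sketch that rebuilds the two sample DataFrames from the question and runs both variants. The variable names (mask, ids_from_column, ids_from_index) are only illustrative, and it assumes ID holds the values 1 to 5 as shown in the question.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "classification": ["MISS", "MISS", "CORRECT", "MISS", "CORRECT"],
})
df2 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "classification": ["CORRECT", "CORRECT", "MISS", "MISS", "CORRECT"],
})

# Variant 1: ID is a regular column; the default RangeIndex is identical in both frames
mask = df1["classification"].ne(df2["classification"])
ids_from_column = df1.loc[mask, "ID"].tolist()

# Variant 2: ID is the index in both frames
a = df1.set_index("ID")
b = df2.set_index("ID")
ids_from_index = a.index[a["classification"].ne(b["classification"])].tolist()

print(ids_from_column)  # [1, 2, 3]
print(ids_from_index)   # [1, 2, 3]
```

Rows 4 and 5 have the same classification in both frames, so only IDs 1, 2 and 3 are reported.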
If the DataFrames do not have the same number of rows, use Series.map to look up df2's classification by ID; here ID is a column:
s = df2.set_index('ID')['classification']
df1.loc[df1['classification'].ne(df1['ID'].map(s)), 'ID']
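To illustrate the unequal-length case, here is a small sketch in which df2 is missing ID 4. That gap is a made-up assumption for the example, not part of the question's data, and the names mapped, mismatch_ids and strict_ids are illustrative. It also assumes the IDs in df2 are unique, since mapping through a Series needs a unique index.

```python
import pandas as pd

df1 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "classification": ["MISS", "MISS", "CORRECT", "MISS", "CORRECT"],
})
# Hypothetical shorter frame: the row for ID 4 is absent
df2 = pd.DataFrame({
    "ID": [1, 2, 3, 5],
    "classification": ["CORRECT", "CORRECT", "MISS", "CORRECT"],
})

# Lookup table from ID to df2's classification
s = df2.set_index("ID")["classification"]

# For each ID in df1, fetch df2's classification (NaN when the ID is not in df2)
mapped = df1["ID"].map(s)

# IDs missing from df2 compare as "not equal", so they are reported too
mismatch_ids = df1.loc[df1["classification"].ne(mapped), "ID"].tolist()

# Optional: only report IDs that exist in both frames
strict_ids = df1.loc[df1["classification"].ne(mapped) & mapped.notna(), "ID"].tolist()

print(mismatch_ids)  # [1, 2, 3, 4]
print(strict_ids)    # [1, 2, 3]
```

Whether an ID that is absent from df2 should count as a mismatch depends on your use case; the notna() filter drops those rows if it should not.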