My requirement is to check whether all the rows of one DataFrame are present in another. I have two DataFrames, shown below:
dfActual
dfExpected
I want to check whether all the rows in dfExpected are present in dfActual: if they all are, the result should be True, otherwise False. I have tried the following:
dfActual['Bool_col'] = (dfActual.merge(dfExpected,
how='left',
indicator=True)
.eval('_merge == "both"'))
But it always returns False. Could anyone please help me with this?
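Note that the snippet above assigns a per-row result to a column of dfActual rather than producing a single True/False. A minimal sketch of one way to get a single answer is to merge from the dfExpected side and reduce with `.all()`. The sample data and the helper name `all_rows_present` are hypothetical, since the original frames were posted as images.

```python
import pandas as pd

# Hypothetical sample data standing in for the frames shown in the question.
dfActual = pd.DataFrame({
    'NAME': ['AAA', 'BBB', 'CCC', 'DDD'],
    'AGE': [24, 25, 26, 27],
    'LIMIT': [2, 3, 4, 5],
})
dfExpected = pd.DataFrame({
    'NAME': ['AAA', 'CCC'],
    'AGE': [24, 26],
    'LIMIT': [2, 4],
})


def all_rows_present(expected, actual):
    """Return True if every row of `expected` also occurs in `actual`."""
    # Left merge on all shared columns; `_merge` is 'both' for matched rows
    # and 'left_only' for rows of `expected` that are missing from `actual`.
    merged = expected.merge(actual.drop_duplicates(), how='left', indicator=True)
    return bool((merged['_merge'] == 'both').all())


result = all_rows_present(dfExpected, dfActual)

# A row that does not exist in dfActual makes the check fail.
not_there = pd.DataFrame({'NAME': ['ZZZ'], 'AGE': [24], 'LIMIT': [2]})
missing = all_rows_present(not_there, dfActual)
```

This assumes both frames share the same column names and compatible dtypes; the merge only matches on columns the two frames have in common.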
CodePudding user response:
It works correctly for me, as you can see in the attached image. Please check your DataFrames; the mistake is probably somewhere else.
Even if you comment out that column, the result is still correct, as this image shows.
CodePudding user response:
You can do:
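import numpy as np
# Convert both frames to lists of rows (each row is a list of values).
actual_list = dfActual.to_numpy().tolist()
expected_list = dfExpected.to_numpy().tolist()
# True only if every expected row also appears among the actual rows.
all_rows_present = np.all([sub_list in actual_list
                           for sub_list in expected_list])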
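The list-based check above scans every actual row for each expected row. For larger frames, a rough alternative is to put the actual rows into a set of tuples first. This is only a sketch: the function name `rows_subset` and the sample frames are hypothetical, and it assumes both frames have the same columns in the same order and hashable cell values.

```python
import pandas as pd


def rows_subset(expected, actual):
    """Return True if every row of `expected` appears as a row of `actual`."""
    # name=None yields plain tuples, which are hashable and can go in a set.
    actual_rows = set(actual.itertuples(index=False, name=None))
    return all(row in actual_rows
               for row in expected.itertuples(index=False, name=None))


# Hypothetical example frames.
actual = pd.DataFrame({'NAME': ['AAA', 'BBB', 'CCC'], 'AGE': [24, 25, 26]})
expected = pd.DataFrame({'NAME': ['CCC', 'AAA'], 'AGE': [26, 24]})

present = rows_subset(expected, actual)
absent = rows_subset(pd.DataFrame({'NAME': ['AAA'], 'AGE': [99]}), actual)
```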
CodePudding user response:
Guess No. 1: You want a new column that flags whether each row occurs in the other DataFrame. Then you could do something like this:
import numpy as np
import pandas as pd
df = pd.DataFrame({
'NAME': ['AAA', 'BBB', 'CCC', 'CCC', 'DDD', 'AAA'],
'AGE': [24] * 6,
'LIMIT': [2] * 6
})
df_sub = pd.DataFrame({
'NAME': ['AAA', 'BBB'],
'AGE': [24] * 2,
'LIMIT': [2] * 2
})
df['check'] = np.isin(df.values, df_sub.values).all(axis=1)
df
------------------------------
NAME AGE LIMIT check
0 AAA 24 2 True
1 BBB 24 2 True
2 CCC 24 2 False
3 CCC 24 2 False
4 DDD 24 2 False
5 AAA 24 2 True
------------------------------
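One caveat about the `np.isin` line above: it tests each cell on its own against all values in df_sub, so a row can be flagged True when its values all occur somewhere in df_sub but never together in one row. A possible row-wise variant, sketched here with hypothetical data chosen to show the difference, uses a left merge with an indicator instead.

```python
import pandas as pd

# Hypothetical data: ('BBB', 24, 2) and ('AAA', 30, 2) are NOT rows of df_sub,
# although each of their individual values appears somewhere in df_sub.
df = pd.DataFrame({
    'NAME': ['AAA', 'BBB', 'CCC', 'AAA'],
    'AGE': [24, 24, 24, 30],
    'LIMIT': [2, 2, 2, 2],
})
df_sub = pd.DataFrame({
    'NAME': ['AAA', 'BBB'],
    'AGE': [24, 30],
    'LIMIT': [2, 2],
})

# A left merge keeps df's row order; dropping duplicates from df_sub first
# ensures no row of df is repeated in the result.
merged = df.merge(df_sub.drop_duplicates(), how='left', indicator=True)

# to_numpy() assigns by position, so df's index does not need to be 0..n-1.
df['check'] = (merged['_merge'] == 'both').to_numpy()
```

With this data only the first row is flagged, whereas an element-wise test would also flag the second and fourth rows.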
Guess No. 2: If you just want to check whether one DataFrame occurs in another, you could use this:
df_sub.isin(df).all(axis=1).all()
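Keep in mind that when `DataFrame.isin` is given another DataFrame, it matches on both index and column labels, so the one-liner above only succeeds when the matching rows sit at the same index positions in both frames. The sketch below, with hypothetical data, shows the same rows in a different order failing that check while an order-independent merge-based check passes.

```python
import pandas as pd

# Hypothetical frames: df_sub holds rows of df, but in a different order.
df = pd.DataFrame({'NAME': ['AAA', 'BBB', 'CCC'], 'AGE': [24, 25, 26]})
df_sub = pd.DataFrame({'NAME': ['CCC', 'AAA'], 'AGE': [26, 24]})

# isin with a DataFrame compares cell by cell at matching index/column labels,
# so row 0 of df_sub is compared only with row 0 of df.
label_aligned = bool(df_sub.isin(df).all(axis=1).all())

# An order-independent alternative: left merge and inspect the indicator.
merged = df_sub.merge(df.drop_duplicates(), how='left', indicator=True)
order_independent = bool((merged['_merge'] == 'both').all())
```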