Home > Enterprise >  Pandas finding identical coulmns between twe dataframes
Pandas finding identical coulmns between twe dataframes

Time:12-22

I have to dataframes with different indexes and some shard coulmns (coulmn name all values locations). e.g.:

data1 = {"col1": ["A", "B", "C", "D"], "col2": ["E", "F", "G", "H"], "col3": ["Z", "Z", "Z", "Z"], "col4": ["Q", "Q", "Q", "Q"]}
df1 = pd.DataFrame(index=[0,1,2,3], data=data1)
data2 = {"col1": ["A", "B", "C", "D"], "col2": ["E", "F", "G", "H"], "col3": ["Z", "Z", "Z", "X"]}
df2 = pd.DataFrame(index=[4,5,6,7], data=data2)

df1

  col1 col2 col3 col4
0   A   E    Z    Q
1   B   F    Z    Q
2   C   G    Z    Q
3   D   H    Z    Q

df2

  col1 col2 col3
4   A   E    Z
5   B   F    Z
6   C   G    Z
7   D   H    X

I want the output to be only the identical coulmns (name values location), even though indexing is different. In my example, It should be col1, col2.

CodePudding user response:

df1.loc[:, df1.eq(df2.set_axis(df1.index)).all()]

output:

  col1  col2
0   A   E
1   B   F
2   C   G
3   D   H

CodePudding user response:

First you get columns that are common by using intersection and then see if their values are all equal.

cols = df1.columns.intersection(df2.columns)
out = [col for col in cols if (df1[col].values == df2[col].values).all()]

print(out)

['col1', 'col2']
  • Related