How can I get a boolean answer to see if the index of one DataFrame contains all elements of a column?

Time:09-30

I have subsetted a dataframe whose index is present as values (strings) in another dataframe as follows:

df = df1[df1.index.isin(df2['column_name'])]

This works without issue; however, the order of the index in df is different from that of df2['column_name'].

This is understandable and also fine, as I don't care about the order of the new df. However, as a sanity check I would like to be sure that the index values of the new dataframe exactly match the values in df2['column_name'] (again, not the order, just that the subsetting worked correctly).

Unfortunately, df.index.equals(df2['column_name']) returns False, as it expects the order to also be the same.

Is there a way of checking that values match without worrying about the order?

Reproducible example:

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.array([1,2,3,4,5,6]),index=['a', 'b', 'c', 'd', 'e', 'f'], columns=['values'])
df2 = pd.DataFrame(np.array(['a', 'b', 'c']), index=range(3), columns=['column_name'])

df = df1[df1.index.isin(df2['column_name'])]

Thank you.

CodePudding user response:

You can test whether all values of the column are present in the index, ignoring order, with:

print (set(df2['column_name']).issubset(df1.index))
True

print (df2['column_name'].isin(df1.index).all())
True
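
Both checks above only confirm that every value of df2['column_name'] appears in df1.index. If the goal is to confirm that the subset df has exactly the same values as the column, ignoring order, comparing sets (or sorted lists, if duplicates matter) is one option. The following is a minimal sketch based on the question's example; the shuffled column order and the variable names all_present, same_values, same_with_duplicates and same_order are assumptions added for illustration. It also wraps the column in pd.Index before calling equals because, as far as I know, Index.equals returns False for any argument that is not an Index, regardless of order.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.array([1, 2, 3, 4, 5, 6]),
                   index=['a', 'b', 'c', 'd', 'e', 'f'], columns=['values'])
# Same values as in the question, deliberately shuffled (assumption for illustration)
df2 = pd.DataFrame(np.array(['c', 'a', 'b']), index=range(3), columns=['column_name'])

df = df1[df1.index.isin(df2['column_name'])]

# Subset check from the answer: every column value exists in df1's index
all_present = df2['column_name'].isin(df1.index).all()

# Exact match ignoring order (duplicates are collapsed by the sets)
same_values = set(df.index) == set(df2['column_name'])

# Exact match ignoring order, but sensitive to duplicates
same_with_duplicates = sorted(df.index) == sorted(df2['column_name'])

# Order-sensitive comparison, for contrast
same_order = df.index.equals(pd.Index(df2['column_name']))

print(all_present, same_values, same_with_duplicates, same_order)
```

The set comparison is the shortest way to get a single boolean; the sorted comparison is the safer choice when either side may contain repeated values.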