I have two dataframes. I want to drop values from the first dataframe (the default one) after comparing it with the second dataframe (provided by the user).
def_df = pd.DataFrame([['alpha','beta'],['gamma','delta'],['ray','kite']],index=['ab_plot','gd_plot','rk_plot'])
             0      1
ab_plot  alpha   beta
gd_plot  gamma  delta
rk_plot    ray   kite
user_df = pd.DataFrame([10,20],index=['alpha','beta'])
        0
alpha  10
beta   20
I want to compare the two dataframes and find out which plots are possible for the given user data.
Expected answer
['ab_plot'] # since user has provided data for `'alpha','beta'`
My approach:
posble_plots_with_user_data = [True for x in posble_plots.values if x in df.columns]
Present answer:
TypeError: unhashable type: 'numpy.ndarray'
Answer:
If you need the rows in which at least one value matches the index of user_df, use DataFrame.isin with DataFrame.any and filter def_df.index:
#changed data
def_df = pd.DataFrame([['alpha','beta'],['gamma','beta']],index=['ab_plot','gd_plot'])
user_df = pd.DataFrame([10,20],index=['alpha','beta'])
posble_plots_with_user_data = def_df.index[def_df.isin(user_df.index).any(axis=1)].tolist()
print (posble_plots_with_user_data)
['ab_plot', 'gd_plot']
If you need only the rows in which every value matches, use DataFrame.all instead:
posble_plots_with_user_data = def_df.index[def_df.isin(user_df.index).all(axis=1)].tolist()
print (posble_plots_with_user_data)
['ab_plot']
Details:
print (def_df.isin(user_df.index))
             0     1
ab_plot   True  True
gd_plot  False  True
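For reference, the two variants above could be wrapped in one small helper and run against the data from the question. This is only a sketch: the function name `possible_plots` and the `require_all` flag are made up for illustration, and the three-row `def_df` is assumed from the output shown in the question (including the `rk_plot` row).

```python
import pandas as pd


def possible_plots(def_df, user_df, require_all=True):
    """Return the plot names whose fields appear in user_df's index.

    Hypothetical helper: the name and the require_all flag are illustrative.
    """
    # Boolean mask: True where a cell of def_df is one of the user's labels.
    mask = def_df.isin(user_df.index)
    # Collapse each row: all(axis=1) needs every field to match,
    # any(axis=1) needs at least one.
    keep = mask.all(axis=1) if require_all else mask.any(axis=1)
    return def_df.index[keep].tolist()


# Data assumed from the question, including the rk_plot row from its output.
def_df = pd.DataFrame(
    [["alpha", "beta"], ["gamma", "delta"], ["ray", "kite"]],
    index=["ab_plot", "gd_plot", "rk_plot"],
)
user_df = pd.DataFrame([10, 20], index=["alpha", "beta"])

result_all = possible_plots(def_df, user_df)
result_any = possible_plots(def_df, user_df, require_all=False)
print(result_all)  # ['ab_plot']
print(result_any)  # ['ab_plot']
```

With the question's original data both modes give the same result, because neither `gd_plot` nor `rk_plot` shares any field with the user data. The two modes only differ once a row matches partially, as in the changed data above.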