Dropping identical rows in two pandas DataFrames in Python

Time:12-06

I want to get the rows that are not common to two pandas DataFrames, df1 and wildone_df. When I check their types, both are "pandas.core.frame.DataFrame", but when I use the code below to drop their intersection:

o = pd.concat([wildone_df,df1]).drop_duplicates(subset=None, keep='first', inplace=False)

I get the following error:

TypeError                                 Traceback (most recent call last)
<ipython-input-36-4e158c0eeb97> in <module>
----> 1 o = pd.concat([wildone_df,df1]).drop_duplicates(subset=None, keep='first', inplace=False)

5 frames
/usr/local/lib/python3.8/dist-packages/pandas/core/algorithms.py in factorize_array(values, na_sentinel, size_hint, na_value, mask)
    561 
    562     table = hash_klass(size_hint or len(values))
--> 563     uniques, codes = table.factorize(
    564         values, na_sentinel=na_sentinel, na_value=na_value, mask=mask
    565     )

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.factorize()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable._unique()

TypeError: unhashable type: 'numpy.ndarray'

How can I solve this issue?

Omitting the intersection of two dataframes

CodePudding user response:

Try this:

merged_df = merged_df.loc[:,~merged_df.columns.duplicated()].copy()

See this post for more info
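
As a rough illustration of what that line does, here is a minimal sketch. The frame contents and the column labels "a" and "b" are made up for the example; only the `.loc[:, ~columns.duplicated()]` idiom comes from the answer. Note that it removes repeated column labels, not repeated rows.

```python
import pandas as pd

# Hypothetical frame with a repeated column label, e.g. left over from a merge.
merged_df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "a"])

# columns.duplicated() flags the second and later occurrences of each label.
mask = merged_df.columns.duplicated()

# Keep only the first occurrence of every column label.
deduped = merged_df.loc[:, ~mask].copy()

print(list(deduped.columns))
```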

CodePudding user response:

Either pass inplace=True or assign the result back to a variable when using pandas.DataFrame.drop_duplicates (or any other method that has an inplace parameter). Don't do both: with inplace=True the method returns None, so the variable you assign to would end up as None.

Returns (DataFrame or None)
DataFrame with duplicates removed or None if inplace=True.
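
The two options can be sketched as follows. The frame and the names `result`, `df_inplace` and `returned` are placeholders chosen for the example, not taken from the question.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 1, 2]})

# Option 1: re-assign the returned frame (inplace defaults to False).
result = df.drop_duplicates()

# Option 2: modify a frame in place; the call itself returns None.
df_inplace = df.copy()
returned = df_inplace.drop_duplicates(inplace=True)

print(result["x"].tolist(), returned)
```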

Try this:

o = pd.concat([wildone_df, df1]).drop_duplicates()  # keep="first" by default
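
The traceback reports "unhashable type: 'numpy.ndarray'", which suggests that at least one column stores NumPy arrays in its cells; drop_duplicates has to hash every cell, so it fails on such a column. The following is a minimal sketch under that assumption, and it also assumes the arrays are 1-D. The sample data, the column names "id" and "vec", and the helper make_hashable are hypothetical. It converts array cells to tuples and then uses keep=False, which drops every row that occurs more than once, so only the rows not shared by both frames remain.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the "vec" column holds a 1-D NumPy array in every cell.
wildone_df = pd.DataFrame({"id": [1, 2], "vec": [np.array([1, 2]), np.array([3, 4])]})
df1 = pd.DataFrame({"id": [2, 3], "vec": [np.array([3, 4]), np.array([5, 6])]})


def make_hashable(df):
    """Return a copy in which 1-D ndarray cells are replaced by tuples."""
    out = df.copy()
    for col in out.columns:
        out[col] = out[col].map(
            lambda v: tuple(v.tolist()) if isinstance(v, np.ndarray) else v
        )
    return out


combined = pd.concat(
    [make_hashable(wildone_df), make_hashable(df1)], ignore_index=True
)

# keep=False removes all copies of a duplicated row, not just the later ones.
uncommon = combined.drop_duplicates(keep=False)

print(uncommon)
```

Keep in mind that keep=False also removes rows that are duplicated within a single frame, so this only matches "rows not in both frames" if each frame has no internal duplicates.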