Pandas: after dropping value is still present

Time:11-16

I tried to drop two values from a column and then draw a countplot, but the values still show up in the chart. What am I doing wrong?

df = df.drop(df[(df['market_segment'] == 'Undefined') & (df['market_segment'] == 'Aviation')].index)
plt.figure(figsize=(12,8))
sns.countplot(x='market_segment',data=df,hue='hotel')
plt.show()

The chart I'm getting

CodePudding user response:

There are 2 reasons this may be happening.

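  1. The filter in your first line is incorrect. It combines two equality tests with &, and a single value can never equal both 'Undefined' and 'Aviation', so the mask is always False and no rows are dropped.

  2. Your "market_segment" column may be a categorical dtype. A categorical Series keeps every category in its dtype even after the rows holding it are removed, and seaborn still draws those unused categories on the axis. Converting the column to an object or string dtype remedies this.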

df = (
  df.loc[~df["market_segment"].isin(["Undefined", "Aviation"])]
  .astype({"market_segment": str})
)

plt.figure(figsize=(12,8))
sns.countplot(x='market_segment',data=df,hue='hotel')
plt.show()
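
To see the second point on its own, here is a minimal sketch. The small Series is made-up sample data, not the asker's DataFrame. It shows that filtering a categorical column leaves the dropped labels in the dtype, and that either converting to str or calling cat.remove_unused_categories() gets rid of them.

```python
import pandas as pd

# Hypothetical sample data standing in for df["market_segment"]
s = pd.Series(["Direct", "Undefined", "Aviation", "Direct"], dtype="category")

# Filter out the unwanted rows; the categorical dtype keeps the old labels
kept = s[~s.isin(["Undefined", "Aviation"])]
leftover = list(kept.cat.categories)

# Option 1: convert to plain strings, which discards the category list
as_str = kept.astype(str)

# Option 2: stay categorical but drop the categories that no longer occur
trimmed = kept.cat.remove_unused_categories()

print(leftover)
print(list(trimmed.cat.categories))
```

If the column needs to stay categorical (for example to keep a fixed ordering), the second option may be the better fit.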

CodePudding user response:

The problem is that you're removing rows where market_segment is both Undefined and Aviation at the same time. A single value can't equal both, so the condition is never true and nothing gets dropped.

Change your AND (&) to OR (|):

df = df.drop(df[(df['market_segment'] == 'Undefined') | (df['market_segment'] == 'Aviation')].index)
#                                                     ^ changed from & to |

That way, every row where market_segment is either Undefined or Aviation will be dropped. If it's one of those, it will be removed.
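
A quick way to check this for yourself is to count how many rows each mask selects. The sketch below uses a tiny made-up DataFrame (the values are assumptions, not the asker's data): the & mask matches nothing, while the | mask matches both unwanted segments.

```python
import pandas as pd

# Hypothetical sample data with the same column name as in the question
df = pd.DataFrame({"market_segment": ["Direct", "Undefined", "Aviation", "Direct"]})

is_undefined = df["market_segment"] == "Undefined"
is_aviation = df["market_segment"] == "Aviation"

# AND: a row would have to hold both values at once, so nothing matches
n_and = int((is_undefined & is_aviation).sum())

# OR: a row matches if it holds either value
or_mask = is_undefined | is_aviation
n_or = int(or_mask.sum())

# Dropping by the OR mask's index leaves only the other segments
remaining = df.drop(df[or_mask].index)["market_segment"].tolist()

print(n_and, n_or, remaining)
```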
