Used dropna(subset) but an error occurred

Time:07-13

I'm practicing data preprocessing using the dropna method.

I simply defined csv_data as

import pandas as pd
from io import StringIO

csv_data = \
'''A, B, C, D
1.0, 2.0, 3.0, 4.0
5.0, 6.0,, 8.0
10.0, 11.0, 12.0,'''

df = pd.read_csv(StringIO(csv_data))

Then I tried df.dropna(subset=['C']) to drop the rows where NaN appears in the 'C' column.

But I got the error below.

df.dropna(subset=['C'])
Traceback (most recent call last):

Input In [50] in <cell line: 1>
df.dropna(subset=['C'])

File C:\Anaconda3\lib\site-packages\pandas\util\_decorators.py:311 in wrapper
return func(*args, **kwargs)

File C:\Anaconda3\lib\site-packages\pandas\core\frame.py:6002 in dropna
raise KeyError(np.array(subset)[check].tolist())

KeyError: ['C']

Has anyone experienced this error?

CodePudding user response:

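It looks like your column names contain leading whitespace, which needs to be stripped before calling dropna. read_csv keeps the space that follows each comma in the header line, so the column is actually named ' C', not 'C'. If you check your current column names, you can see this: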

>>> df.columns
Index(['A', ' B', ' C', ' D'], dtype='object')
                  ^^^

So one approach is to remove the spaces from column names.

df.columns = df.columns.str.strip()
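
To see this fix end to end, here is a minimal sketch that rebuilds the DataFrame from the question, strips the column names, and then drops the rows with NaN in 'C'. The variable name cleaned is only for illustration and does not come from the question.

```python
from io import StringIO

import pandas as pd

csv_data = '''A, B, C, D
1.0, 2.0, 3.0, 4.0
5.0, 6.0,, 8.0
10.0, 11.0, 12.0,'''

df = pd.read_csv(StringIO(csv_data))

# Strip leading/trailing whitespace from every column name: ' C' becomes 'C'
df.columns = df.columns.str.strip()

# The lookup by 'C' now succeeds; the row with the empty C field is dropped
cleaned = df.dropna(subset=['C'])  # 'cleaned' is an illustrative name
print(cleaned)
```

Note that dropna returns a new DataFrame by default, so df itself is unchanged unless you assign the result back.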

Alternatively, you can pass the exact column name (including the leading space):

df.dropna(subset=[' C'])
                  ^^^^
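
A third option, not mentioned above, is to avoid the stray spaces at parse time. The sketch below assumes you control the read_csv call and uses its skipinitialspace parameter, which skips whitespace that directly follows a delimiter, so the header comes out as plain 'A', 'B', 'C', 'D'. The name result is only illustrative.

```python
from io import StringIO

import pandas as pd

csv_data = '''A, B, C, D
1.0, 2.0, 3.0, 4.0
5.0, 6.0,, 8.0
10.0, 11.0, 12.0,'''

# skipinitialspace=True discards spaces that come right after each comma,
# in the header line as well as in the data rows
df = pd.read_csv(StringIO(csv_data), skipinitialspace=True)
print(df.columns.tolist())

# No renaming needed: 'C' already matches the parsed column name
result = df.dropna(subset=['C'])  # 'result' is an illustrative name
print(result)
```

This only handles whitespace after the delimiter; trailing spaces before a comma would still need something like df.columns.str.strip().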