Used dropna(subset) but an error occurred

I'm practicing data preprocessing with the pandas dropna method.

I simply defined csv_data and read it into a DataFrame:

import pandas as pd
from io import StringIO

csv_data = \
'''A, B, C, D
1.0, 2.0, 3.0, 4.0
5.0, 6.0,, 8.0
10.0, 11.0, 12.0,'''

df = pd.read_csv(StringIO(csv_data))

Then I tried df.dropna(subset=['C']) to drop the rows where NaN appears in column 'C'.

But I got the error below.

df.dropna(subset=['C'])
Traceback (most recent call last):

Input In [50] in <cell line: 1>
df.dropna(subset=['C'])

File C:\Anaconda3\lib\site-packages\pandas\util\_decorators.py:311 in wrapper
return func(*args, **kwargs)

File C:\Anaconda3\lib\site-packages\pandas\core\frame.py:6002 in dropna
raise KeyError(np.array(subset)[check].tolist())

KeyError: ['C']

Has anyone run into this error?

CodePudding user response:

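It looks like your column names contain leading whitespace, which needs to be stripped before calling dropna. By default, read_csv keeps the space that follows each comma in the header as part of the column name, so the column is actually named ' C', not 'C'. If you check your current column names you will see this: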

>>> df.columns
Index(['A', ' B', ' C', ' D'], dtype='object')
                  ^^^

So one approach is to strip the whitespace from the column names, after which df.dropna(subset=['C']) finds the column.

df.columns = df.columns.str.strip()
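
Putting the pieces together, here is a minimal end-to-end sketch of this first approach. It reuses the csv_data from the question; the name cleaned is only an illustrative choice, not something from the original post.

```python
from io import StringIO

import pandas as pd

# Same data as in the question: note the space after each comma.
csv_data = """A, B, C, D
1.0, 2.0, 3.0, 4.0
5.0, 6.0,, 8.0
10.0, 11.0, 12.0,"""

df = pd.read_csv(StringIO(csv_data))

# Strip leading/trailing whitespace from every column label,
# turning ' B', ' C', ' D' into 'B', 'C', 'D'.
df.columns = df.columns.str.strip()

# dropna returns a new DataFrame; df itself is left unchanged.
# 'cleaned' is a hypothetical name used for illustration.
cleaned = df.dropna(subset=["C"])
print(cleaned)
```

Only the row with the missing value in C is dropped. The last row stays, because its NaN is in column D, which is not part of subset.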

Alternatively, you can pass the exact column name (including the leading space):

df.dropna(subset=[' C'])
                  ^^^^
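
A third option, not mentioned in the answer above, is to avoid the stray spaces when the file is parsed. The sketch below assumes the skipinitialspace parameter of read_csv fits your data; it tells the parser to skip spaces that follow a delimiter, so the header labels come out clean. The names df2 and cleaned2 are illustrative only.

```python
from io import StringIO

import pandas as pd

# Same data as in the question.
csv_data = """A, B, C, D
1.0, 2.0, 3.0, 4.0
5.0, 6.0,, 8.0
10.0, 11.0, 12.0,"""

# skipinitialspace=True skips spaces after each comma, so the header
# is parsed as 'A', 'B', 'C', 'D' with no leading whitespace.
df2 = pd.read_csv(StringIO(csv_data), skipinitialspace=True)
print(df2.columns.tolist())

# The original call from the question can now find column 'C'.
cleaned2 = df2.dropna(subset=["C"])
print(cleaned2)
```

This keeps the fix at load time, which may be convenient when the same file is read in several places. Stripping df.columns afterwards is the safer choice if the labels could also carry trailing whitespace.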