I am trying to drop every column in which a specific range of rows (selected by position) is entirely empty. For example, in the following table,
colA colB colC colD
rowA val val val val
rowB val val val
rowC val val
rowD val
I want to drop every column that is empty in all of rowC through rowD, which means dropping colB and colD. This is the line of code I currently have:
df = df.dropna(subset=df.iloc[2:], axis=1, how="all")
I was attempting to use dropna with a subset covering the third row onward. However, when I run the code, the following KeyError occurs:
File "/Users/zia/Desktop/work/Automation/test.py", line 78, in CONVPVT
df = df.dropna(subset=df.iloc[2:45], axis=1, how="all")
File "/Users/zia/opt/anaconda3/lib/python3.9/site-packages/pandas/util/_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "/Users/zia/opt/anaconda3/lib/python3.9/site-packages/pandas/core/frame.py", line 6002, in dropna
raise KeyError(np.array(subset)[check].tolist())
How can I fix this?
CodePudding user response:
The subset parameter accepts labels, not the DataFrame itself. With axis=1, those labels must come from the row index. Try:
# consider only the 3rd row onwards when deciding which columns to drop
df = df.dropna(subset=df.index[2:], axis=1, how="all")
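For a quick check, here is a minimal sketch on a small frame shaped like the table in the question. The exact placement of the empty cells in rowB is an assumption on my part, since the table does not show it, and the names result and alt are only illustrative. It also includes a purely positional variant that builds a boolean column mask from df.iloc instead of passing labels to subset.

```python
import numpy as np
import pandas as pd

# Toy frame mirroring the question's table; which rowB cell is empty is assumed.
df = pd.DataFrame(
    {
        "colA": ["val", "val", "val", "val"],
        "colB": ["val", "val", np.nan, np.nan],
        "colC": ["val", "val", "val", np.nan],
        "colD": ["val", np.nan, np.nan, np.nan],
    },
    index=["rowA", "rowB", "rowC", "rowD"],
)

# Label-based: pass the row labels from the 3rd row onwards as the subset.
result = df.dropna(subset=df.index[2:], axis=1, how="all")

# Positional alternative: keep columns with at least one non-missing value
# in the rows selected by position.
alt = df.loc[:, df.iloc[2:].notna().any()]

print(list(result.columns))  # ['colA', 'colC']
```

Both approaches drop colB and colD here. The positional variant may be the safer choice if the index labels are not unique, because subset matches rows by label.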