Pandas dataframe has zero elements after using dropna()

Time:11-11

My dataframe has zero elements after I use dropna() on a 2-dimensional array:


data = pd.read_excel('/file.xlsx', sheet_name='Sheet1', engine='openpyxl').iloc[0:, 0:].astype(float).dropna().values.flatten()

data
array([], dtype=float64)

However dropna() works perfectly fine on a 1-dimensional array and the NaNs get cleared out.

What could I be doing wrong?

CodePudding user response:

By default, .dropna() removes every row that contains at least one NaN: the default axis is 0 (rows) and the default how is 'any'. Maybe that's not what you want?

If every row has a NaN value somewhere, then all of the rows are dropped and nothing is left.

To drop the columns that contain NaN instead, you can specify the axis like this:

.dropna(axis=1)
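
To see the difference between the two axes, here is a minimal sketch on a small made-up DataFrame (the column names and values are hypothetical, standing in for the spreadsheet, which isn't shown in the question). Every row has at least one NaN, but column "b" is complete:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Excel sheet: each row contains a NaN,
# but column "b" has no missing values.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [4.0, 5.0, 6.0],
    "c": [np.nan, 8.0, np.nan],
})

rows_kept = df.dropna()        # default axis=0, how="any": drops rows with any NaN
cols_kept = df.dropna(axis=1)  # drops columns with any NaN instead

print(rows_kept.shape)             # (0, 3)
print(cols_kept.shape)             # (3, 1)
print(cols_kept.values.flatten())  # [4. 5. 6.]
```

With the default axis nothing survives, while axis=1 keeps only the complete column.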

CodePudding user response:

The dropna function defaults to axis=0, as per the documentation:

Determine if rows or columns which contain missing values are removed.

  • 0, or ‘index’ : Drop rows which contain missing values.
  • 1, or ‘columns’ : Drop columns which contain missing value.

So, your code becomes:

data = pd.read_excel('/file.xlsx', sheet_name='Sheet1', engine='openpyxl').iloc[0:, 0:].astype(float).dropna(axis=1).values.flatten()
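
Keep in mind that axis=1 discards whole columns, so the result is still empty if every column contains a NaN. If what you actually want is every non-NaN value in the sheet as a flat array, one possible approach (an assumption about the goal, not something stated in the question) is to flatten first and filter afterwards. A rough sketch with a made-up DataFrame:

```python
import numpy as np
import pandas as pd

# Hypothetical data: every row AND every column contains a NaN,
# so dropna() empties the frame on either axis.
df = pd.DataFrame({
    "a": [1.0, np.nan],
    "b": [np.nan, 4.0],
})

print(df.dropna().shape)        # (0, 2)
print(df.dropna(axis=1).shape)  # (2, 0)

# Flatten to 1-D first, then keep only the entries that are not NaN.
flat = df.to_numpy(dtype=float).flatten()
values = flat[~np.isnan(flat)]
print(values)  # [1. 4.]
```

This keeps individual values rather than whole rows or columns, which matches how dropna() behaves on 1-dimensional data.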