How to ensure for loop loops through entire dataframe?

Time:06-08

I am trying to loop through a pandas DataFrame, take the mean (ignoring NaNs) of specific columns (16:20) in the current row, and append it to a list (which I later want to add as a new column in my DataFrame). My code is below:

import numpy as np


n = 0
list = []
for row in df:
    list.append(
                np.nanmean(
                           df.iloc[n, 16:20]
                                            )
                )
    n += 1

len(list)
>>> 87

len(df)
>>> 20434

As you can see, the for loop stops after 87 iterations. Why does it stop? Shouldn't I get a list with 20434 entries?

CodePudding user response:

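Iterating over a DataFrame directly (for row in df) yields its column labels, not its rows, so your loop runs once per column (87 times here). Loop over the row positions instead:

import numpy as np
import pandas as pd

means = []
for i in range(len(df)):  # one iteration per row; .iloc expects positions
    means.append(
        np.nanmean(
            pd.to_numeric(df.iloc[i, 16:20])  # convert in case the values are not numeric yet
        )
    )

len(means)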
>>> 20434
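
The iteration count can be checked on a toy frame. This is only a sketch: the frame `demo` and its column names `a`, `b`, `c` are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical toy frame: 5 rows, 3 columns
demo = pd.DataFrame({"a": range(5), "b": range(5), "c": range(5)})

# Iterating over the frame itself yields column labels
labels = [item for item in demo]
n_iterations = len(labels)  # equals the number of columns, not rows

# Iterating over row positions gives one entry per row
row_positions = list(range(len(demo)))

print(labels, n_iterations, len(row_positions))
```

With 87 columns and 20434 rows, the same behaviour would explain a list of length 87.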

CodePudding user response:

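If you want to keep the loop, iterate over the rows explicitly, e.g. for n, (_, row) in enumerate(df.iterrows()): (plain enumerate(df) would still walk the columns), but it's clearly not the best solution!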

Prefer:

out = df.iloc[:, 16:20].mean(axis=1)  # the stop index 20 is excluded; mean() skips NaN by default
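
As a rough illustration of the vectorized approach, the sketch below builds a small made-up frame (`df_demo`, its column names, and the column range 2:4 are assumptions chosen for the example, not the asker's data), computes the row-wise mean, stores it as a new column, and compares it with an explicit `np.nanmean` loop.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 4 rows, 6 numeric columns, with a couple of NaNs
rng = np.random.default_rng(0)
df_demo = pd.DataFrame(rng.random((4, 6)), columns=list("abcdef"))
df_demo.iloc[0, 2] = np.nan
df_demo.iloc[3, 3] = np.nan

# Vectorized: row-wise mean of the columns at positions 2 and 3;
# NaN values are skipped because skipna defaults to True
df_demo["row_mean"] = df_demo.iloc[:, 2:4].mean(axis=1)

# Explicit loop over row positions with np.nanmean, for comparison only
looped = [
    np.nanmean(df_demo.iloc[i, 2:4].to_numpy(dtype=float))
    for i in range(len(df_demo))
]

print(np.allclose(df_demo["row_mean"].to_numpy(), looped))
```

Both routes should give the same values here; the vectorized form avoids the Python-level loop and hands back a Series that can be assigned straight to a new column. If the selected columns hold strings rather than numbers, they would presumably need converting first, for example with `pd.to_numeric`.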