I am trying to loop through a pandas DataFrame, take the mean (ignoring NaNs) of specific columns (16:20) in the current row, and append it to a list (which I later want to add as a new column in my DataFrame). My code is below:
import numpy as np
n = 0
list = []
for row in df:
    list.append(
        np.nanmean(
            df.iloc[n, 16:20]
        )
    )
    n += 1
len(list)
>>> 87
len(df)
>>> 20434
As you can see, the for loop stops after 87 iterations. Why does it stop? Shouldn't I receive a list that has 20434 entries?
CodePudding user response:
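Iterating over a DataFrame yields its column labels, not its rows, so your loop ran once per column (87 times). You have to work off of the row positions instead:
import numpy as np
import pandas as pd
means = []
for pos in range(len(df)):  # one iteration per row; iloc expects integer positions
    means.append(
        np.nanmean(
            pd.to_numeric(df.iloc[pos, 16:20])
        )
    )
len(means)
>>> 20434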
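As a small illustration of why the original loop stopped at 87, the sketch below uses a made-up toy frame (the name toy and its columns are hypothetical, not from the question) to show what a plain for loop over a DataFrame actually visits:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: 5 rows, 3 columns (names invented for illustration)
toy = pd.DataFrame(np.arange(15).reshape(5, 3), columns=["a", "b", "c"])

# Iterating a DataFrame directly yields its column labels
labels = [item for item in toy]

# Iterating over row positions visits every row
positions = [pos for pos in range(len(toy))]

print(labels)          # ['a', 'b', 'c']
print(len(positions))  # 5
```

So a loop written as for row in df runs once per column, regardless of how many rows the frame has.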
CodePudding user response:
Use for n in range(len(df)): (enumerate(df) would still loop over the column labels),
but it's clearly not the best solution!
Prefer:
out = df.iloc[:, 16:20].mean(axis=1)  # NaN is skipped by default; the slice end (20) is excluded
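A minimal sketch of the vectorized approach on a hypothetical toy frame (toy, its column names, and the 1:3 slice are stand-ins for df and 16:20), which also checks that it agrees with an explicit np.nanmean loop:

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for df; columns 1:3 play the role of 16:20
toy = pd.DataFrame({
    "id": [1, 2, 3],
    "x": [1.0, np.nan, 3.0],
    "y": [3.0, 4.0, np.nan],
    "z": [10.0, 20.0, 30.0],
})

# Row-wise mean over the selected columns; NaN values are skipped by default
toy["xy_mean"] = toy.iloc[:, 1:3].mean(axis=1)

# Same values from an explicit loop over row positions with np.nanmean
looped = [np.nanmean(toy.iloc[pos, 1:3]) for pos in range(len(toy))]

print(toy["xy_mean"].tolist())  # [2.0, 4.0, 3.0]
```

Assigning the result straight to a new column avoids building an intermediate list at all.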