I am learning how to use for loops with if statements. Can someone tell me why Python only executes the if branch and never reaches the elif branches?
import numpy as np
import pandas as pd

mydf = [['house', 7, 5, np.nan], ['block', 30, 25, 19], ['else', 20, np.nan, np.nan]]
mydf = pd.DataFrame(mydf, columns=['Thing', 'height1', 'height2', 'height3'])
I am trying to write a loop that goes through each row and first checks: if the value in height3 is not NaN, put that value in bottomHeight. Otherwise, if the value in height2 is not NaN, put that value in bottomHeight. Otherwise, put NaN in bottomHeight.
# Create a new column of NaN's
mydf["bottomHeight"] = (np.nan)*len(mydf)
for index in range(len(mydf)):
    if mydf.loc[index, 'height3'] != np.nan:
        mydf.loc[index, 'bottomHeight'] = mydf.loc[index, 'height3']
    elif mydf.loc[index, 'height2'] != np.nan:
        mydf.loc[index, "bottomHeight"] = mydf.loc[index, 'height2']
    else:
        mydf.loc[np.nan, 'bottomHeight'] = np.nan
The result should be bottomHeight = [5.0, 19.0, NaN], but it is actually [NaN, 19.0, NaN], as if only the first if statement were being evaluated.
CodePudding user response:
You can avoid loops in pandas because vectorized alternatives exist. The simplest is Series.fillna:
mydf['bottomHeight'] = mydf['height3'].fillna(mydf['height2'])
print(mydf)
   Thing  height1  height2  height3  bottomHeight
0  house        7      5.0      NaN           5.0
1  block       30     25.0     19.0          19.0
2   else       20      NaN      NaN           NaN
Or forward-fill the non-missing values along each row of the selected columns with ffill(axis=1) and select the last column by position:
mydf['bottomHeight'] = mydf[['height2','height3']].ffill(axis=1).iloc[:, -1]
print(mydf)
   Thing  height1  height2  height3  bottomHeight
0  house        7      5.0      NaN           5.0
1  block       30     25.0     19.0          19.0
2   else       20      NaN      NaN           NaN
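To make the ffill(axis=1) step easier to follow, here is a minimal sketch that prints the intermediate forward-filled frame and checks that it gives the same result as the fillna approach. The names filled, via_ffill, and via_fillna are only illustrative and are not part of the original answer.

```python
import numpy as np
import pandas as pd

mydf = pd.DataFrame(
    [['house', 7, 5, np.nan], ['block', 30, 25, 19], ['else', 20, np.nan, np.nan]],
    columns=['Thing', 'height1', 'height2', 'height3'],
)

# Forward-fill along each row: a NaN takes the value from the column to its left.
filled = mydf[['height2', 'height3']].ffill(axis=1)
print(filled)

# The last column now holds height3 where present, otherwise height2.
via_ffill = filled.iloc[:, -1]
via_fillna = mydf['height3'].fillna(mydf['height2'])

# Series.equals treats NaNs in the same positions as equal.
print(via_ffill.equals(via_fillna))
```

The order of the columns in the selection matters: the column you prefer must come last, because values are only carried from left to right.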
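Your loop also works if you test for non-missing values with pd.notna. The comparison != np.nan is always True, because NaN never compares equal to anything, including itself, so the first if branch always runs: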
for index in range(len(mydf)):
    if pd.notna(mydf.loc[index, 'height3']):
        mydf.loc[index, 'bottomHeight'] = mydf.loc[index, 'height3']
    elif pd.notna(mydf.loc[index, 'height2']):
        mydf.loc[index, "bottomHeight"] = mydf.loc[index, 'height2']
    else:
        mydf.loc[index, 'bottomHeight'] = np.nan
print(mydf)
   Thing  height1  height2  height3  bottomHeight
0  house        7      5.0      NaN           5.0
1  block       30     25.0     19.0          19.0
2   else       20      NaN      NaN           NaN
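As a quick illustration of why the loop in the question always takes the first branch, this short sketch compares a NaN value using != and == and then using pd.notna and pd.isna. The variable name value is only illustrative.

```python
import numpy as np
import pandas as pd

value = np.nan

# NaN never compares equal to anything, including itself,
# so `!=` is always True and `==` is always False.
print(value != np.nan)   # True
print(value == np.nan)   # False

# pd.notna / pd.isna test for missing values correctly.
print(pd.notna(value))   # False
print(pd.isna(value))    # True
print(pd.notna(5.0))     # True
```

Because value != np.nan is True even when value is NaN, the if condition in the original loop can never be False, and the elif and else branches are never reached.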
--
If you need to process all height columns:
mydf['bottomHeight'] = mydf.filter(like='height').ffill(axis=1).iloc[:, -1]
print(mydf)
   Thing  height1  height2  height3  bottomHeight
0  house        7      5.0      NaN           5.0
1  block       30     25.0     19.0          19.0
2   else       20      NaN      NaN          20.0
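If you prefer to spell out the priority order explicitly, the same idea can be sketched by chaining fillna over the columns. This is only one possible way to write it; the list cols and the names result and via_ffill are assumptions made for this example.

```python
import numpy as np
import pandas as pd

mydf = pd.DataFrame(
    [['house', 7, 5, np.nan], ['block', 30, 25, 19], ['else', 20, np.nan, np.nan]],
    columns=['Thing', 'height1', 'height2', 'height3'],
)

# Priority order, most preferred column first (an assumption for this sketch).
cols = ['height3', 'height2', 'height1']

# Start from the most preferred column and fill its gaps from each following one.
result = mydf[cols[0]]
for col in cols[1:]:
    result = result.fillna(mydf[col])

# The ffill approach over all columns whose name contains 'height'.
via_ffill = mydf.filter(like='height').ffill(axis=1).iloc[:, -1]

print(result.tolist())
print(result.tolist() == via_ffill.tolist())
```

Note that filter(like='height') is case-sensitive, so a column named bottomHeight is not picked up by it.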