Input df
ID Date TAVG TMAX TMIN
1 01-01-2020 26 21
2 01-01-2020 15 16
3 01-01-2020 25 29 18
1 02-01-2020 16 16
2 02-01-2020 26 20
.....
The code I am using
for index, row in df.iterrows():
    if [(row["TMIN"].isnull()) & (row["TAVG"].notnull()) & (row["TMAX"].notnull())]:
        row["TMIN"] = (2 * row["TAVG"]) - row["TMAX"]
    if [(row["TMAX"].isnull()) & (row["TMIN"].notnull()) & (row["TAVG"].notnull())]:
        row["TMAX"] = (2 * row["TAVG"]) - row["TMIN"]
    if [(row["TAVG"].isnull()) & (row["TMIN"].notnull()) & (row["TMAX"].notnull())]:
        row["TAVG"] = (row["TMIN"] + row["TMAX"]) / 2
When I run this, I get the below error:
if [(row["TMIN"].isnull()) & (row["TAVG"].notnull()) & (row["TMAX"].notnull())]:
AttributeError: 'float' object has no attribute 'isnull'
How can I fix this? Is there an alternative way to achieve the same result?
Answer:
.isnull() and .notnull() work on Series/columns (or even DataFrames). You're accessing an element of a row, that is, a single value (which happens to be a float). That causes the error.
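To illustrate the difference, here is a minimal sketch with made-up sample values (not the asker's data): a value pulled out of a row is a plain float with no .isnull() method, whereas pd.isna() accepts scalars as well as Series.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; NaN marks a missing reading.
df = pd.DataFrame({
    "TAVG": [20.0, 25.0],
    "TMAX": [26.0, 29.0],
    "TMIN": [np.nan, 18.0],
})

value = df.loc[0, "TMIN"]               # a single element, not a Series
has_method = hasattr(value, "isnull")   # floats have no .isnull() method
scalar_is_missing = pd.isna(value)      # pd.isna() works on scalars too
column_mask = df["TMIN"].isnull()       # Series method: one boolean per row
```

Note also that wrapping a condition in square brackets, as in `if [...]:`, builds a non-empty list, which is always truthy, so those if statements would not filter anything even once the attribute error is fixed.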
For a lot of cases in Pandas, you shouldn't iterate over the rows individually: work column-wise instead, and skip the loop.
Column-wise, your particular case could be written as:
sel = df['TMIN'].isnull() & df['TAVG'].notnull() & df['TMAX'].notnull()
df.loc[sel, 'TMIN'] = df.loc[sel, 'TAVG'] * 2 - df.loc[sel, 'TMAX']
and similarly for the other two columns, all without iterrows() or any other loop.
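Spelled out for all three columns, the mask approach might look like the sketch below. The sample values are invented, and the formulas assume, as in the question, that TAVG is the mean of TMIN and TMAX.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: each row is missing exactly one value.
df = pd.DataFrame({
    "TAVG": [20.0, 18.0, np.nan],
    "TMAX": [26.0, np.nan, 29.0],
    "TMIN": [np.nan, 15.0, 18.0],
})

# Build all three masks first, before any column is filled in.
need_tmin = df["TMIN"].isnull() & df["TAVG"].notnull() & df["TMAX"].notnull()
need_tmax = df["TMAX"].isnull() & df["TAVG"].notnull() & df["TMIN"].notnull()
need_tavg = df["TAVG"].isnull() & df["TMIN"].notnull() & df["TMAX"].notnull()

# Fill each missing value from the other two columns of the same row.
df.loc[need_tmin, "TMIN"] = 2 * df.loc[need_tmin, "TAVG"] - df.loc[need_tmin, "TMAX"]
df.loc[need_tmax, "TMAX"] = 2 * df.loc[need_tmax, "TAVG"] - df.loc[need_tmax, "TMIN"]
df.loc[need_tavg, "TAVG"] = (df.loc[need_tavg, "TMIN"] + df.loc[need_tavg, "TMAX"]) / 2
```

Rows in which two or more values are missing match none of the masks and are left untouched.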
However, since you are apparently trying to replace NaN/null values with values computed from other columns, you can use .fillna() here:
df['TMIN'] = df['TMIN'].fillna(df['TAVG'] * 2 - df['TMAX'])
(Calling .fillna(..., inplace=True) on df['TMIN'] is best avoided: in recent pandas versions it may operate on a temporary copy and leave df unchanged.) If you don't want to overwrite the original column, or want to use the result directly in a chained computation, write to a new column instead:
df['tmin2'] = df['TMIN'].fillna(df['TAVG'] * 2 - df['TMAX'])
and for the other two columns:
df['tmax2'] = df['TMAX'].fillna(2 * df['TAVG'] - df['TMIN'])
df['tavg2'] = df['TAVG'].fillna((df['TMIN'] + df['TMAX']) / 2)
You may ask what happens if a TMIN cell is null and the TAVG or TMAX value, or both, is also null. In that case, you'd be replacing the null value with null, so nothing changes. Given your original if statement, that would also be the case in your original code.
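A small sketch of that null-propagation behaviour, again with invented values: in the second row both TMIN and TAVG are missing, so the computed replacement is itself NaN and the cell stays empty.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: row 0 can be filled, row 1 cannot.
df = pd.DataFrame({
    "TAVG": [20.0, np.nan],
    "TMAX": [26.0, 30.0],
    "TMIN": [np.nan, np.nan],
})

# Arithmetic with NaN yields NaN, so row 1 gets NaN as its "fill" value.
filled = df["TMIN"].fillna(2 * df["TAVG"] - df["TMAX"])
```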