Let's say I have a data frame like this:
import pandas as pd
data1 = {
    "date": [1, 2, 3],
    "height": [420.3242, 380.1, 390],
    "height_new": [300, 380.1, "nan"],
    "duration": [50, 40, 45],
    "feeling": ["great", "good", "great"],
}
df = pd.DataFrame(data1)
And I want to update the "height" column with the "height_new" column but not when the value for "height_new" is "nan". Any hints on how to do this in a Pythonic manner?
I have some rough code that gets the job done, but it feels clunky (too many lines of code).
for x, y in zip(df['height'], df['height_new']):
    if y != 'nan':
        df['height'].replace(x, y, inplace=True)
        x = y
CodePudding user response:
You can use pandas.Series.where with pandas.Series.notna:
df["height"] = df["height_new"].where(df["height_new"].notna(), df["height"])
# Output:
print(df)
   date  height  height_new  duration feeling
0     1   300.0       300.0        50   great
1     2   380.1       380.1        40    good
2     3   390.0         NaN        45   great
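For reference, here is a minimal self-contained sketch of the same idea. It assumes the missing entry is a real NaN (numpy.nan here) rather than the string "nan" used in the question's sample data, since notna() only detects real missing values:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "date": [1, 2, 3],
    "height": [420.3242, 380.1, 390],
    "height_new": [300, 380.1, np.nan],  # assumption: a real NaN, not the string "nan"
    "duration": [50, 40, 45],
    "feeling": ["great", "good", "great"],
})

# Keep height_new where it is present, otherwise fall back to the existing height.
df["height"] = df["height_new"].where(df["height_new"].notna(), df["height"])

# An equivalent shorthand fills the gaps in height_new with the old height:
# df["height"] = df["height_new"].fillna(df["height"])
```

Either form works on the whole column at once, so no explicit loop is needed.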
NB: If "nan" is a literal string, use this instead:
df["height"] = df["height_new"].where(df["height_new"].ne("nan"), df["height"])
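Another option for the literal-string case is to convert the column to numeric first, so the placeholder becomes a real NaN and "height" ends up as a float column. This is only a rough sketch, assuming every entry that is not a number should be treated as missing; the name `new` is just illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "height": [420.3242, 380.1, 390],
    "height_new": [300, 380.1, "nan"],  # literal string, as in the question
})

# With errors="coerce", entries that cannot be parsed as numbers become NaN;
# the string "nan" ends up as NaN as well.
new = pd.to_numeric(df["height_new"], errors="coerce")

# Take the new value where there is one, otherwise keep the old height.
df["height"] = new.fillna(df["height"])
```

Note that this also discards any other non-numeric text in "height_new", which may or may not be what you want.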