I am trying to update a column in my DataFrame based on a condition, but when I check the DataFrame afterwards, the values have not been updated.
for i in titanic['Survived'].unique():
    meanAge = titanic.Age[titanic['Survived'] == i].mean()
    meanAge = "{:.1f}".format(meanAge)
    df = titanic['Survived'] == i
    df1 = titanic.Age[df]
    df1.fillna(meanAge, inplace=True)
    # print(df1)  # the value seems to be updated here
But print(titanic) still shows NaN values.
CodePudding user response:
The most likely reason is that df1 is a copy of part of the dataframe, not a reference into titanic, so filling df1 in place leaves titanic unchanged.
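To see the copy behaviour in isolation, here is a minimal sketch. The small frame below is a made-up stand-in for the real titanic data, and the fill value 99.0 is arbitrary:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the titanic data (values are invented for illustration).
titanic = pd.DataFrame({
    "Survived": [0, 0, 1, 1],
    "Age": [20.0, np.nan, 30.0, 40.0],
})

mask = titanic["Survived"] == 0
df1 = titanic.Age[mask]          # boolean indexing returns a new Series
df1.fillna(99.0, inplace=True)   # fills the NaN in df1 only

print(df1.isna().sum())             # NaN count in the copy
print(titanic["Age"].isna().sum())  # NaN count in the original frame
```

The copy ends up with no missing values while titanic still has its NaN, which matches what the question describes.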
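Instead of calling fillna on that copy, assign directly into titanic with .loc, using a condition that also restricts the selection to rows where Age is NaN (otherwise every age in the group is overwritten with the mean):
titanic.loc[(titanic['Survived'] == i) & titanic['Age'].isna(), 'Age'] = meanAge
Keep meanAge numeric here, for example round(meanAge, 1); "{:.1f}".format(meanAge) produces a string, which does not belong in a float column.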
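Put together with the loop from the question, the idea might look like the sketch below. The toy frame and the names in_group and mean_age are my own, not from the question:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the titanic data (values are invented for illustration).
titanic = pd.DataFrame({
    "Survived": [0, 0, 1, 1, 1],
    "Age": [20.0, np.nan, 30.0, 40.0, np.nan],
})

for i in titanic["Survived"].unique():
    in_group = titanic["Survived"] == i
    # Mean of the known ages in this group, kept as a number.
    mean_age = round(titanic.loc[in_group, "Age"].mean(), 1)
    # Assign only where the row is in the group AND the age is missing.
    titanic.loc[in_group & titanic["Age"].isna(), "Age"] = mean_age

print(titanic)
```

Because the assignment goes through titanic.loc, it writes into the original frame rather than into a temporary copy.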
If memory is not a constraint, it is better to treat dataframes as immutable: instead of updating in place, make a copy of the original and update the copy.
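One way to follow that advice without a loop is groupby with transform, building a new frame and leaving the original alone. This is only a sketch on invented data; group_mean and filled are names I made up:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the titanic data (values are invented for illustration).
titanic = pd.DataFrame({
    "Survived": [0, 0, 1, 1, 1],
    "Age": [20.0, np.nan, 30.0, 40.0, np.nan],
})

# Per-row mean age of the row's own Survived group, rounded to one decimal.
group_mean = titanic.groupby("Survived")["Age"].transform("mean").round(1)

# assign() returns a new DataFrame; titanic itself is not modified.
filled = titanic.assign(Age=titanic["Age"].fillna(group_mean))

print(filled)
print(titanic["Age"].isna().sum())  # the original still has its NaN values
```

Passing a Series to fillna fills each missing row with the value at the same index label, so every NaN gets the mean of its own group.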
CodePudding user response:
Apart from shallow copies and methods called with inplace=True, most pandas operations return a new object. Here titanic.Age[df] returns a new Series, so filling df1 in place does not touch titanic.
You can write the filled values back into the titanic DataFrame afterwards:
for i in titanic['Survived'].unique():
    df = titanic['Survived'] == i                      # boolean mask for this group
    meanAge = round(titanic.loc[df, 'Age'].mean(), 1)  # numeric, not a formatted string
    df1 = titanic.loc[df, 'Age'].fillna(meanAge)       # filled copy of the group's ages
    titanic.loc[df, 'Age'] = df1                       # write the copy back into titanic