Filling NaN values in a DataFrame column based on a condition (Python, pandas)

Time:02-08

I am trying to update the values of a DataFrame column based on a condition, but when I check the DataFrame afterwards its values have not been updated.

for i in titanic['Survived'].unique():
    meanAge = titanic.Age[titanic['Survived'] == i].mean()
    meanAge = "{:.1f}".format(meanAge)
    df = titanic['Survived'] == i
    df1 = titanic.Age[df]
    df1.fillna(meanAge, inplace=True)
    # print(df1)  the value seems to be updated here

but print(titanic) still shows NaN values.

CodePudding user response:

The reason is most likely that df1 is a copy of the selected rows rather than a reference into the titanic DataFrame, so filling df1 in place does not change titanic.

It will probably help to do something like the line below: instead of calling fillna on the selection, assign through .loc, using a condition that selects only the rows of the group whose Age is NaN (otherwise every age in the group would be overwritten).

titanic.loc[(titanic['Survived'] == i) & titanic['Age'].isna(), 'Age'] = meanAge

If you don't have memory constraints, it is better to think of DataFrames as immutable.

Instead of updating in place, try to make a copy of the original and update the new one.
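The copy-based approach could look roughly like the sketch below. The small titanic frame and its values are invented stand-ins for the real data, and the names filled, in_group and mean_age are only illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the titanic data (values invented for illustration).
titanic = pd.DataFrame({
    "Survived": [0, 0, 0, 1, 1, 1],
    "Age": [20.0, 30.0, np.nan, 40.0, np.nan, 50.0],
})

# Work on an explicit copy so the original frame is left untouched.
filled = titanic.copy()
for i in filled["Survived"].unique():
    in_group = filled["Survived"] == i
    # Keep the mean numeric (rounded to one decimal) so Age stays a float column.
    mean_age = round(filled.loc[in_group, "Age"].mean(), 1)
    # Only rows in this group whose Age is missing are overwritten.
    filled.loc[in_group & filled["Age"].isna(), "Age"] = mean_age

print(filled)
```

Here titanic still contains its NaN values afterwards, while filled holds the per-group means in their place.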

CodePudding user response:

Apart from shallow copies and operations called with inplace=True, pandas operations generally return a new object, so df1 is separate from titanic and filling it does not change titanic.

You can update the titanic DataFrame directly by writing the filled values back with .loc:

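for i in titanic['Survived'].unique():
    # Boolean mask selecting the rows of this Survived group
    df = titanic['Survived'] == i
    # Keep the mean numeric (rounded to one decimal) so Age stays a float column
    meanAge = round(titanic.loc[df, 'Age'].mean(), 1)
    # fillna returns a new Series; write it back into titanic with .loc
    titanic.loc[df, 'Age'] = titanic.loc[df, 'Age'].fillna(meanAge)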
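As an alternative to the loop, the same per-group fill can probably be written with groupby and transform. This is only a sketch under assumptions: the sample frame and its values are made up, and group_mean is an illustrative name.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the titanic data (values invented for illustration).
titanic = pd.DataFrame({
    "Survived": [0, 0, 0, 1, 1, 1],
    "Age": [22.0, np.nan, 24.0, 30.0, 40.0, np.nan],
})

# Mean Age of each Survived group, broadcast back to every row of that group.
group_mean = titanic.groupby("Survived")["Age"].transform("mean").round(1)

# fillna aligns on the index, so each NaN receives the mean of its own group.
titanic["Age"] = titanic["Age"].fillna(group_mean)

print(titanic)
```

Assigning the result back to titanic["Age"] avoids working on an intermediate selection, which is what left the original frame unchanged in the question.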