I want to replace missing values with 0 and 'Missing' , and in another column replace a va

Time:10-06

Are there any differences among these forms? What is the difference between df['column_name'] and df.column_name?

df['filed_complaint'] = df.filed_complaint.fillna(0)

df['filed_complaint'].fillna(0)

df.filed_complaint.fillna(0)

How about this?

df['department'] = df.department.fillna('Missing')
df['department'].fillna('Missing')
df.department.fillna('Missing')

or these?

df['department'] = df.department.replace('information_technology', 'IT', inplace=True)

df['department'].replace('information_technology', 'IT', inplace=True)

df.department.replace('information_technology', 'IT', inplace=True)

CodePudding user response:

Are there any differences among these forms? What is the difference between df['column_name'] and df.column_name?

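For reading an existing column there is no difference: both forms return the same Series. However, I suggest using df['column_name']. As the pandas documentation notes, attribute access only works when the column name is a valid Python identifier that does not clash with an existing DataFrame attribute or method, and it cannot be used to create a new column.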
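To make the distinction concrete, here is a minimal sketch. The sample data and the column names "count", "last name", "flag" and "new_col" are made up for illustration; only filed_complaint comes from the question.

```python
import warnings

import pandas as pd

# Hypothetical sample data; only "filed_complaint" appears in the question.
df = pd.DataFrame({
    "filed_complaint": [1.0, None, None],
    "count": [3, 4, 5],             # clashes with the DataFrame.count method
    "last name": ["a", "b", "c"],   # not a valid Python identifier
})

# Reading an existing, well-named column: both forms give the same Series.
same = df["filed_complaint"].equals(df.filed_complaint)

# Name clash: attribute access finds the method, brackets find the column.
attr_is_method = callable(df.count)
bracket_is_series = isinstance(df["count"], pd.Series)

# A name with a space can only be reached with brackets.
last_names = df["last name"].tolist()

# Creating a column: brackets work ...
df["flag"] = 0

# ... but attribute assignment only sets a plain Python attribute
# (pandas emits a UserWarning about this, silenced here).
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df.new_col = [1, 2, 3]
created_by_attr = "new_col" in df.columns
```

In this sketch, "flag" ends up as a real column while "new_col" does not, which is the main practical reason to prefer the bracket form.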

Regarding your last two questions, remember that pandas.Series.fillna and pandas.Series.replace (like many other methods) return a new Series by default rather than modifying the one they are called on (as in your command df['department'].fillna('Missing')).

So by default these two methods do not modify your original dataframe df; they return a modified copy of the column. What does that mean? Changes made to that copy do not affect the original df, so the result is lost unless you assign it back. Methods of this kind can also operate in place if you pass the optional parameter inplace=True (as you did in df.department.replace('information_technology', 'IT', inplace=True)). Keep in mind that in this case the original 'information_technology' values are overwritten and cannot be recovered from df. Also note that recent pandas versions warn about calling an inplace method on a column selected from df like this, and with Copy-on-Write enabled the call no longer updates df, so assigning the result back is the safer form.
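The difference between calling the method and assigning its result back can be sketched as follows. The sample values are invented for illustration; the column names come from the question. The last step uses DataFrame.fillna with a dict, which is one way to fill both columns in a single call.

```python
import pandas as pd

# Hypothetical sample data using the question's column names.
raw = pd.DataFrame({
    "filed_complaint": [1.0, None, 1.0],
    "department": ["sales", None, "information_technology"],
})
df = raw.copy()

# Calling fillna without assigning: a new Series comes back, df is unchanged.
filled = df["department"].fillna("Missing")
still_missing = int(df["department"].isna().sum())

# Assigning the result back is what actually updates df.
df["filed_complaint"] = df["filed_complaint"].fillna(0)
df["department"] = df["department"].fillna("Missing")
df["department"] = df["department"].replace("information_technology", "IT")

# Alternative: fill several columns at once with a per-column dict.
both = raw.fillna({"filed_complaint": 0, "department": "Missing"})
```

Here still_missing stays at 1 after the bare fillna call, while df only changes once the result is assigned back to the column.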

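Note: There is no need to set inplace=True if you re-assign your column. Combining the two is a bug, because with inplace=True the method returns None (at least through pandas 2.x) rather than the modified Series, so None is what gets assigned to the column:

df['department'] = df.department.replace('information_technology', 'IT', inplace=True)  # wrong
df['department'] = df.department.replace('information_technology', 'IT')  # right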