Are there any differences among these formats? What is the difference between df['column_name'] and df.column_name?
df['filed_complaint'] = df.filed_complaint.fillna(0)
df['filed_complaint'].fillna(0)
df.filed_complaint.fillna(0)
How about this?
df['department'] = df.department.fillna('Missing')
df['department'].fillna('Missing')
df.department.fillna('Missing')
or these?
df['department'] = df.department.replace('information_technology', 'IT', inplace=True)
df['department'].replace('information_technology', 'IT', inplace=True)
df.department.replace('information_technology', 'IT', inplace=True)
CodePudding user response:
Are there any differences among these formats? What is the difference between df['column_name'] and df.column_name?
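For reading an existing column there is no difference: both forms return the same Series. However, I suggest you use df['column_name'], as per the pandas documentation. Attribute access (df.column_name) only works when the column name is a valid Python identifier that does not clash with an existing DataFrame attribute or method, and it cannot be used to create a new column.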
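To make the difference concrete, here is a minimal sketch. The toy dataframe and its column names ("filed complaint", "count", "new_col") are made up for illustration and are not taken from the question.

```python
import warnings
import pandas as pd

# Hypothetical toy data, only for illustration.
df = pd.DataFrame({
    "department": ["sales", None],
    "filed complaint": [1.0, None],  # name contains a space
    "count": [3, 4],                 # name clashes with DataFrame.count()
})

# For a plain identifier, both spellings return the same column.
same_column = df["department"].equals(df.department)

# Bracket access works for any column name, including ones with spaces.
complaints = df["filed complaint"]

# Attribute access resolves to the DataFrame method, not the "count" column.
count_is_method = callable(df.count)

# Attribute assignment sets a plain Python attribute; it does not add a column.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # pandas warns about exactly this mistake
    df.new_col = [0, 0]
created_by_attribute = "new_col" in df.columns

# Bracket assignment does add the column.
df["new_col"] = [0, 0]
created_by_brackets = "new_col" in df.columns
```

Bracket access behaves the same way in every one of these cases, which is why it is the safer habit.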
Regarding your last two questions, remember that pandas.Series.fillna and pandas.Series.replace (like many other methods) return a new object by default instead of changing the one they are called on, as in your command df['department'].fillna('Missing').
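Called this way, these two methods do not modify your original dataframe df; they return a modified copy of the column. What does that mean? It means that any changes made to the copy do not affect the original df, and the result is lost unless you assign it to something. These methods can also act on the existing object if you pass the optional parameter inplace=True (as you did here: df.department.replace('information_technology', 'IT', inplace=True)). Keep in mind that in this case the original 'information_technology' values are overwritten, so the replacement is irreversible. Also note that, depending on the pandas version, an in-place call on a column selected from df may not propagate back to df (for example when Copy-on-Write is enabled), which is another reason to prefer re-assignment.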
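A short sketch of the default (non-in-place) behaviour, using a made-up three-row dataframe rather than the asker's data:

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
df = pd.DataFrame({"department": ["sales", None, "information_technology"]})

# fillna returns a new Series; df itself is left untouched.
filled = df["department"].fillna("Missing")
n_missing_before = int(df["department"].isna().sum())

# Re-assigning the result is what actually updates the column.
df["department"] = df["department"].fillna("Missing")
n_missing_after = int(df["department"].isna().sum())

# The same pattern works for replace.
df["department"] = df["department"].replace("information_technology", "IT")
departments = df["department"].tolist()
```

The missing value is still present in df after the first call and disappears only after the re-assignment.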
Note: there is no need to set the inplace parameter to True if you re-assign your column. Do one or the other, not both.
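df['department'] = df.department.replace('information_technology', 'IT', inplace=True)  # wrong: with inplace=True, replace has traditionally returned None, so None is what gets assigned to the column
df['department'] = df.department.replace('information_technology', 'IT')  # right: assign the returned Series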
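The sketch below separates the two effects of an in-place call on a standalone Series (made-up values). The return value is stored so it can be inspected. It has traditionally been None, but treat that as version-dependent and check it on your own installation.

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
original = pd.Series(["information_technology", "sales"])

# In-place call: the Series itself is modified.
s = original.copy()
returned = s.replace("information_technology", "IT", inplace=True)
in_place_values = s.tolist()

# What comes back from an in-place call is not something to assign to a column;
# inspect it rather than assume.
returned_is_none = returned is None

# Re-assignment pattern: no inplace, keep the returned Series.
reassigned = original.replace("information_technology", "IT")
reassigned_values = reassigned.tolist()
```

Both routes end with the same values; combining them in one statement is what goes wrong.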