I am trying to do some simple replacements and I keep getting the dreaded copy warning.
bvr_filtered = bvr.loc[~ bvr['LOCATION'].str.endswith('NEW')]
# set LOC_TYPE to 1100 if null
bvr_filtered.loc[bvr_filtered['LOC_TYPE'].isna(), 'LOC_TYPE'] = 1100
# fix errors in OWNED_BY
owned_by = bvr_filtered.loc[:, 'OWNED_BY']
owned_by.loc[owned_by.isna()] = 'N/A'
The line: 'owned_by.loc[owned_by.isna()] = "N/A"' is throwing the warning. What am I doing wrong?
I looked at the Pandas documentation and tried the .loc, but it seems I am not understanding the issue.
CodePudding user response:
The SettingWithCopyWarning is raised when you assign into an object that pandas suspects is a copy of part of another DataFrame, because the assignment may then not reach the original DataFrame.
This often occurs when you use indexing to select a subset of a DataFrame and then try to modify that subset.
For example:
df_subset = df[df['A'] > 0]
df_subset['A'] = 0 # this will raise a SettingWithCopyWarning
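To make the example above concrete, here is a minimal runnable sketch. The toy DataFrame and its column values are made up for illustration. It records any warnings instead of printing them, because whether a warning is emitted depends on the pandas version and its Copy-on-Write setting.

```python
import warnings

import pandas as pd

# Hypothetical toy data, not taken from the question.
df = pd.DataFrame({"A": [-1, 2, 3], "B": ["x", "y", "z"]})

# Boolean indexing produces a new object derived from df.
df_subset = df[df["A"] > 0]

# Record warnings rather than letting them print; the list may be
# empty on pandas versions that no longer emit this warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df_subset["A"] = 0

print([w.category.__name__ for w in caught])
print(df["A"].tolist())         # the original column is untouched
print(df_subset["A"].tolist())  # only the subset was changed
```

The point to notice is that the assignment changes df_subset only. The warning exists to flag that the original df may not have been updated the way the code seems to intend.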
In your code snippet, you use the .loc indexer to select a subset of the bvr DataFrame, and then modify that subset by assigning a new value to its 'LOC_TYPE' column. You then use .loc again to pull the 'OWNED_BY' column out of bvr_filtered as a new object called owned_by, and assign into that object. Both assignments can trigger a SettingWithCopyWarning, because the first modifies a subset derived from bvr and the second modifies a column derived from bvr_filtered, not the original DataFrame itself.
To fix the warning, you can use the .loc indexer to modify the original bvr DataFrame directly.
For example:
bvr.loc[bvr['LOC_TYPE'].isna(), 'LOC_TYPE'] = 1100
bvr.loc[bvr['OWNED_BY'].isna(), 'OWNED_BY'] = 'N/A'
bvr_filtered = bvr.loc[~bvr['LOCATION'].str.endswith('NEW')]
owned_by = bvr_filtered['OWNED_BY']
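As a rough end-to-end check of this first approach, the sketch below runs the same steps on a small made-up frame. The rows and values are assumptions standing in for the real bvr data. Note that this approach also fills the gaps in rows that are later filtered out, since the original frame is edited before filtering.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's bvr DataFrame.
bvr = pd.DataFrame({
    "LOCATION": ["A1", "B2 NEW", "C3"],
    "LOC_TYPE": [1000.0, np.nan, np.nan],
    "OWNED_BY": ["alice", None, None],
})

# Fill the gaps on the original frame; no intermediate subset is involved.
bvr.loc[bvr["LOC_TYPE"].isna(), "LOC_TYPE"] = 1100
bvr.loc[bvr["OWNED_BY"].isna(), "OWNED_BY"] = "N/A"

# Filter afterwards and only read from the result.
bvr_filtered = bvr.loc[~bvr["LOCATION"].str.endswith("NEW")]
owned_by = bvr_filtered["OWNED_BY"]

print(owned_by.tolist())
print(bvr["LOC_TYPE"].tolist())
```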
Alternatively, you can call the .copy() method on the filtered result to create an explicit, independent DataFrame, and then modify that copy without raising a warning. This leaves bvr itself unchanged.
For example:
bvr_filtered = bvr.loc[~bvr['LOCATION'].str.endswith('NEW')].copy()
bvr_filtered.loc[bvr_filtered['LOC_TYPE'].isna(), 'LOC_TYPE'] = 1100
bvr_filtered.loc[bvr_filtered['OWNED_BY'].isna(), 'OWNED_BY'] = 'N/A'
owned_by = bvr_filtered['OWNED_BY']
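The copy-based approach can be sketched on toy data as follows. The frame contents are invented for illustration, and this sketch calls .copy() on the filtered result so that bvr_filtered is an independent object. Unlike editing bvr in place, the original frame keeps its missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's bvr DataFrame.
bvr = pd.DataFrame({
    "LOCATION": ["A1", "B2 NEW", "C3"],
    "LOC_TYPE": [1000.0, np.nan, np.nan],
    "OWNED_BY": ["alice", None, None],
})

# Filter, then take an explicit copy so later writes target an independent frame.
bvr_filtered = bvr.loc[~bvr["LOCATION"].str.endswith("NEW")].copy()

# Write through the DataFrame itself with .loc, not through an extracted column.
bvr_filtered.loc[bvr_filtered["LOC_TYPE"].isna(), "LOC_TYPE"] = 1100
bvr_filtered.loc[bvr_filtered["OWNED_BY"].isna(), "OWNED_BY"] = "N/A"
owned_by = bvr_filtered["OWNED_BY"]

print(bvr_filtered)
print(int(bvr["OWNED_BY"].isna().sum()))  # missing values remaining in the original
```

Which of the two approaches fits better depends on whether the original bvr should keep its missing values or be cleaned as well.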