I'm working on a pandas DataFrame. The first step is to filter the original DataFrame (df) into a new one (df1), keeping the rows where the num_posts column equals a number I specify and the user column is user1. The next step is to update num_posts to another number, and the last step is to update df from df1.
The original df is:
import pandas as pd

df = pd.DataFrame({'num_posts': [4, 4, 3, 4, 1, 14],
                   'date': ['2020-08-09', '2020-08-25',
                            '2020-09-05', '2020-09-12',
                            '2020-09-29', '2020-10-15'],
                   'user': ['user1', 'user1', 'user2', 'user3', 'user4', 'user4']})
# The new filtered df1
# filter posts that equal 4 and user is user1
df1 = df.loc[(df['num_posts'] == 4) & (df['user'] == 'user1')]
df1
# overwrite the num_posts column with 10
for i in df1.index:
    df1.loc[i, 'num_posts'] = 10
# Updating the original dataframe df with df1
df.update(df1)
df
When I run my code, I get the following warning:
C:\Program Files\Python38\lib\site-packages\pandas\core\indexing.py:1817: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self._setitem_single_column(loc, value, pi)
The link in the warning message leads to the official pandas documentation, and the issue seems to be chained indexing. How can I get rid of the warning and avoid it when I filter the same original DataFrame df several times in succession?
CodePudding user response:
If it helps, try this:
#df1 = df.loc[(df['num_posts'] == 4)].copy()
df1 = df.loc[(df['num_posts'] == 4) & (df['user'] == 'user1')].copy()
Output:
   num_posts        date   user
0       10.0  2020-08-09  user1
1       10.0  2020-08-25  user1
2        3.0  2020-09-05  user2
3        4.0  2020-09-12  user3
4        1.0  2020-09-29  user4
5       14.0  2020-10-15  user4
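As a possible alternative, here is a minimal sketch that skips the loop and df.update altogether by assigning through a boolean mask with a single .loc call on df itself. The name mask is my own and does not appear in the question. The filter and the replacement value 10 are taken from the question's code.

```python
import pandas as pd

df = pd.DataFrame({
    'num_posts': [4, 4, 3, 4, 1, 14],
    'date': ['2020-08-09', '2020-08-25', '2020-09-05',
             '2020-09-12', '2020-09-29', '2020-10-15'],
    'user': ['user1', 'user1', 'user2', 'user3', 'user4', 'user4'],
})

# Boolean mask for the rows to change (same filter as in the question).
# "mask" is an illustrative name, not something from the original code.
mask = (df['num_posts'] == 4) & (df['user'] == 'user1')

# One .loc call on the original frame: rows and column are selected
# together, so there is no intermediate slice to assign into.
df.loc[mask, 'num_posts'] = 10

# If a separate filtered frame is still wanted, take an explicit copy
# so that later edits to df1 are clearly independent of df.
df1 = df.loc[mask].copy()

print(df)
```

With this approach the num_posts values stay integers rather than the floats shown in the output above. The explicit .copy() is only needed if you still want a separate df1 to work on; for a different filter on the same df, build a new mask and repeat the .loc assignment.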