Pandas set part of a row as the mean of the row - SettingWithCopyWarning

Time:10-15

I have looked at the following post, but it did not help: How to deal with SettingWithCopyWarning in Pandas

My question:

I have this dataframe called sample

        PERMNO  date    SHRCD   EXCHCD  TICKER  COMNAM  FACPR   PRC SHROUT  OPENPRC         marketcap
151421  10113   2010-07-21  73.0    4.0 AADR    ADVISORSHARES TRUST 0.0 24.70   100.0   25.10   2470.0
151422  10113   2010-07-22  73.0    4.0 AADR    ADVISORSHARES TRUST 0.0 25.26   100.0   25.42   2526.0
151423  10113   2010-07-23  73.0    4.0 AADR    ADVISORSHARES TRUST 0.0 25.28   100.0   25.54   2528.0
151424  10113   2010-07-26  73.0    4.0 AADR    ADVISORSHARES TRUST 0.0 25.37   100.0   25.40   2537.0
151425  10113   2010-07-27  73.0    4.0 AADR    ADVISORSHARES TRUST 0.0 25.29   100.0   25.25   2529.0
... ... ... ... ... ... ... ... ... ... ... ...
153292  10113   2017-12-22  73.0    4.0 AADR    ADVISORSHARES TRUST 0.0 58.93   2650.0  58.80   156164.5
153293  10113   2017-12-26  73.0    4.0 AADR    ADVISORSHARES TRUST 0.0 58.86   2650.0  58.69   155979.0
153294  10113   2017-12-27  73.0    4.0 AADR    ADVISORSHARES TRUST 0.0 58.83   2650.0  58.85   155899.5
153295  10113   2017-12-28  73.0    4.0 AADR    ADVISORSHARES TRUST 0.0 58.75   2650.0  59.07   155687.5
153296  10113   2017-12-29  73.0    4.0 AADR    ADVISORSHARES TRUST 0.0 58.85   2850.0  59.08   167722.5
1570 rows × 11 columns

I want to create a new column named 10113 which is a copy of marketcap. And from the second row onward ([1::]) I want it to be the mean of the market cap.

But I get a warning

C:\Users\waahm\AppData\Local\Temp/ipykernel_8464/2959211184.py:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  sample['marketcap'][1::] = marketcap_mean

My code is:

sample[10113] = sample['marketcap'].copy()
marketcap_mean = sample['marketcap'][1::].mean()
sample['marketcap'][1::] = marketcap_mean

How can I get rid of the warning? And what am I doing wrong?

CodePudding user response:

This is usually due to chained assignment, which is the combination of chaining and assignment. The warning was generated because two indexing operations are chained together: sample['marketcap'] followed by [1::].

These two chained operations execute independently, one after another. The first is an access (get) operation that returns the marketcap column as a Series, which may be either a view or a copy. The second is an assignment (set) operation that is called on this intermediate object, so pandas cannot guarantee that the original DataFrame is modified.

=> The solution is simple: combine the chained operations into a single operation using loc so that pandas can ensure the original DataFrame is set. Pandas will always ensure that unchained set operations work. (Check this link for more info: https://www.dataquest.io/blog/settingwithcopywarning/)

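# .loc selects by label, and the index here starts at 151421, so pass the labels of every row after the first
sample.loc[sample.index[1:], 'marketcap'] = marketcap_mean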
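
As a minimal sketch of the single-step assignment, assuming a small made-up frame that reuses the first four marketcap values and index labels from the question: because .loc is label-based, a slice such as 1: is interpreted against the index labels (151421, ...) rather than positions, so the example passes the labels of every row after the first explicitly.

```python
import pandas as pd

# Hypothetical miniature of `sample`; only the marketcap column is kept,
# and the index deliberately does not start at 0, as in the question.
sample = pd.DataFrame(
    {"marketcap": [2470.0, 2526.0, 2528.0, 2537.0]},
    index=[151421, 151422, 151423, 151424],
)

# Mean of every row after the first, selected by position with iloc.
marketcap_mean = sample["marketcap"].iloc[1:].mean()

# With this sorted integer index, a label slice starting at 1 matches
# every row, because all labels are greater than 1.
rows_matched_by_label_slice = len(sample.loc[1:])

# One .loc call: rows given as index labels, column given by name.
sample.loc[sample.index[1:], "marketcap"] = marketcap_mean

print(sample)
```

An equivalent position-based form would be sample.iloc[1:, sample.columns.get_loc("marketcap")] = marketcap_mean, which avoids index labels entirely.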