Non replicable SettingWithCopyWarning in Pandas

Time:08-27

I am experiencing a weird case of the SettingWithCopyWarning not behaving as I would expect it to behave.

I have a dataframe with many columns and over 100,000 rows of Facebook data on posts published by a public page. The six columns that I am interested in are 'Page Name', 'Created', 'Message', 'Image Text', 'Link Text', and 'Description'.

First, I extract the columns I need using the following two lines of code (let's call these lines 1 and 2):

reqd_cols = ['Page Name', 'Created', 'Message', 'Link Text', 'Image Text', 'Description']
reqd_dat = raw_dat[reqd_cols]

The Created column holds the timestamp (str) of when the post was created (e.g. 2021-02-08 20:06:19 EST). My goal is to extract the date from this column and store it in a new column called "Date".

I am able to extract the Date and create a list using

reqd_dat.loc[:,'Created'].str.split().str[0].tolist()
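
To illustrate what this expression does, here is a minimal sketch on a made-up two-row Series standing in for the real 'Created' column (the sample values are assumptions, not the actual data):

```python
import pandas as pd

# Hypothetical sample standing in for the real 'Created' column
created = pd.Series(
    ["2021-02-08 20:06:19 EST", "2021-03-15 09:30:00 EST"],
    name="Created",
)

# Split each timestamp on whitespace and keep the first token (the date part)
dates = created.str.split().str[0]

print(dates.tolist())
```

Each string is split into date, time, and timezone tokens, and .str[0] keeps only the first one.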

However, when I do (line 3)

reqd_dat.loc[:,'Date'] = reqd_dat.loc[:,'Created'].str.split().str[0].tolist()

I get the dreaded SettingWithCopyWarning. However, if I then rerun lines 1 and 2, effectively recreating reqd_dat, line 3 no longer raises the warning.

What am I missing?

CodePudding user response:

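It looks like pandas is treating your reqd_dat as a slice (a possible view) of raw_dat, so it cannot tell whether the assignment is also meant to affect raw_dat. I would try making an explicit copy by changing line 2 to: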

reqd_dat = raw_dat[reqd_cols].copy()
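
A small self-contained sketch of this fix follows. The toy raw_dat, its values, and the extra 'Likes' column are made up for illustration. The warning class is matched by name only, since its import location differs between pandas versions:

```python
import warnings

import pandas as pd

# Hypothetical stand-in for raw_dat; values and the 'Likes' column are invented
raw_dat = pd.DataFrame({
    "Page Name": ["Page A", "Page B"],
    "Created": ["2021-02-08 20:06:19 EST", "2021-03-15 09:30:00 EST"],
    "Message": ["hello", "world"],
    "Likes": [10, 20],
})
reqd_cols = ["Page Name", "Created", "Message"]

# Explicit copy: reqd_dat is now independent of raw_dat
reqd_dat = raw_dat[reqd_cols].copy()

# Record any warnings raised while adding the new column
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    reqd_dat["Date"] = reqd_dat["Created"].str.split().str[0]

# Match by class name to stay independent of where pandas defines the class
copy_warnings = [
    w for w in caught if w.category.__name__ == "SettingWithCopyWarning"
]

print(len(copy_warnings))          # number of SettingWithCopyWarning instances
print(reqd_dat["Date"].tolist())   # extracted dates
print("Date" in raw_dat.columns)   # whether raw_dat was affected
```

Because reqd_dat is an explicit copy, the new column is added to it alone and raw_dat is left untouched. Assigning the resulting Series directly also makes the .tolist() call unnecessary.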