The researchpy documentation page for summary_cont() gives the following example:
import numpy, pandas, researchpy
numpy.random.seed(12345678)
df = pandas.DataFrame(numpy.random.randint(10, size= (100, 2)),
                      columns= ['healthy', 'non-healthy'])
df['tx'] = ""
df['tx'].iloc[0:50] = "Placebo"
df['tx'].iloc[50:101] = "Experimental"
df['dose'] = ""
df['dose'].iloc[0:26] = "10 mg"
df['dose'].iloc[26:51] = "25 mg"
df['dose'].iloc[51:76] = "10 mg"
df['dose'].iloc[76:101] = "25 mg"
Running it produces this warning:
summury_cont.py:8: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df['tx'].iloc[0:50] = "Placebo"
The warning points to the pandas indexing documentation.
After reading it, I changed the assignment to this:
df.loc[:, ('tx')].iloc[0:50] = "Placebo"
It still produces the same warning. What is the correct way to do this?
CodePudding user response:
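Assign the values with a single .loc call, as shown below: row labels go on the left, the column name on the right. The warning comes from chained indexing: df['tx'] (or df.loc[:, 'tx']) first returns a Series that may be a copy, and the following .iloc[...] = ... assigns into that intermediate object instead of df itself. Note that .loc is explicit, label-based indexing, while .iloc is implicit, position-based indexing.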
import numpy, pandas
numpy.random.seed(12345678)
df = pandas.DataFrame(data=numpy.random.randint(10, size=(100, 2)),
                      columns=['healthy', 'non-healthy'])
df['tx'] = ""
# .loc slices by label and includes both endpoints,
# so 0:49 covers the same rows as .iloc[0:50]
df.loc[0:49, 'tx'] = "Placebo"
df.loc[50:99, 'tx'] = "Experimental"
df['dose'] = ""
df.loc[0:25, 'dose'] = "10 mg"
df.loc[26:50, 'dose'] = "25 mg"
df.loc[51:75, 'dose'] = "10 mg"
df.loc[76:99, 'dose'] = "25 mg"
print(df)
Output
    healthy  non-healthy            tx   dose
0         3            2       Placebo  10 mg
1         4            1       Placebo  10 mg
2         0            1       Placebo  10 mg
3         8            2       Placebo  10 mg
4         6            6       Placebo  10 mg
..      ...          ...           ...    ...
95        8            5  Experimental  25 mg
96        8            3  Experimental  25 mg
97        4            0  Experimental  25 mg
98        4            3  Experimental  25 mg
99        6            9  Experimental  25 mg
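
In case the slice boundaries are confusing, here is a minimal sketch of how the two indexers differ. It uses a small made-up frame (the `value` column and the six-row size are just assumptions for illustration, not part of the researchpy example). It also shows one way to keep positional slicing without chaining, by looking up the column position with `df.columns.get_loc`.

```python
import pandas as pd

# Small hypothetical frame with the default integer index 0..5.
df = pd.DataFrame({"value": range(6)})
df["tx"] = ""

# .loc slices by label and includes both endpoints: labels 0, 1, 2.
df.loc[0:2, "tx"] = "Placebo"

# .iloc slices by position and excludes the stop: positions 3, 4, 5.
# get_loc converts the column name into the integer position .iloc expects,
# so rows and column are selected in one indexing call (no chaining).
df.iloc[3:6, df.columns.get_loc("tx")] = "Experimental"

print(df["tx"].tolist())
```

Both assignments go through a single indexer call on df, so neither one should trigger SettingWithCopyWarning. With a default RangeIndex the labels and positions coincide, which is why only the end of the slice differs between the two forms.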