I am learning pandas and was trying to do the following.
I don't think this is a duplicate question, which is why I am posting it here.
I want to add an Average Speed column that holds, for each row, the mean MaxSpeed of that row's animal across the dataframe. I managed to do it, though maybe not in the correct way, and at the end I get a warning.
import pandas as pd

d = {'Animal': ['Parrot', 'Falcon', 'Parrot', 'Falcon'], 'MaxSpeed': [56, 360, 58, 380]}
adf = pd.DataFrame(d)
grp_spd = adf.groupby(by=['Animal']).mean()  # mean MaxSpeed per animal
adf.insert(column='Average Speed', loc=2, value="")  # empty placeholder column
for x, y in adf.iterrows():
    print(x)
    print(y.MaxSpeed)
    print(grp_spd.loc[y.Animal].MaxSpeed)
    adf['Average Speed'][x] = grp_spd.loc[y.Animal].MaxSpeed  # this line triggers the warning
    #adf.insert(2, 'Average Speed', grp_spd.loc[y.Animal].MaxSpeed)
adf
I am getting the following warning message:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  adf['Average Speed'][x] = grp_spd.loc[y.Animal].MaxSpeed
Can someone tell me how to get rid of this warning, and what the right way to do this is?
CodePudding user response:
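The warning appears because adf['Average Speed'][x] = value is chained indexing: adf['Average Speed'] is evaluated first, and pandas does not guarantee whether that step returns a view of the original data or a copy. If it returns a copy, the assignment modifies a temporary object that is then discarded, so the change never reaches adf. The documentation linked in the warning covers this in more detail.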
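If you prefer to keep an explicit loop, one way to avoid chained indexing is to select the row and the column in a single call on adf itself. The following is only a minimal sketch under that assumption: it uses .at for scalar assignment, iterates over adf.index rather than iterrows(), and initialises the column as float (these choices are mine, not part of the original code).

```python
import pandas as pd

d = {'Animal': ['Parrot', 'Falcon', 'Parrot', 'Falcon'], 'MaxSpeed': [56, 360, 58, 380]}
adf = pd.DataFrame(d)

# Mean MaxSpeed per animal, as a Series indexed by animal name.
grp_spd = adf.groupby('Animal')['MaxSpeed'].mean()

# Create the column up front as float so the means fit its dtype.
adf['Average Speed'] = 0.0

for idx in adf.index:
    animal = adf.at[idx, 'Animal']
    # Row and column are given in one indexing call on adf,
    # so there is no intermediate object that could be a copy.
    adf.at[idx, 'Average Speed'] = grp_spd.loc[animal]

print(adf)
```

This removes the warning, but it still loops row by row, so it will be slow on large dataframes.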
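A simpler approach is to drop the loop and let groupby with transform compute the per-animal mean for every row:
import pandas as pd

d = {'Animal': ['Parrot', 'Falcon', 'Parrot', 'Falcon'], 'MaxSpeed': [56, 360, 58, 380]}
adf = pd.DataFrame(d)
# transform("mean") returns one value per original row, aligned to adf's index
adf["Average Speed"] = adf.groupby("Animal")["MaxSpeed"].transform("mean")
print(adf)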
Output:
   Animal  MaxSpeed  Average Speed
0  Parrot        56           57.0
1  Falcon       360          370.0
2  Parrot        58           57.0
3  Falcon       380          370.0
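To see why transform fits here, it may help to compare it with a plain groupby mean. The sketch below is illustrative only (the names means, per_row and via_map are mine): mean() gives one value per animal, while transform("mean") broadcasts that value back to every original row. Mapping the Animal column through the per-animal means should give the same result.

```python
import pandas as pd

d = {'Animal': ['Parrot', 'Falcon', 'Parrot', 'Falcon'], 'MaxSpeed': [56, 360, 58, 380]}
adf = pd.DataFrame(d)

# Aggregation: one row per group (2 animals -> 2 values).
means = adf.groupby("Animal")["MaxSpeed"].mean()

# Transform: same length and index as adf (4 rows -> 4 values).
per_row = adf.groupby("Animal")["MaxSpeed"].transform("mean")

# Alternative: look up each row's animal in the aggregated means.
via_map = adf["Animal"].map(means)

print(len(means), len(per_row))
print(per_row.tolist() == via_map.tolist())
```

Either form can be assigned directly to a new column, because the result lines up with the dataframe's index.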