maindata['avg_delay']= maindata.groupby('name_customer')['Delay'].mean(numeric_only=False)
maindata.avg_delay
output:
0 NaT
1 NaT
2 NaT
4 NaT
5 NaT
..
49994 NaT
49996 NaT
49997 NaT
49998 NaT
49999 NaT
Name: avg_delay, Length: 40000, dtype: timedelta64[ns]
CodePudding user response:
maindata.groupby('name_customer')['Delay'].mean(numeric_only=False) gives you a pd.Series with the values of 'name_customer' as its index. Note that when you assign a pd.Series to a column of a DataFrame, the assignment is aligned on the index, label by label. Because the index of your maindata does not consist of values from 'name_customer', the two indices do not match, so every row is filled with a missing value (NaT), which is the result you observed.
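A minimal sketch of that index mismatch, using a small made-up frame (the name df, the customers "A" and "B", and the delay values are hypothetical stand-ins for maindata):

```python
import pandas as pd

# Hypothetical stand-in for maindata: default integer index 0, 1, 2.
df = pd.DataFrame({
    "name_customer": ["A", "A", "B"],
    "Delay": pd.to_timedelta([1, 3, 5], unit="D"),
})

# The grouped mean is indexed by customer name ("A", "B"), not by row number.
per_customer = df.groupby("name_customer")["Delay"].mean()
print(per_customer.index.tolist())

# Assignment aligns on index labels; 0, 1, 2 never match "A", "B",
# so every row of the new column ends up missing (NaT).
df["avg_delay"] = per_customer
print(df["avg_delay"])
```

The grouped result has one entry per customer, while the frame has one row per record, so there is nothing for the integer row labels to line up with.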
As the doc says about transform:
Call function producing a like-indexed DataFrame on each group and return a DataFrame having the same indexes as the original object filled with the transformed values
So you may use the following line instead, but do check whether the outcome is what you need.
maindata['avg_delay'] = maindata.groupby('name_customer')['Delay'].transform('mean')
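As a rough check of what transform returns, here is a small sketch on made-up data (df and its values are hypothetical, not the asker's maindata). It also shows one possible alternative, mapping the per-customer means back through the 'name_customer' column, which should give the same column:

```python
import pandas as pd

# Hypothetical stand-in for maindata.
df = pd.DataFrame({
    "name_customer": ["A", "A", "B"],
    "Delay": pd.to_timedelta([1, 3, 5], unit="D"),
})

# transform('mean') returns a Series with the same index as df,
# where each row holds the mean Delay of that row's customer.
df["avg_delay"] = df.groupby("name_customer")["Delay"].transform("mean")

# Alternative: compute one mean per customer, then look it up per row.
per_customer = df.groupby("name_customer")["Delay"].mean()
df["avg_delay_mapped"] = df["name_customer"].map(per_customer)

print(df)
```

Here both rows for customer "A" get the mean of 1 and 3 days, and the row for "B" keeps its single value of 5 days.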