Hello, can you help me understand what the issue is here and how to solve it?
import pandas as pd
dft = pd.DataFrame({'B': [0, 1, 2, 5, 4, 7, 2, 2, 2, 5, 6, 7]})
dft['user'] = ['a','b','c','b','a','c','a','b','b','c','a', 'c']
dft.groupby('user')['B'].transform(lambda row: row.ewm(span=2)).mean()
gives ValueError: Length of passed values is 1, index implies 4.
CodePudding user response:
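The call to mean() should be inside your lambda function. As written, the lambda returns row.ewm(span=2), which is a window object rather than a Series of values, so transform cannot match it to the 4 rows of each group. mean() is then never reached.
Your code should probably be: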
dft.groupby('user')['B'].transform(lambda row: row.ewm(span=2).mean())
For your sample data I got:
0 0.000000
1 1.000000
2 2.000000
3 4.000000
4 3.000000
5 5.750000
6 2.307692
7 2.615385
8 2.200000
9 5.230769
10 4.800000
11 6.425000
Name: B, dtype: float64
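To see where these numbers come from, here is a small sketch that recomputes the values for user a by hand. It assumes the pandas defaults (adjust=True), where span=2 gives alpha = 2 / (span + 1) and each result is a weighted average of the values seen so far, with weights (1 - alpha)**i going back in time. The names values, alpha and manual are only for illustration.

```python
import pandas as pd

values = [0, 4, 2, 6]      # column B for user 'a' (rows 0, 4, 6, 10)
alpha = 2 / (2 + 1)        # span=2 -> alpha = 2 / (span + 1)

manual = []
for t in range(len(values)):
    # Weights 1, (1 - alpha), (1 - alpha)**2, ... for the newest to oldest value
    weights = [(1 - alpha) ** i for i in range(t + 1)]
    window = values[t::-1]  # values[t], values[t-1], ..., values[0]
    weighted_sum = sum(w * x for w, x in zip(weights, window))
    manual.append(weighted_sum / sum(weights))

# The same series computed by pandas, for comparison
from_pandas = pd.Series(values).ewm(span=2).mean().tolist()

print(manual)
print(from_pandas)
```

Both lists should agree with the rows for user a above (rows 0, 4, 6 and 10).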
Another hint: transform calls the passed function once per group, passing that group's B column as a Series, not individual rows. Naming the argument row is therefore misleading.
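A quick way to check what transform actually hands to the function is to record the type and length of the argument on each call. This is only a sketch; the helper smooth and the list seen are names made up for the example.

```python
import pandas as pd

dft = pd.DataFrame({'B': [0, 1, 2, 5, 4, 7, 2, 2, 2, 5, 6, 7]})
dft['user'] = ['a', 'b', 'c', 'b', 'a', 'c', 'a', 'b', 'b', 'c', 'a', 'c']

seen = []  # (type name, length) of every argument the function receives

def smooth(col):
    # col is the whole B column of one group, not a single row
    seen.append((type(col).__name__, len(col)))
    # Return a Series of the same length so transform can align it
    return col.ewm(span=2).mean()

result = dft.groupby('user')['B'].transform(smooth)

print(seen)
print(result)
```

Each recorded entry should be a Series with 4 elements, one call per user, which is why the function has to return 4 values per call.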
CodePudding user response:
Use ewm on the groupby object directly; this has been supported for a while now:
import pandas as pd
dft = pd.DataFrame({'B': [0, 1, 2, 5, 4, 7, 2, 2, 2, 5, 6, 7]})
dft['user'] = ['a','b','c','b','a','c','a','b','b','c','a', 'c']
dft.groupby('user')['B'].ewm(span=2).mean()
output:
user
a     0     0.000000
      4     3.000000
      6     2.307692
      10    4.800000
b     1     1.000000
      3     4.000000
      7     2.615385
      8     2.200000
c     2     2.000000
      5     5.750000
      9     5.230769
      11    6.425000
Name: B, dtype: float64
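Note that this result is indexed by (user, original row) and sorted by user, so it does not line up with dft row by row. If you want it back as a column, one option is to drop the user level and restore the original order. This is a sketch assuming a pandas version recent enough to support ewm on a groupby; the column name B_ewm is made up for the example.

```python
import pandas as pd

dft = pd.DataFrame({'B': [0, 1, 2, 5, 4, 7, 2, 2, 2, 5, 6, 7]})
dft['user'] = ['a', 'b', 'c', 'b', 'a', 'c', 'a', 'b', 'b', 'c', 'a', 'c']

grouped = dft.groupby('user')['B'].ewm(span=2).mean()

# Drop the outer 'user' level, leaving the original row labels,
# then sort so the rows are back in the order of dft
aligned = grouped.droplevel(0).sort_index()

# Assignment aligns on the index, so each row gets its own group's value
dft['B_ewm'] = aligned

# For comparison: the transform-based version from the other answer
via_transform = dft.groupby('user')['B'].transform(lambda col: col.ewm(span=2).mean())

print(dft)
```

Both approaches should give the same values per row; the groupby ewm form just avoids the Python-level lambda.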