In the following example, I'm trying to calculate a 3-period moving average of gdp, grouped by country, using the pandas rolling function:
import pandas as pd

df = pd.DataFrame({
    'country': ['US', 'US', 'US', 'US', 'US', 'US', 'FR', 'FR', 'FR', 'FR'],
    'year': [1990, 1991, 1992, 1993, 1994, 1995, 1990, 1991, 1992, 1993],
    'gdp': [1.2, 1.4, 1.7, 2.1, 2.3, 1.9, 4.1, 4.6, 4.3, 4.4]
})
print(df)
  country  year  gdp
0      US  1990  1.2
1      US  1991  1.4
2      US  1992  1.7
3      US  1993  2.1
4      US  1994  2.3
5      US  1995  1.9
6      FR  1990  4.1
7      FR  1991  4.6
8      FR  1992  4.3
9      FR  1993  4.4
df['ma'] = df.groupby('country', as_index=False)['gdp'].rolling(3,min_periods=1).mean()
This throws the following error:
ValueError: Wrong number of items passed 2, placement implies 1
Where am I passing 2 items instead of 1?
CodePudding user response:
print(df.groupby('country', as_index=False)['gdp'].rolling(3,min_periods=1).mean())
  country       gdp
6      FR  4.100000
7      FR  4.350000
8      FR  4.333333
9      FR  4.433333
0      US  1.200000
1      US  1.300000
2      US  1.433333
3      US  1.733333
4      US  2.033333
5      US  2.100000
The result is a DataFrame with 2 columns (country and gdp) plus the index. That's why it cannot be assigned to the single column df['ma'].
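To see the mismatch directly, here is a minimal check using the question's data (the year column is left out for brevity). It only inspects the type and column names of the result; the exact layout may differ between pandas versions, so treat it as a sketch rather than guaranteed output:

```python
import pandas as pd

df = pd.DataFrame({
    'country': ['US', 'US', 'US', 'US', 'US', 'US', 'FR', 'FR', 'FR', 'FR'],
    'gdp': [1.2, 1.4, 1.7, 2.1, 2.3, 1.9, 4.1, 4.6, 4.3, 4.4],
})

result = df.groupby('country', as_index=False)['gdp'].rolling(3, min_periods=1).mean()

# With as_index=False the group key comes back as a regular column,
# so the result is a DataFrame rather than a single Series.
print(type(result).__name__)
print(list(result.columns))
```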
A quick fix is to select only the gdp column from the result before assigning it:
df['ma'] = df.groupby('country', as_index=False)['gdp'].rolling(3,min_periods=1).mean()['gdp']
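As an alternative to the quick fix, a common pattern is to use transform, which returns a Series aligned to the original index and can be assigned to one column directly. The sketch below also shows a second variant that keeps the default as_index and drops the group level from the resulting MultiIndex; the variable name alt is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    'country': ['US', 'US', 'US', 'US', 'US', 'US', 'FR', 'FR', 'FR', 'FR'],
    'year': [1990, 1991, 1992, 1993, 1994, 1995, 1990, 1991, 1992, 1993],
    'gdp': [1.2, 1.4, 1.7, 2.1, 2.3, 1.9, 4.1, 4.6, 4.3, 4.4],
})

# Variant 1: transform applies the rolling mean within each country and
# returns a Series with the same index as df, so it fits a single column.
df['ma'] = df.groupby('country')['gdp'].transform(
    lambda s: s.rolling(3, min_periods=1).mean()
)

# Variant 2: keep the default as_index=True, then drop the country level
# of the (country, original index) MultiIndex to get back the original index.
alt = (
    df.groupby('country')['gdp']
    .rolling(3, min_periods=1)
    .mean()
    .reset_index(level=0, drop=True)
)

print(df)
```

Both variants rely on index alignment, so the row order of the grouped result (FR before US) does not matter when the values are written back to df.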
CodePudding user response:
I don't know much about rolling(), but I can see the cause of the error: the expression returns a DataFrame with 2 columns, while df['ma'] can hold only one. To avoid the error, you can assign the result to two columns instead:
df[['item1','item2']] = df.groupby('country', as_index=False)['gdp'].rolling(3,min_periods=1).mean()
It can show output like this: