I am trying to calculate a moving average (3 years) but I get a different size object.
counts = dfen.groupby(['From date', 'Policy category 1']).size().reset_index(name='counts')
t = counts['counts']
counts[t_average] = t.rolling(2).sum()
I get:
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_6032\1024416527.py in <module>
      2 counts = dfen.groupby(['From date', 'Policy category 1']).size().reset_index(name='counts')
      3 t = counts['counts']
      4 counts[t_average] = t.rolling(2).sum()
      5
      6

~\anaconda3\lib\site-packages\pandas\core\frame.py in __setitem__(self, key, value)
   3641             self._setitem_frame(key, value)
   3642         elif isinstance(key, (Series, np.ndarray, list, Index)):
   3643             self._setitem_array(key, value)
   3644         elif isinstance(value, DataFrame):
   3645             self._set_item_frame_value(key, value)

~\anaconda3\lib\site-packages\pandas\core\frame.py in _setitem_array(self, key, value)
   3700
   3701         else:
   3702             self._iset_not_inplace(key, value)
   3703
   3704     def _iset_not_inplace(self, key, value):

~\anaconda3\lib\site-packages\pandas\core\frame.py in _iset_not_inplace(self, key, value)
   3719         if self.columns.is_unique:
   3720             if np.shape(value)[-1] != len(key):
   3721                 raise ValueError("Columns must be same length as key")
   3722
   3723             for i, col in enumerate(key):

ValueError: Columns must be same length as key
Does anyone know what I am doing wrong? Thanks in advance
I read the pandas documentation, and it says that NaN values are filled in automatically at the start and end (I'm calculating a centered average), and the example in the documentation does the same thing (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rolling.html). I also searched Stack Overflow but didn't find this issue, so I am quite lost here.
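For reference, here is a minimal sketch of what the documentation describes, using made-up yearly counts (the series `s` is hypothetical, not the data from the question). It suggests that a centered rolling mean keeps the original length and puts NaN at both ends:

```python
import pandas as pd

# Hypothetical yearly counts, not the poster's data.
s = pd.Series([4, 6, 8, 10, 12], index=[2000, 2001, 2002, 2003, 2004])

# center=True is an argument of rolling(), not of mean().
centered = s.rolling(3, center=True).mean()

# The result has the same length as the input; only the edge values are NaN.
print(len(s), len(centered))
print(centered)
```

So the rolling result itself should not change the size of the object; the length complaint in the traceback is about the assignment key.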
CodePudding user response:
You can do it this way:
count = 30  # window size for a 30-row simple moving average (an integer, not the string '30')
df.dropna(inplace=True)  # remove rows with missing data
df['MA30'] = df['counts'].rolling(count).mean()  # 'counts' is assumed here; use the column you want to average
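The traceback in the question points at the assignment rather than at `rolling`. A minimal sketch that reproduces it, assuming `t_average` was a list-like variable left over from an earlier cell (the list below is hypothetical, and the exact exception may differ between pandas versions):

```python
import pandas as pd

counts = pd.DataFrame({"counts": [1, 2, 3, 4]})
t = counts["counts"]

# Hypothetical leftover variable; the question does not show its real value.
t_average = ["a", "b"]

raised = False
try:
    # Unquoted: pandas receives a list of two keys and a Series of length 4.
    counts[t_average] = t.rolling(2).sum()
except ValueError as err:
    raised = True
    print(err)

# Quoted: a single string key creates one new column of matching length.
counts["t_average"] = t.rolling(2).sum()
print(len(counts["counts"]), len(counts["t_average"]))
```

With the quoted name, the new column has the same length as the original one, with NaN in the first row.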
CodePudding user response:
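I solved it. The column name has to be a quoted string ('t_average'); without the quotes, Python looked up a variable named t_average and pandas treated its value as a list of column keys. Here is the working code: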
import matplotlib.pyplot as plt
import numpy as np

counts = dfen.groupby(['From date', 'Policy category 1']).size().reset_index(name='counts')
# center is an argument of rolling(), not of mean()
counts['t_average'] = counts['counts'].rolling(3, center=True).mean()
print(len(counts['counts']))
print(len(counts['t_average']))

# Plot the counts for each policy code in facets
plt.figure(figsize=(20, 35))
plt.suptitle("Number of Elements by Year and Policy Code")
for i, code in enumerate(dfen['Policy category 1'].unique()):
    plt.subplot(9, 4, i + 1)
    plt.title(code)
    counts_subset = counts[counts['Policy category 1'] == code]
    plt.plot(counts_subset['From date'], counts_subset['t_average'], color=np.random.rand(3,))
    plt.xlim(2000, 2021)
    plt.ylim(0, 40)
plt.show()
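One thing to watch: because counts has one row per (year, category) pair, a rolling window over the whole column averages neighbouring rows that belong to different policy categories. If the average is meant per category (an assumption on my part), a sketch along these lines may be closer to the intent. The small frame below is a hypothetical stand-in for counts:

```python
import pandas as pd

# Hypothetical stand-in for the `counts` frame built in the answer above.
counts = pd.DataFrame({
    "From date": [2000, 2000, 2001, 2001, 2002, 2002],
    "Policy category 1": ["A", "B", "A", "B", "A", "B"],
    "counts": [3, 10, 6, 20, 9, 30],
})

# Roll within each category, in year order; transform keeps the original index,
# so the result lines up with the rows of `counts` when assigned.
counts["t_average"] = (
    counts.sort_values("From date")
    .groupby("Policy category 1")["counts"]
    .transform(lambda s: s.rolling(3).mean())
)

print(counts)
```

Each category now gets its own 3-year window, so the first two years of every category are NaN instead of being mixed with another category's counts.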