Rolling average - ValueError: Columns must be same length as key

Time:01-11

I am trying to calculate a 3-year moving average, but assigning the result to a new column fails with a size-mismatch error.

counts = dfen.groupby(['From date', 'Policy category 1']).size().reset_index(name='counts')
t = counts['counts']
counts[t_average] = t.rolling(2).sum()

I get:

ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_6032\1024416527.py in <module>
2 counts = dfen.groupby(['From date', 'Policy category 1']).size().reset_index(name='counts')
3 t = counts['counts']
4 counts[t_average] = t.rolling(2).sum()
5 
6 

~\anaconda3\lib\site-packages\pandas\core\frame.py in __setitem__(self, key, value)
3641             self._setitem_frame(key, value)
3642         elif isinstance(key, (Series, np.ndarray, list, Index)):
3643             self._setitem_array(key, value)
3644         elif isinstance(value, DataFrame):
3645             self._set_item_frame_value(key, value)

~\anaconda3\lib\site-packages\pandas\core\frame.py in _setitem_array(self, key, value)
3700 
3701             else:
3702                 self._iset_not_inplace(key, value)
3703 
3704     def _iset_not_inplace(self, key, value):

~\anaconda3\lib\site-packages\pandas\core\frame.py in _iset_not_inplace(self, key, value)
3719         if self.columns.is_unique:
3720             if np.shape(value)[-1] != len(key):
3721                 raise ValueError("Columns must be same length as key")
3722 
3723             for i, col in enumerate(key):

ValueError: Columns must be same length as key

Does anyone know what I am doing wrong? Thanks in advance

I read the pandas documentation, which says that NaN values are filled in automatically for the starting and ending values (I'm calculating a centered average), and the example in the documentation does the same thing (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rolling.html). I also searched Stack Overflow but didn't find this issue, so I am quite lost here.

CodePudding user response:

You can do it this way:

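count = 30  # window size for a 30-period simple moving average (an int, not the string '30')
df.dropna(inplace=True)  # remove rows with missing data
df['MA30'] = df['counts'].rolling(count).mean()  # roll over a single column ('counts' is assumed here), not the whole frame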

CodePudding user response:

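I solved it. The column key has to be the string 't_average'; the unquoted name t_average evidently referred to an existing array-like variable, so pandas treated it as a list of column keys: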

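counts = dfen.groupby(['From date', 'Policy category 1']).size().reset_index(name='counts')
# center is an argument of rolling(), not mean(); center=True gives the centered window
counts['t_average'] = counts['counts'].rolling(3, center=True).mean()
# both lengths match; the values at the window edges are NaN
print(len(counts['counts']))
print(len(counts['t_average']))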
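To see the difference in isolation, here is a minimal sketch with made-up numbers. The variable `t_average` is hypothetical: the question never shows what it held, so a two-element list is assumed here just to trigger the same code path as in the traceback.

```python
import pandas as pd

# Toy data standing in for the grouped counts (values are made up).
counts = pd.DataFrame({"counts": [4, 7, 5, 9, 6]})

# Hypothetical pre-existing list-like variable that happens to be named t_average.
t_average = [1.5, 2.5]

# Centered 3-row rolling mean; same length as the frame, NaN at both edges.
rolled = counts["counts"].rolling(3, center=True).mean()

try:
    # List-like key: pandas reads it as two column keys, but the value is a
    # single Series of length five, so the shapes do not match.
    counts[t_average] = rolled
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# String key: creates one new column aligned on the index.
counts["t_average"] = rolled
print(counts)
```

The failed assignment leaves the frame untouched, and the string key adds exactly one column.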

import matplotlib.pyplot as plt
import numpy as np

# Plot the rolling average for each policy code in facets
plt.figure(figsize=(20, 35))
plt.suptitle("Number of Elements by Year and Policy Code")  # figure-level title

for i, code in enumerate(dfen['Policy category 1'].unique()):
    plt.subplot(9, 4, i + 1)  # subplot indices start at 1
    plt.title(code)
    counts_subset = counts[counts['Policy category 1'] == code]
    plt.plot(counts_subset['From date'], counts_subset['t_average'], color=np.random.rand(3,))
    plt.xlim(2000, 2021)
    plt.ylim(0, 40)

plt.show()
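One caveat: the grouped frame is ordered by 'From date' first, so rolling over the flat 'counts' column averages neighbouring rows that can belong to different policy categories. If the goal is a 3-year average within each category, something like the sketch below may be closer. The data and the category labels "A" and "B" are invented for illustration; only the column names come from the question.

```python
import pandas as pd

# Made-up counts for two hypothetical policy categories over five years.
counts = pd.DataFrame({
    "From date": [2000, 2001, 2002, 2003, 2004] * 2,
    "Policy category 1": ["A"] * 5 + ["B"] * 5,
    "counts": [1, 2, 3, 4, 5, 10, 20, 30, 40, 50],
})

# Sort so each category's rows are in year order before rolling.
counts = counts.sort_values(["Policy category 1", "From date"])

# Centered 3-year mean computed separately within each category.
counts["t_average"] = (
    counts.groupby("Policy category 1")["counts"]
    .transform(lambda s: s.rolling(3, center=True).mean())
)
print(counts)
```

Each category gets its own NaN at the first and last year, and no window spans two categories.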