Home > front end >  Calculating Pandas rolling values grouped by a column
Calculating Pandas rolling values grouped by a column

Time:05-26

I have the following Pandas dataframe with GDP% by country/year. I have two countries, US (with 6 rows) and FR (with 4 rows).

I need to get the 3-period moving average of the GDP% by country. If I use the rolling window function, I get the first FR moving average taking the last two moving average from US, and that's wrong. How can I roll the data grouped by country?

df = pd.DataFrame({
    'country':  ['US','US','US','US', 'US', 'US', 'FR','FR','FR','FR'],
    'year': [1990, 1991, 1992, 1993, 1994, 1995, 1990, 1991, 1992, 1993],
    'gdp': [1.2, 1.4, 1.7, 2.1, 2.3, 1.9, 4.1, 4.6, 4.3, 4.4]
})

print(df)

    country year    gdp
0   US  1990    1.2
1   US  1991    1.4
2   US  1992    1.7
3   US  1993    2.1
4   US  1994    2.3
5   US  1995    1.9
6   FR  1990    4.1
7   FR  1991    4.6
8   FR  1992    4.3
9   FR  1993    4.4

df2 = df['gdp'].rolling(3, min_periods=1).mean()

print(df2)

0    1.200000
1    1.300000
2    1.433333
3    1.733333
4    2.033333
5    2.100000
6    2.766667
7    3.533333
8    4.333333
9    4.433333

CodePudding user response:

Try this:

df['rolling 3 mean'] = df.groupby('country', group_keys=False)['gdp']\
                         .rolling(3, min_periods=1).mean().droplevel

Output:

  country  year  gdp  rolling 3 mean
0      US  1990  1.2        1.200000
1      US  1991  1.4        1.300000
2      US  1992  1.7        1.433333
3      US  1993  2.1        1.733333
4      US  1994  2.3        2.033333
5      US  1995  1.9        2.100000
6      FR  1990  4.1        4.100000
7      FR  1991  4.6        4.350000
8      FR  1992  4.3        4.333333
9      FR  1993  4.4        4.433333
  • Related