Home > Back-end >  How to convert monthly data to yearly data with the value as the mean average over the 12 months? (P
How to convert monthly data to yearly data with the value as the mean average over the 12 months? (P

Time:02-02

This is what my data looks like:

       month  total_mobile_subscription
0    1997-01                     414000
1    1997-02                     423000
2    1997-03                     431000
3    1997-04                     479000
4    1997-05                     510000
..       ...                        ...
279  2020-04                    9056300
280  2020-05                    8928800
281  2020-06                    8860000
282  2020-07                    8768500
283  2020-08                    8659000

[284 rows x 2 columns]

Basically, I'm trying to change this into a dataset sorted by year with the value being the mean average of total mobiles subscriptions for each year.

I am not sure what to do as I am still learning this.

CodePudding user response:


import pandas as pd
import numpy as np
df = pd.DataFrame({
    'year': ['1997-01', '1997-02', '1997-03', '1998-01', '1998-02', '1998-03'],
    'sale': [500, 1000, 1500, 2000, 1000, 400]
    })


for a in range(1,13):
    x = '-0'  str(a)
    df['year'] = df['year'].str.replace(x, '')
    

df2 = (df.groupby(['year']).mean('sale'))
print(df2)

CodePudding user response:

Convert values of month to datetimes and aggregate by years:

y = pd.to_datetime(df['month']).dt.year.rename('year')
df1 = df.groupby(y)['total_mobile_subscription'].mean().reset_index()

df['month'] = pd.to_datetime(df['month'])
df1 = (df.groupby(df['month'].dt.year.rename('year'))['total_mobile_subscription']
         .mean().reset_index())

Or aggregate by first 4 values of month column:

df2 = (df.groupby(df['month'].str[:4].rename('year'))['total_mobile_subscription']
         .mean().reset_index())
  • Related