Convert year-month into Date while GroupBy

Time:05-28

I have a column named PERIOD in a DataFrame, with values in this format:

PERIOD   ACTUAL
202201   343.34 
202202   545.33
202203   54
202201   989.2

Each value stands for the first day of that month, i.e. 01-01-2022, 01-02-2022, 01-03-2022 in DMY format.

I am using it inside:

groupedResult = df.groupby('PERIOD')['Actual'].sum()

How can I convert the PERIOD values into dates in Python?

I tried the following, but got stuck:

groupedResult = df.groupby(datetime('PERIOD'.year, 'PERIOD'.month, 1))['Actual'].sum()

CodePudding user response:

Your question wasn't entirely clear because it didn't include a workable example, but I've had a crack at it here with data I made up:

import pandas as pd

data = {'period':['202201','202201','202201','202201','202202','202202','202203'], 'actuals':[10,20,30,40,50,60,70]}
    
df = pd.DataFrame(data)
print("BEFORE:")
print(df)

This gives period as you described, but it is stored as object (string) dtype and not datetime:

BEFORE:
   period  actuals
0  202201       10
1  202201       20
2  202201       30
3  202201       40
4  202202       50
5  202202       60
6  202203       70

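Here pd.to_datetime with format='%Y%m' parses each string as YYYYMM and returns datetime values set to the first day of that month. Then .dt.strftime('%Y/%m') converts them back to strings (object dtype), but in the date format you require. If you need real dates as the group keys, leave out the .dt.strftime call.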

df['period'] = pd.to_datetime(df['period'], format='%Y%m').dt.strftime('%Y/%m')

print("AFTER:")
groupedresults = df.groupby('period')['actuals'].sum()
print(groupedresults)

And here's your output. Change the format string passed to strftime to suit your needs:

AFTER:
period
2022/01    100
2022/02    110
2022/03     70
Name: actuals, dtype: int64
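
If you want the group keys to be actual dates rather than formatted strings, one option is to keep the parsed datetime column and group on that. Below is a minimal sketch using the figures from the question. It assumes PERIOD is stored as integers and that the value column is named ACTUAL as in the table; PERIOD_DATE is just an illustrative column name, not something from the original post.

```python
import pandas as pd

# Data copied from the question; PERIOD is assumed to be stored as integers.
df = pd.DataFrame({
    "PERIOD": [202201, 202202, 202203, 202201],
    "ACTUAL": [343.34, 545.33, 54, 989.2],
})

# Cast to str so the YYYYMM digits are parsed by the format string,
# giving a Timestamp on the first day of each month.
# PERIOD_DATE is a hypothetical column name chosen for this example.
df["PERIOD_DATE"] = pd.to_datetime(df["PERIOD"].astype(str), format="%Y%m")

# Group on the real datetime column; the result is indexed by Timestamps.
grouped_result = df.groupby("PERIOD_DATE")["ACTUAL"].sum()
print(grouped_result)

# Optional: month-level Period keys (e.g. 2022-01) instead of first-of-month dates.
grouped_by_month = df.groupby(df["PERIOD_DATE"].dt.to_period("M"))["ACTUAL"].sum()
print(grouped_by_month)
```

The first result has one row per month keyed by 2022-01-01, 2022-02-01 and 2022-03-01, so it can be resampled, sorted or plotted as a time series. The to_period("M") variant may suit you better if the day component is meaningless for your data.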