I have a column in a DataFrame called PERIOD, and its values are in this format:
PERIOD ACTUAL
202201 343.34
202202 545.33
202203 54
202201 989.2
In DMY format these correspond to 01-01-2022, 01-02-2022, 01-03-2022 (the first day of each month).
I am using it inside:
groupedResult = df.groupby('PERIOD')['Actual'].sum()
How can I convert the PERIOD values into dates in Python?
I tried the following but got stuck:
groupedResult = df.groupby(datetime('PERIOD'.year, 'PERIOD'.month, 1))['Actual'].sum()
CodePudding user response:
Your question wasn't entirely clear because it didn't include a workable example, but I've had a crack at it with data I made up:
import pandas as pd
data = {'period':['202201','202201','202201','202201','202202','202202','202203'], 'actuals':[10,20,30,40,50,60,70]}
df = pd.DataFrame(data)
print("BEFORE:")
print(df)
This gives period as you described, but it's stored as strings (object dtype), not as datetime:
BEFORE:
period actuals
0 202201 10
1 202201 20
2 202201 30
3 202201 40
4 202202 50
5 202202 60
6 202203 70
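If you want to confirm that the column is not yet a datetime, here is a minimal sketch using the same made-up data. The names period_is_datetime, converted, and converted_is_datetime are just illustrative:

```python
import pandas as pd

# Same made-up data as in the answer
data = {'period': ['202201', '202201', '202201', '202201', '202202', '202202', '202203'],
        'actuals': [10, 20, 30, 40, 50, 60, 70]}
df = pd.DataFrame(data)

# Reports whether a column already holds datetime values
period_is_datetime = pd.api.types.is_datetime64_any_dtype(df['period'])
print(period_is_datetime)

# After parsing with an explicit format, the result is a datetime column
converted = pd.to_datetime(df['period'], format='%Y%m')
converted_is_datetime = pd.api.types.is_datetime64_any_dtype(converted)
print(converted_is_datetime)
```

The first check should come back False for the raw strings and the second True once the values have been parsed.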
Here format='%Y%m' converts it to datetime (%Y%m tells pandas to parse the incoming string as YYYYMM). Then .dt.strftime('%Y/%m') converts it back to a string (object dtype), but in the date format you require.
df['period'] = pd.to_datetime(df['period'], format='%Y%m').dt.strftime('%Y/%m')
print("AFTER:")
groupedresults = df.groupby('period')['actuals'].sum()
print(groupedresults)
And here's your output. Change the date format of period to suit your needs:
AFTER:
period
2022/01 100
2022/02 110
2022/03 70
Name: actuals, dtype: int64
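If you would rather group on real dates than on formatted strings, you can skip the strftime step. The sketch below is only an illustration: it assumes PERIOD is stored as integers (the question doesn't say), uses the column names from the question's table, and PERIOD_DATE, grouped, and labels are names I made up:

```python
import pandas as pd

# Hypothetical frame mirroring the question; integer PERIOD values are an assumption
df = pd.DataFrame({
    'PERIOD': [202201, 202202, 203203 - 1000, 202201],
    'ACTUAL': [343.34, 545.33, 54, 989.2],
})

# Cast to str first so the format is matched against text like '202201';
# with no day in the format, each date falls on the first of the month
df['PERIOD_DATE'] = pd.to_datetime(df['PERIOD'].astype(str), format='%Y%m')

# Group on the datetime column itself
grouped = df.groupby('PERIOD_DATE')['ACTUAL'].sum()
print(grouped)

# Optional: DMY text labels for display only; the index stays datetime
labels = grouped.index.strftime('%d-%m-%Y').tolist()
print(labels)
```

Keeping the datetime dtype means the groups sort chronologically and can later be resampled or filtered by date; formatting with strftime is then only needed when printing.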