I have been struggling with transforming the below x dictionary, into the table/dictionary below.
x = {'John': 0,
'Dan': 0,
'Tim': 0,
'Andrew': ['2022-04-10','2022-04-11','2022-06-16','2022-06-17','2022-06-18','2022-08-19','2022-08-20','2022-11-24','2022-12-12'],
'Jack': ['2021-12-31','2022-01-01','2022-01-02','2022-03-26','2022-03-27','2022-03-28','2022-05-29','2022-06-01','2022-06-10','2022-06-12','2022-08-13']}
I want to put the names (keys) into Column 0, the dates I would like to present just the count per each year-month column
Result:
df = pd.DataFrame(data = {'Name':['John', 'Dan', 'Tim', 'Andrew', 'Jack'],
'2021-12':[0,0,0,0,1],
'2022-01' :[0,0,0,0,2],
'2022-02' :[0,0,0,0,0],
'2022-03' :[0,0,0,0,3],
'2022-04' :[0,0,0,1,0],
'2022-05' :[0,0,0,0,1],
'2022-06' :[0,0,0,3,3],
'2022-07' :[0,0,0,0,0],
'2022-08' :[0,0,0,2,1],
'2022-09' :[0,0,0,0,0],
'2022-10' :[0,0,0,0,0],
'2022-11' :[0,0,0,1,0],
'2022-12' :[0,0,0,1,0]})
This is the final df result(values are representing the counts per each Month):
CodePudding user response:
IIUC, you can try something like this:
x = {'John': 0,
'Dan': 0,
'Tim': 0,
'Andrew': ['2022-04-10','2022-04-11','2022-06-16','2022-06-17','2022-06-18','2022-08-19','2022-08-20','2022-11-24','2022-12-12'],
'Jack': ['2021-12-31','2022-01-01','2022-01-02','2022-03-26','2022-03-27','2022-03-28','2022-05-29','2022-06-01','2022-06-10','2022-06-12','2022-08-13']}
dfe = pd.DataFrame.from_dict(x, 'index').explode(0).set_axis(['dates'], axis=1).rename_axis('name').reset_index()
dfe['dates'] = pd.to_datetime(dfe['dates'], format='%Y-%m-%d')
df_out = dfe.groupby(['name',pd.Grouper(key='dates', freq='M')]).size().unstack(fill_value=0)
df_out = df_out.drop('1970-01-31', axis=1)
df_out = df_out.reindex(pd.date_range(df_out.columns.min(), df_out.columns.max(), freq='M'), axis=1, fill_value=0)
df_out.columns = df_out.columns.strftime('%Y-%m')
print(df_out)
Output:
2021-12 2022-01 2022-02 2022-03 2022-04 2022-05 2022-06 2022-07 2022-08 2022-09 2022-10 2022-11 2022-12
name
Andrew 0 0 0 0 2 0 3 0 2 0 0 1 1
Dan 0 0 0 0 0 0 0 0 0 0 0 0 0
Jack 1 2 0 3 0 1 3 0 1 0 0 0 0
John 0 0 0 0 0 0 0 0 0 0 0 0 0
Tim 0 0 0 0 0 0 0 0 0 0 0 0 0
Working with periods instead of dates and converting to strings....
x = {'John': 0,
'Dan': 0,
'Tim': 0,
'Andrew': ['2022-04-10','2022-04-11','2022-06-16','2022-06-17','2022-06-18','2022-08-19','2022-08-20','2022-11-24','2022-12-12'],
'Jack': ['2021-12-31','2022-01-01','2022-01-02','2022-03-26','2022-03-27','2022-03-28','2022-05-29','2022-06-01','2022-06-10','2022-06-12','2022-08-13']}
dfe = pd.DataFrame.from_dict(x, 'index').explode(0).set_axis(['dates'], axis=1).rename_axis('name').reset_index()
dfe['dates'] = pd.to_datetime(dfe['dates'], format='%Y-%m-%d').dt.to_period('M')
df_out = dfe.groupby(['name', 'dates']).size().unstack(fill_value=0)
df_out = df_out.drop('1970-01', axis=1)
df_out = df_out.reindex(pd.period_range(df_out.columns.min(), df_out.columns.max(), freq='M'), axis=1, fill_value=0)
print(df_out)
Output:
2021-12 2022-01 2022-02 2022-03 2022-04 2022-05 2022-06 2022-07 2022-08 2022-09 2022-10 2022-11 2022-12
name
Andrew 0 0 0 0 2 0 3 0 2 0 0 1 1
Dan 0 0 0 0 0 0 0 0 0 0 0 0 0
Jack 1 2 0 3 0 1 3 0 1 0 0 0 0
John 0 0 0 0 0 0 0 0 0 0 0 0 0
Tim 0 0 0 0 0 0 0 0 0 0 0 0 0