Home > Enterprise >  Transform Dictionary with date values into Dataframe with date values as counts per each month
Transform Dictionary with date values into Dataframe with date values as counts per each month

Time:12-31

I have been struggling with transforming the below x dictionary, into the table/dictionary below.

 x = {'John': 0,
 'Dan': 0,
 'Tim': 0,
 'Andrew': ['2022-04-10','2022-04-11','2022-06-16','2022-06-17','2022-06-18','2022-08-19','2022-08-20','2022-11-24','2022-12-12'],
 'Jack': ['2021-12-31','2022-01-01','2022-01-02','2022-03-26','2022-03-27','2022-03-28','2022-05-29','2022-06-01','2022-06-10','2022-06-12','2022-08-13']}

I want to put the names (keys) into Column 0, the dates I would like to present just the count per each year-month column

Result:

df = pd.DataFrame(data = {'Name':['John', 'Dan', 'Tim', 'Andrew', 'Jack'],
    '2021-12':[0,0,0,0,1],
   '2022-01' :[0,0,0,0,2],
   '2022-02' :[0,0,0,0,0],
   '2022-03' :[0,0,0,0,3],
   '2022-04' :[0,0,0,1,0],
    '2022-05' :[0,0,0,0,1],
    '2022-06' :[0,0,0,3,3],
    '2022-07' :[0,0,0,0,0],
    '2022-08' :[0,0,0,2,1],
    '2022-09' :[0,0,0,0,0],
    '2022-10' :[0,0,0,0,0],
    '2022-11' :[0,0,0,1,0],
    '2022-12' :[0,0,0,1,0]})

This is the final df result(values are representing the counts per each Month): enter image description here

CodePudding user response:

IIUC, you can try something like this:

x = {'John': 0,
 'Dan': 0,
 'Tim': 0,
 'Andrew': ['2022-04-10','2022-04-11','2022-06-16','2022-06-17','2022-06-18','2022-08-19','2022-08-20','2022-11-24','2022-12-12'],
 'Jack': ['2021-12-31','2022-01-01','2022-01-02','2022-03-26','2022-03-27','2022-03-28','2022-05-29','2022-06-01','2022-06-10','2022-06-12','2022-08-13']}

dfe = pd.DataFrame.from_dict(x, 'index').explode(0).set_axis(['dates'], axis=1).rename_axis('name').reset_index()

dfe['dates'] = pd.to_datetime(dfe['dates'], format='%Y-%m-%d')

df_out = dfe.groupby(['name',pd.Grouper(key='dates', freq='M')]).size().unstack(fill_value=0)
df_out = df_out.drop('1970-01-31', axis=1)
df_out = df_out.reindex(pd.date_range(df_out.columns.min(), df_out.columns.max(), freq='M'), axis=1, fill_value=0)
df_out.columns = df_out.columns.strftime('%Y-%m')
print(df_out)

Output:

        2021-12  2022-01  2022-02  2022-03  2022-04  2022-05  2022-06  2022-07  2022-08  2022-09  2022-10  2022-11  2022-12
name                                                                                                                       
Andrew        0        0        0        0        2        0        3        0        2        0        0        1        1
Dan           0        0        0        0        0        0        0        0        0        0        0        0        0
Jack          1        2        0        3        0        1        3        0        1        0        0        0        0
John          0        0        0        0        0        0        0        0        0        0        0        0        0
Tim           0        0        0        0        0        0        0        0        0        0        0        0        0

Working with periods instead of dates and converting to strings....

x = {'John': 0,
 'Dan': 0,
 'Tim': 0,
 'Andrew': ['2022-04-10','2022-04-11','2022-06-16','2022-06-17','2022-06-18','2022-08-19','2022-08-20','2022-11-24','2022-12-12'],
 'Jack': ['2021-12-31','2022-01-01','2022-01-02','2022-03-26','2022-03-27','2022-03-28','2022-05-29','2022-06-01','2022-06-10','2022-06-12','2022-08-13']}

dfe = pd.DataFrame.from_dict(x, 'index').explode(0).set_axis(['dates'], axis=1).rename_axis('name').reset_index()

dfe['dates'] = pd.to_datetime(dfe['dates'], format='%Y-%m-%d').dt.to_period('M')

df_out = dfe.groupby(['name', 'dates']).size().unstack(fill_value=0)
df_out = df_out.drop('1970-01', axis=1)
df_out = df_out.reindex(pd.period_range(df_out.columns.min(), df_out.columns.max(), freq='M'), axis=1, fill_value=0)
print(df_out)

Output:

        2021-12  2022-01  2022-02  2022-03  2022-04  2022-05  2022-06  2022-07  2022-08  2022-09  2022-10  2022-11  2022-12
name                                                                                                                       
Andrew        0        0        0        0        2        0        3        0        2        0        0        1        1
Dan           0        0        0        0        0        0        0        0        0        0        0        0        0
Jack          1        2        0        3        0        1        3        0        1        0        0        0        0
John          0        0        0        0        0        0        0        0        0        0        0        0        0
Tim           0        0        0        0        0        0        0        0        0        0        0        0        0
  • Related