How to create different dataframes from dictionaries-CodePudding

I have a dataframe with dictionaries saved under two columns:

Name    Trust_Value                 Affordability_Value
0   J.  {'J.': 0.25, 'M.': 0.23}    {'Z.': 0.024, 'M.': 0.34}
1   M.  {'M.': 0.12, 'S.': 0.14}    {'S.': 0.017, 'B.': 0.21}
1   C.  {'S.': 0.21, 'N.': 0.13}    {'D.': 0.015, 'B.': 0.22}

For each name I would like to have separates dataframes including Name of interest, Trust_Value (key and value in separate columns) and Affordability_Value (key and value in separate columns):

df1 (J.):
Name Trust_Key      Trust_Value         Affordability_Key      Affordability_Value
0   J.  J.             0.25                    Z.                      0.024
        M.             0.23                    M.                      0.34

df2 (M.):
    Name Trust_Key      Trust_Value         Affordability_Key      Affordability_Value
    0   M.  M.             0.12                    S.                      0.017
            S.             0.14                    B.                      0.021


df3 (C.):
    Name Trust_Key      Trust_Value         Affordability_Key      Affordability_Value
    0   M.  S.             0.21                    D.                      0.015
            N.             0.13                    B.                      0.22

I have no difficulties to split key-value pairs: my difficulties are in generating different dataframes that can include these values in separate columns.

The output from df.head().to_dict() is the following (I took only the first three elements):

{'Name': {0: 'J.',
  1: 'M.',
  2: 'C.',
  },
 'Trust_Value': {0: {'J.': 0.25,
   'M.': 0.23, 'D.': 0.22, 'S.':0.12,'N.':0.12}, 1: {'M.': 0.12, 'S.': 0.14, 'C.': 0.12, 'D.': 0.12}, 2: {'S.': 0.21, 'N.': 0.13, 'C.':0.34, 'D.':0.12, 'T.':0.42}}, 'Affordability_Value':{0: {'Z.': 0.024,
   'M.': 0.34, 'D.': 0.21, 'X.':0.23,'N.':0.15}, 1: {'S.': 0.51, 'B.': 0.21, 'C.': 0.29, 'D.': 0.12}, 2: {'D.': 0.26, 'B.': 0.26, 'C.':0.38, 'D2.':0.25, 'T.':0.42}}}

CodePudding user response：

You first need to explode your dictionaries:

df2 = (df.assign(Trust_Key=df['Trust_Value'].apply(lambda d: d.values()),
                 Affordability_Key=df['Affordability_Value'].apply(lambda d: d.values())
                )
         .set_index('Name')
         .apply(pd.Series.explode)
         .reset_index()
      )

Output:

   Name Trust_Value Affordability_Value Trust_Key Affordability_Key
0    J.          J.                  Z.      0.25             0.024
1    J.          M.                  M.      0.23              0.34
2    J.          D.                  D.      0.22              0.21
3    J.          S.                  X.      0.12              0.23
4    J.          N.                  N.      0.12              0.15
5    M.          M.                  S.      0.12              0.51
6    M.          S.                  B.      0.14              0.21
7    M.          C.                  C.      0.12              0.29
8    M.          D.                  D.      0.12              0.12
...

Then you can split the new dataframe using groupby:

for name, d in df2.groupby('Name'):
    print(name)
    print(d)
    # you can save to CSV instead
    # d.to_csv(f'{name}.csv')

Output:

C.
   Name Trust_Value Affordability_Value Trust_Key Affordability_Key
9    C.          S.                  D.      0.21              0.26
10   C.          N.                  B.      0.13              0.26
11   C.          C.                  C.      0.34              0.38
12   C.          D.                 D2.      0.12              0.25
13   C.          T.                  T.      0.42              0.42
...

CodePudding user response：

To reformat your DataFrame, you can save the keys and values of your differents columns, and re-create a new DataFrame with it.

Then, in order to save a unknown number of new dataframe, it is possible to save them into a dictionnary, and acces them by their name.

import pandas as pd

df = pd.DataFrame({'Name': {0: 'J.', 1: 'M.', 2: 'C.', },
                   'Trust_Value': {0: {'J.': 0.25, 'M.': 0.23, 'D.': 0.22, 'S.':0.12,'N.':0.12}, 1: {'M.': 0.12, 'S.': 0.14, 'C.': 0.12, 'D.': 0.12}, 2: {'S.': 0.21, 'N.': 0.13, 'C.':0.34, 'D.':0.12, 'T.':0.42}},
                   'Affordability_Value':{0: {'Z.': 0.024, 'M.': 0.34, 'D.': 0.21, 'X.':0.23,'N.':0.15}, 1: {'S.': 0.51, 'B.': 0.21, 'C.': 0.29, 'D.': 0.12}, 2: {'D.': 0.26, 'B.': 0.26, 'C.':0.38, 'D.':0.25, 'T.':0.42}}})

dict_df = {}
                   
for index, row in df.iterrows():
    #Get Name
    Name = row["Name"]
    
    #Get Trust_Key & Trust_Value
    Trust_Keys = row["Trust_Value"].keys()
    Trust_Values = row["Trust_Value"].values()

    #Get Affordability_Key & Affordability_Value
    Affordability_Keys = row["Affordability_Value"].keys()
    Affordability_Values = row["Affordability_Value"].values()
    
    dfx = pd.DataFrame(list(zip(Trust_Keys, Trust_Values, Affordability_Keys, Affordability_Values)),
                       columns=["Trust_Key", "Trust_Value", "Affordability_Key", "Affordability_Value"])
    
    dict_df[f"df{Name}"] = dfx
    

#Acces the differents dataframes
for df_name in dict_df:
    print(dict_df[df_name])

Hope this answer your question!