Efficient conversion of multilevel nested dictionary to df

Time:11-25

I looked at several questions but did not find an answer on how to convert a nested dictionary with this irregular structure:

a = {'Cat0': {'brand1': {'b': 0.78, 'c': 1}, 'brand2': {'k': 1, 'c': 1}}, 'Cat1': {'brand4': {'b': 10, 's': 0.0}}, 'Cat2': {'brand1': {'j': 1, 'c': 0.0}}}

to the following pandas dataframe:

    Category    Brand  Peer Value
0   Cat0        brand1 b    0.78
1   Cat0        brand1 c    1
2   Cat0        brand2 k    1
3   Cat0        brand2 c    1
4   Cat1        brand4 b    10
5   Cat1        brand4 s    0.0
6   Cat2        brand1 j    1
7   Cat2        brand1 c    0.0

The dictionary is going to be huge, so I am looking for the most efficient solution. Can you help me?

CodePudding user response:

It's quite straightforward with a comprehension to flatten the dictionary:

import pandas as pd

# One row per (category, brand, peer) leaf of the nested dictionary
df = pd.DataFrame([[k, k1, k2, v]
                   for k, d in a.items()
                   for k1, d1 in d.items()
                   for k2, v in d1.items()],
                  columns=['Category', 'Brand', 'Peer', 'Value'])
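If the triple comprehension is hard to read, here is what it does written as explicit loops. This is only a sketch for illustration: the names `rows` and `df_loop` are mine, and `a` is the dictionary from the question. It should build the same frame, just more verbosely.

```python
import pandas as pd

a = {'Cat0': {'brand1': {'b': 0.78, 'c': 1}, 'brand2': {'k': 1, 'c': 1}},
     'Cat1': {'brand4': {'b': 10, 's': 0.0}},
     'Cat2': {'brand1': {'j': 1, 'c': 0.0}}}

rows = []
for category, brands in a.items():         # first level: categories
    for brand, peers in brands.items():    # second level: brands
        for peer, value in peers.items():  # third level: peer -> value
            rows.append((category, brand, peer, value))

df_loop = pd.DataFrame(rows, columns=['Category', 'Brand', 'Peer', 'Value'])
```

Each innermost key/value pair becomes one row, so the irregular keys at the third level do not matter.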

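You might get slightly better efficiency by using a generator expression instead of the list comprehension. pandas still consumes the generator into a list before building the frame, so any gain is likely to be small (to be tested on the real data):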

df = pd.DataFrame(([k, k1, k2, v]
                   for k, d in a.items()
                   for k1, d1 in d.items()
                   for k2, v in d1.items()),
                  columns=['Category', 'Brand', 'Peer', 'Value'])

Output:

  Category   Brand Peer  Value
0     Cat0  brand1    b   0.78
1     Cat0  brand1    c   1.00
2     Cat0  brand2    k   1.00
3     Cat0  brand2    c   1.00
4     Cat1  brand4    b  10.00
5     Cat1  brand4    s   0.00
6     Cat2  brand1    j   1.00
7     Cat2  brand1    c   0.00
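
To check whether the generator variant actually helps, you could time both on data shaped like yours. A rough sketch, assuming a synthetic dictionary `big` as a stand-in for the real one (its sizes are arbitrary, and `with_list` / `with_generator` are just names I made up for the two variants above):

```python
import timeit

import pandas as pd

COLUMNS = ['Category', 'Brand', 'Peer', 'Value']

# Synthetic stand-in for the real data: 100 categories x 50 brands x 5 peers.
# These sizes are arbitrary assumptions; adjust them to resemble your data.
big = {f'Cat{i}': {f'brand{j}': {f'p{k}': float(k) for k in range(5)}
                   for j in range(50)}
       for i in range(100)}


def with_list(d):
    """Flatten with a list comprehension."""
    return pd.DataFrame([[k, k1, k2, v]
                         for k, d1 in d.items()
                         for k1, d2 in d1.items()
                         for k2, v in d2.items()],
                        columns=COLUMNS)


def with_generator(d):
    """Flatten with a generator expression."""
    return pd.DataFrame(([k, k1, k2, v]
                         for k, d1 in d.items()
                         for k1, d2 in d1.items()
                         for k2, v in d2.items()),
                        columns=COLUMNS)


# Both variants must build the same frame before comparing timings.
assert with_list(big).equals(with_generator(big))

for fn in (with_list, with_generator):
    seconds = timeit.timeit(lambda: fn(big), number=5)
    print(f'{fn.__name__}: {seconds / 5:.4f} s per call')
```

The numbers will depend on your machine, pandas version and the shape of the dictionary, so run it on a realistic sample before picking one.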