I looked at several questions but could not find an answer on how to convert a nested dictionary with this irregular structure:
a = {'Cat0': {'brand1': {'b': 0.78, 'c': 1}, 'brand2': {'k': 1, 'c': 1}}, 'Cat1': {'brand4': {'b': 10, 's': 0.0}}, 'Cat2': {'brand1': {'j': 1, 'c': 0.0}}}
to the following pandas dataframe:
  Category   Brand Peer  Value
0     Cat0  brand1    b   0.78
1     Cat0  brand1    c      1
2     Cat0  brand2    k      1
3     Cat0  brand2    c      1
4     Cat1  brand4    b     10
5     Cat1  brand4    s    0.0
6     Cat2  brand1    j      1
7     Cat2  brand1    c    0.0
The dictionary is going to be huge, so I am looking for the most efficient solution. Can you help me?
CodePudding user response:
It's quite straightforward to flatten the dictionary with a nested list comprehension:
import pandas as pd

# one row per (category, brand, peer) leaf of the nested dictionary
df = pd.DataFrame([[k, k1, k2, v]
                   for k, d in a.items()
                   for k1, d1 in d.items()
                   for k2, v in d1.items()],
                  columns=['Category', 'Brand', 'Peer', 'Value'])
You might get slightly better efficiency using a generator expression instead of the list comprehension (to be tested on the real data):
df = pd.DataFrame(([k, k1, k2, v]
                   for k, d in a.items()
                   for k1, d1 in d.items()
                   for k2, v in d1.items()),
                  columns=['Category', 'Brand', 'Peer', 'Value'])
Output:
  Category   Brand Peer  Value
0     Cat0  brand1    b   0.78
1     Cat0  brand1    c   1.00
2     Cat0  brand2    k   1.00
3     Cat0  brand2    c   1.00
4     Cat1  brand4    b  10.00
5     Cat1  brand4    s   0.00
6     Cat2  brand1    j   1.00
7     Cat2  brand1    c   0.00
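Since the efficiency claim above is "to be tested on the real data", here is a rough sketch of how the two variants could be compared with timeit. The helper names (make_nested, flatten_list, flatten_gen) and the synthetic dictionary sizes are made up for illustration and are not part of the original question; swap in your real dictionary for a meaningful measurement.

```python
import timeit

import pandas as pd

COLUMNS = ['Category', 'Brand', 'Peer', 'Value']


def make_nested(n_cat=50, n_brand=20, n_peer=10):
    """Build a synthetic nested dict (hypothetical stand-in for the real data)."""
    return {f'Cat{i}': {f'brand{j}': {f'p{k}': float(i + j + k)
                                      for k in range(n_peer)}
                        for j in range(n_brand)}
            for i in range(n_cat)}


def flatten_list(a):
    # List comprehension: every row is built before pandas sees the data.
    return pd.DataFrame([[k, k1, k2, v]
                         for k, d in a.items()
                         for k1, d1 in d.items()
                         for k2, v in d1.items()],
                        columns=COLUMNS)


def flatten_gen(a):
    # Generator expression: rows are produced lazily and consumed by pandas.
    return pd.DataFrame(([k, k1, k2, v]
                         for k, d in a.items()
                         for k1, d1 in d.items()
                         for k2, v in d1.items()),
                        columns=COLUMNS)


big = make_nested()

# Sanity check: both variants should build the same frame.
assert flatten_list(big).equals(flatten_gen(big))

for func in (flatten_list, flatten_gen):
    seconds = timeit.timeit(lambda: func(big), number=5)
    print(f'{func.__name__}: {seconds / 5:.4f} s per call')
```

The timings depend on the machine, the pandas version and the shape of the dictionary, so treat the numbers as a relative comparison only.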