I am looking for advice on how to cleanly convert a multi-level nested Python dictionary (loaded from JSON) into a boolean pandas DataFrame.
Rules:
- Only True is recorded. A missing entry means False.
- The top-level list may be of any length N.
- There may be any number of groups.
- There may be any number of distinct bools.
Example Input:
{1: {'group_a': {'bool_a': True,
                 'bool_b': True,
                 'bool_n': True},
     'group_n': {'bool_b': True,
                 'bool_n': True}},
 2: {'group_a': {'bool_a': True,
                 'bool_b': True,
                 'bool_n': True},
     'group_n': {'bool_b': True,
                 'bool_n': True}},
 'n': {'group_a': {'bool_a': True,
                   'bool_c': True},
       'group_n': {'bool_b': True}},
}
Desired Output:
   Ga_Ba  Ga_Bb  Ga_Bc  Ga_Bn  Gn_Ba  Gn_Bb  ...  Gn_Bn
1   True   True  False   True  False   True  ...   True
2   True   True  False   True  False   True  ...   True
n   True  False   True  False  False   True  ...  False
...
Any ideas? Bonus points for speed and conciseness. I have a working solution, but I am looking for something more elegant than my current mess of for loops. Alternative data structures are also welcome.
CodePudding user response:
Goofy method #1
s = pd.DataFrame.from_dict(data, orient='index').stack()
pd.json_normalize(s).set_index(s.index) \
.stack().unstack([1, 2], fill_value=False) \
.sort_index(axis=1)
  group_a                      group_n
   bool_a bool_b bool_c bool_n  bool_b bool_n
1    True   True  False   True    True   True
2    True   True  False   True    True   True
3    True  False   True  False    True  False
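If the flat Ga_Ba-style labels from the question are wanted rather than a two-level header, the column levels can be joined afterwards. This is only a sketch: the small frame `out` below is a hand-built stand-in for a result with (group, bool) columns, not the actual output of the code above.

```python
import pandas as pd

# Hypothetical stand-in for a result with two-level (group, bool) columns.
cols = pd.MultiIndex.from_tuples(
    [("group_a", "bool_a"), ("group_a", "bool_b"), ("group_n", "bool_b")]
)
out = pd.DataFrame(
    [[True, True, True], [True, False, True]],
    index=[1, 3],
    columns=cols,
)

# Join each (group, bool) pair into a single flat label.
flat = out.copy()
flat.columns = ["_".join(pair) for pair in flat.columns]
print(flat)
```

The separator is arbitrary; any string that cannot appear in a group or bool name keeps the labels unambiguous.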
CodePudding user response:
You could use a dictionary comprehension and concat:
df = (pd.concat({k: pd.DataFrame(v).stack()
                 for k, v in d.items()}, axis=1)
        .T.fillna(False)
     )
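Another possible route, different from both answers above, is to let `pd.json_normalize` flatten each record directly, which yields single-level column names. A minimal sketch, assuming a sample dict shaped like the question's input (the key 3 stands in for "n") and assuming flat names such as group_a_bool_a are acceptable:

```python
import pandas as pd

# Hypothetical sample shaped like the question's input; key 3 stands in for "n".
data = {
    1: {"group_a": {"bool_a": True, "bool_b": True, "bool_n": True},
        "group_n": {"bool_b": True, "bool_n": True}},
    2: {"group_a": {"bool_a": True, "bool_b": True, "bool_n": True},
        "group_n": {"bool_b": True, "bool_n": True}},
    3: {"group_a": {"bool_a": True, "bool_c": True},
        "group_n": {"bool_b": True}},
}

# Flatten each record; nested keys are joined with "_" (group_a_bool_a, ...).
flat = pd.json_normalize(list(data.values()), sep="_")
flat.index = list(data.keys())

# Missing entries come back as NaN; comparing with True maps them to False.
table = flat.eq(True).sort_index(axis=1)
print(table)
```

Only the group/bool combinations that occur somewhere in the data become columns, so a column such as Gn_Ba from the question would have to be added separately, for example with `reindex` over a known list of names.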