Further on from this post here...
How do I amend this solution for one more level of keys, or for n levels?
I have a dictionary of a static structure:
Key: Key: Key: Value
Example Dictionary:
{
    "id_1": {
        "Emissions": {
            "305-1": [
                "2014_249989",
                "2015_339998",
                "2016_617957",
                "2017_827230"
            ],
            "305-2": [
                "2014_33163",
                "2015_64280",
                "2016_502748",
                "2017_675091"
            ],
        },
        "Effluents and Waste": {
            "306-1": [
                "2014_143.29",
                "2015_277.86",
                "2016_385.67",
                "2017_460.6"
            ],
            "306-2": "blah blah blah",
        }
    }
}
I want a DataFrame of this structure:
Grand Key | Parent Key | Child Key | Child Value
Grand Key | Parent Key | Child Key | Child Value
Grand Key | Parent Key | Child Key | Child Value
Grand Key | Parent Key | Child Key | Child Value
Example Desired DataFrame:
id_1 | Emissions | 305-1 | ["2014_249989", "2015_339998", "2016_617957", "2017_827230"]
id_1 | Emissions | 305-2 | ["2014_33163", "2015_64280", "2016_502748", "2017_675091"]
id_1 | Effluents and Waste | 306-1 | ["2014_143.29", "2015_277.86", "2016_385.67", "2017_460.6"]
id_1 | Effluents and Waste | 306-2 | blah blah blah
Attempted Solution:
Many attempts, all variations on adding one more for-loop:
data = [[key, ikey, jkey, value] for key, values in data.items() for ikey, value in values.items() for jkey, value in values.items()]
Please let me know if there are further details or nuances I could clarify.
CodePudding user response:
Try:
import pandas as pd
data = {
    "id_1": {
        "Emissions": {
            "305-1": ["2014_249989", "2015_339998", "2016_617957", "2017_827230"],
            "305-2": ["2014_33163", "2015_64280", "2016_502748", "2017_675091"],
        },
        "Effluents and Waste": {
            "306-1": ["2014_143.29", "2015_277.86", "2016_385.67", "2017_460.6"],
            "306-2": "blah blah blah",
        }
    }
}
def nested_items(d, path=None):
    # Recursively walk the dict, yielding (list_of_keys, leaf_value) pairs.
    path = [] if path is None else path
    for key, value in d.items():
        if isinstance(value, dict):
            yield from nested_items(value, path + [key])
        else:
            yield path + [key], value
res = pd.DataFrame([[*path, value] for path, value in nested_items(data)])
print(res)
Output
0 ... 3
0 id_1 ... [2014_249989, 2015_339998, 2016_617957, 2017_8...
1 id_1 ... [2014_33163, 2015_64280, 2016_502748, 2017_675...
2 id_1 ... [2014_143.29, 2015_277.86, 2016_385.67, 2017_4...
3 id_1 ... blah blah blah
[4 rows x 4 columns]
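One nuance with the recursive approach: if some branches are shallower than others, the key paths have different lengths and the values no longer line up in a single column. Below is a minimal sketch of one way to handle that, assuming you want short paths padded with `None`. The helper name `padded_rows` and the `uneven` sample are hypothetical and only for illustration; the path is kept as a tuple here purely as a stylistic choice.

```python
def nested_items(d, path=()):
    """Yield (keys, value) pairs for every non-dict leaf in a nested dict."""
    for key, value in d.items():
        if isinstance(value, dict):
            yield from nested_items(value, path + (key,))
        else:
            yield path + (key,), value


def padded_rows(d, fill=None):
    """Hypothetical helper: pad short key paths so all values share the last column."""
    items = list(nested_items(d))
    # Deepest key path decides how many key columns every row gets.
    depth = max((len(keys) for keys, _ in items), default=0)
    rows = []
    for keys, value in items:
        padding = [fill] * (depth - len(keys))
        rows.append([*keys, *padding, value])
    return rows


# Illustrative data (not from the question): "note" sits one level higher than "305-1".
uneven = {"id_1": {"Emissions": {"305-1": ["2014_249989"]}, "note": "blah"}}
rows = padded_rows(uneven)
for row in rows:
    print(row)
```

The resulting `rows` list can be passed to `pd.DataFrame(rows)` in the same way as above. With the dictionary from the question every path already has three keys, so no padding is added.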
CodePudding user response:
I amended the solution to handle one more level of keys:
import pandas as pd
data = {
    "id_1": {
        "Emissions": {
            "305-1": [
                "2014_249989",
                "2015_339998",
                "2016_617957",
                "2017_827230"
            ],
            "305-2": [
                "2014_33163",
                "2015_64280",
                "2016_502748",
                "2017_675091"
            ],
        },
        "Effluents and Waste": {
            "306-1": [
                "2014_143.29",
                "2015_277.86",
                "2016_385.67",
                "2017_460.6"
            ],
            "306-2": "blah blah blah",
        }
    }
}
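# Build one row per leaf: three levels of keys followed by the value.
rows = [
    [key, ikey, iikey, value]
    for key, valuess in data.items()
    for ikey, values in valuess.items()
    for iikey, value in values.items()
]
res = pd.DataFrame(rows, columns=["Grand Key", "Parent Key", "Child Key", "Child Value"])
print(res)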
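For the "n keys" part of the question, the fixed comprehension can be generalized by passing the depth as a parameter. This is only a sketch under the assumption that every branch is a dict for at least `depth` levels; the name `rows_at_depth` and the shortened `sample` data are hypothetical. Unlike the recursive answer above, it stops after exactly `depth` keys, so a dict found at the last level is kept as the value instead of being expanded.

```python
def rows_at_depth(d, depth):
    """Hypothetical helper: yield [k1, ..., k_depth, value] for exactly `depth` key levels."""
    if depth == 1:
        # Last key level: whatever is stored here becomes the row's value.
        for key, value in d.items():
            yield [key, value]
    else:
        # Assumes `child` is a dict; a shallower branch raises AttributeError.
        for key, child in d.items():
            for rest in rows_at_depth(child, depth - 1):
                yield [key, *rest]


# Shortened version of the question's data, for illustration only.
sample = {
    "id_1": {
        "Emissions": {"305-1": ["2014_249989", "2015_339998"]},
        "Effluents and Waste": {"306-2": "blah blah blah"},
    }
}

rows = list(rows_at_depth(sample, 3))
for row in rows:
    print(row)
```

With `depth=3` this produces the same rows as the three-loop comprehension, and the list can be handed to `pd.DataFrame(rows, columns=[...])` with one column name per key level plus one for the value.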