Construct pandas DataFrame from items in 3-level nested dictionary with lists of values

Time:01-21

I have the following dictionary:

data = {
    "Building 1": {
        "Energy consumption": {
            "Datetime": ["2020-12-28", "2021-01-04"],
            "Value": [537, 967]},
        "Water consumption": {
            "Datetime": ["2020-12-28", "2021-01-04"],
            "Value": [537, 967]}},
    "Building 2": {
        "Energy consumption": {
            "Datetime": ["2020-12-28", "2021-01-04"],
            "Value": [600, 700]},
        "Water consumption": {
            "Datetime": ["2020-12-28", "2021-01-04"],
            "Value": [800, 500]}}
}

I would like to transform the dictionary into a pandas DataFrame. I would like to get something like this:

[image: desired output table]

I have tried several approaches without success. Does anybody have a clue how to solve this?

CodePudding user response:

Just use a list to store the buildings and use the list index to reference each building. You can also rename the index to Building. In other words, you do not need the keys Building 1, Building 2, ..., Building n.

import pandas as pd

data = [

    {
        "Energy consumption": {
            "Datetime": ["2020-12-28","2021-01-04"],
            "Value": [537,967]},
        "Water consumption": {
            "Datetime": ["2020-12-28","2021-01-04"],
            "Value": [537,967]}
    },

    {
        "Energy consumption": {
            "Datetime": ["2020-12-28","2021-01-04"],
            "Value": [600,700]},
        "Water consumption": {
            "Datetime": ["2020-12-28","2021-01-04"],
            "Value": [800,500]}
    }
]

df = pd.DataFrame.from_dict(data)

print(df)

output:

                                  Energy consumption                                  Water consumption
0  {'Datetime': ['2020-12-28', '2021-01-04'], 'Va...  {'Datetime': ['2020-12-28', '2021-01-04'], 'Va...
1  {'Datetime': ['2020-12-28', '2021-01-04'], 'Va...  {'Datetime': ['2020-12-28', '2021-01-04'], 'Va...
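
The index rename mentioned in this answer could look roughly like the sketch below. It assumes the same list-of-dicts layout as above; `rename_axis` is one way to name the index, and the cells still hold the nested dictionaries, as in the output shown.

```python
import pandas as pd

# Same layout as in the answer: one dict per building, position = building number.
data = [
    {
        "Energy consumption": {"Datetime": ["2020-12-28", "2021-01-04"], "Value": [537, 967]},
        "Water consumption": {"Datetime": ["2020-12-28", "2021-01-04"], "Value": [537, 967]},
    },
    {
        "Energy consumption": {"Datetime": ["2020-12-28", "2021-01-04"], "Value": [600, 700]},
        "Water consumption": {"Datetime": ["2020-12-28", "2021-01-04"], "Value": [800, 500]},
    },
]

# Name the index "Building"; the cell values remain nested dicts.
df = pd.DataFrame(data).rename_axis("Building")
print(df.index.name)
```

Note that this only labels the rows; it does not unpack the Datetime and Value lists.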

CodePudding user response:

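The structure of the nested dictionaries is a little intricate. With data being the nested dictionary from the question, you can first get a Series with a multi-index using unstack, then turn each cell into a Datetime-to-Value mapping, and finally build a new DataFrame and transpose it so that the multi-index becomes the columns: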

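import pandas as pd
# data is the nested dictionary from the question (keyed by building name)
series = pd.DataFrame(data).unstack()
s = series.apply(lambda x: dict(zip(x["Datetime"], x["Value"])))
df = pd.DataFrame(s.values.tolist(), index=series.index).T
print(df)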

Output:

                   Building 1                           Building 2                  
           Energy consumption Water consumption Energy consumption Water consumption
2020-12-28                537               537                600               800
2021-01-04                967               967                700               500
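
As a possible alternative to the unstack approach, here is a minimal self-contained sketch that builds the same two-level columns directly. It assumes every innermost dictionary has "Datetime" and "Value" lists of equal length; the name `columns` and the conversion with `pd.to_datetime` are illustrative choices, not part of the answer above.

```python
import pandas as pd

data = {
    "Building 1": {
        "Energy consumption": {"Datetime": ["2020-12-28", "2021-01-04"], "Value": [537, 967]},
        "Water consumption": {"Datetime": ["2020-12-28", "2021-01-04"], "Value": [537, 967]},
    },
    "Building 2": {
        "Energy consumption": {"Datetime": ["2020-12-28", "2021-01-04"], "Value": [600, 700]},
        "Water consumption": {"Datetime": ["2020-12-28", "2021-01-04"], "Value": [800, 500]},
    },
}

# One Series per (building, metric) pair, indexed by date.
# Tuple keys become a two-level column MultiIndex when passed to DataFrame.
columns = {
    (building, metric): pd.Series(inner["Value"], index=pd.to_datetime(inner["Datetime"]))
    for building, metrics in data.items()
    for metric, inner in metrics.items()
}

df = pd.DataFrame(columns)
print(df)
```

Because each column is a Series with its own date index, pandas aligns the rows on the dates, so buildings with differing dates would get NaN where a value is missing.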