Home > OS >  Conversion of double nested dictionary to dataframe
Conversion of double nested dictionary to dataframe

Time:08-16

Could anyone tell me how to get the data from the below nested dict into a table with following columns [unit, siteId, date, value,]

data = {'sitesEnergy': {'timeUnit': 'DAY',
  'unit': 'Wh',
  'count': 5,
  'siteEnergyList': [{'siteId': 2248407,
    'energyValues': {'measuredBy': 'METER',
     'values': 
      [
      {'date': '2022-08-01 00:00:00', 'value': 1084070.0},
      {'date': '2022-08-02 00:00:00', 'value': 1420093.0},
      {'date': '2022-08-03 00:00:00', 'value': 1757618.0},
      {'date': '2022-08-04 00:00:00', 'value': 1685625.0},
      {'date': '2022-08-05 00:00:00', 'value': 1043790.0},
      {'date': '2022-08-06 00:00:00', 'value': 1340688.0},
      {'date': '2022-08-07 00:00:00', 'value': 1555515.0},
      {'date': '2022-08-08 00:00:00', 'value': 1573906.0}]}},
   {'siteId': 1485192,
    'energyValues': {'measuredBy': 'METER',
     'values': 
      [
      {'date': '2022-08-01 00:00:00', 'value': 230484.0},
      {'date': '2022-08-02 00:00:00', 'value': 272969.0},
      {'date': '2022-08-03 00:00:00', 'value': 302500.0},
      {'date': '2022-08-04 00:00:00', 'value': 300594.0},
      {'date': '2022-08-05 00:00:00', 'value': 220641.0},
      {'date': '2022-08-06 00:00:00', 'value': 255484.0},
      {'date': '2022-08-07 00:00:00', 'value': 244516.0},
      {'date': '2022-08-08 00:00:00', 'value': 266532.0}]}}]}}

CodePudding user response:

if exactly this dict:

import pandas as pd
df = pd.DataFrame(data['sitesEnergy']['siteEnergyList'][0]['energyValues']['values'])
df['unit'] = data['sitesEnergy']['unit']
df['siteId'] = data['sitesEnergy']['siteEnergyList'][0]['siteId']

UPDATE:

import pandas as pd
df = pd.DataFrame([[d['siteId'], d['energyValues']['values']] for d in data['sitesEnergy']['siteEnergyList']]).explode(1)
df = pd.concat([df[[0]].rename(columns={0: "siteId"}).reset_index(drop=True), pd.DataFrame(df[1].tolist())], axis=1)
df['unit'] =  data['sitesEnergy']['unit']

CodePudding user response:

Not sure exactly what you are after,

but in case this is your dict:

dict = \
{'unit': 'Wh',
  'count': 5,
  'siteEnergyList': {'siteId': 2248407,
    'energyValues': {'measuredBy': 'METER',
     'values': 
     [{'date': '2022-08-01 00:00:00', 'value': 1084070.0},
      {'date': '2022-08-02 00:00:00', 'value': 1420093.0},
      {'date': '2022-08-03 00:00:00', 'value': 1757618.0},
      {'date': '2022-08-04 00:00:00', 'value': 1685625.0},
      {'date': '2022-08-05 00:00:00', 'value': 1043790.0},
      {'date': '2022-08-06 00:00:00', 'value': 1340688.0},
      {'date': '2022-08-07 00:00:00', 'value': 1555515.0},
      {'date': '2022-08-08 00:00:00', 'value': 1573906.0}]}}}

and this is what you are after:

    unit siteId date                value
0   Wh  2248407 2022-08-01 00:00:00 1084070.0
1   Wh  2248407 2022-08-02 00:00:00 1420093.0
2   Wh  2248407 2022-08-03 00:00:00 1757618.0
3   Wh  2248407 2022-08-04 00:00:00 1685625.0
4   Wh  2248407 2022-08-05 00:00:00 1043790.0
5   Wh  2248407 2022-08-06 00:00:00 1340688.0
6   Wh  2248407 2022-08-07 00:00:00 1555515.0
7   Wh  2248407 2022-08-08 00:00:00 1573906.0

you could:

df = pd.DataFrame(dict['siteEnergyList']['energyValues']['values'])
df.insert(0, 'siteId', dict['siteEnergyList']['siteId'])
df.insert(0, 'unit', dict['unit'])
  • Related