Home > Back-end >  Parsing a list of dictionaries to a table/csv
Parsing a list of dictionaries to a table/csv

Time:09-22

I have a dictionary containing nested lists of dictionaries that I want to flatten and unpack to a csv file. The keys of the initial dictionary (called 'readings') are a list of physical place identitifers (e.g. 111, 222, etc.)

list_of_attributes = readings.keys()
print(list_of_attributes)

result:

dict_keys(['000111', '000967', '000073' ...

For every key there is a nested list of further dictionaries containing time-series like data:

list(readings.values())[0]
[{'Approval': {'ApprovalLevel': 650, 'LevelDescription': 'Working'},
  'FieldVisitIdentifier': '2a328f51-aac7-494d-9f08-69716a06b179',
  'Value': {'Unit': 'ft', 'Numeric': -1.53},
  'DatumConvertedValues': [{'TargetDatum': 'NAVD88',
    'Unit': 'ft',
    'Numeric': 4.196010479999999},
   {'TargetDatum': 'NGVD29', 'Unit': 'ft', 'Numeric': 4.779999999999999},
   {'TargetDatum': 'Local Assumed Datum', 'Unit': 'ft', 'Numeric': -0.28}],
  'Parameter': 'GWL',
  'MonitoringMethod': 'Steel Tape',
  'Time': '2021-05-25T18:51:00.0000000 00:00',
  'Comments': 'MP = hole in top of SS =  1.25 ft LSD',
  'Publish': True,
  'ReadingType': 'Routine',
  'ReferencePointUniqueId': '13208d3d0fe846c08ba7d6777b040433',
  'UseLocationDatumAsReference': False},

...

I've had some success in using Amir Ziai's flatten_json package to flatten and unpack the nested lists of dictionaries out to a csv with the following:

data = {(k, i): flatten(item) for k, v in readings.items() for i, item in enumerate(v)}
df = pd.DataFrame.from_dict(data, orient='index')
df.to_csv(r"C:\test.csv", index=False)

Resultant csv:

enter image description here

This is exactly what I want but I also need to somehow map the intial keys ('000111', '000967', '000073' ) to a column for every associated record. That would look like this

enter image description here

UPDATE: json sample from endpoint

Response Body
{
  "FieldVisitReadings": [
    {
      "Approval": {
        "ApprovalLevel": 650,
        "LevelDescription": "Working"
      },
      "FieldVisitIdentifier": "2a328f51-aac7-494d-9f08-69716a06b179",
      "Value": {
        "Unit": "ft",
        "Numeric": -1.53
      },
      "DatumConvertedValues": [
        {
          "TargetDatum": "NAVD88",
          "Unit": "ft",
          "Numeric": 4.196010479999999
        },
        {
          "TargetDatum": "NGVD29",
          "Unit": "ft",
          "Numeric": 4.779999999999999
        },
        {
          "TargetDatum": "Local Assumed Datum",
          "Unit": "ft",
          "Numeric": -0.28
        }
      ],
      "Parameter": "GWL",
      "MonitoringMethod": "Steel Tape",
      "Time": "2021-05-25T18:51:00.0000000 00:00",
      "Comments": "MP = hole in top of SS =  1.25 ft LSD",
      "Publish": true,
      "ReadingType": "Routine",
      "ReferencePointUniqueId": "13208d3d0fe846c08ba7d6777b040433",
      "UseLocationDatumAsReference": false
    },

UPDATE 2:

# Step 1: Fetch the list of locations
locationDescriptions = client.publish.get('/GetLocationist')['LocationDescriptions']



#Pass list of identifiders to 2nd endpoint  and create dictionary.   Keys are list of locations from Step 1
readings = dict(
    (loc['Identifier'],
     client.publish.get('/GetReadingsByLocation', params={
        'LocationIdentifier': loc['Identifier'],
        'Parameters': ['GWL'],
        'ApplyDatumConversion': 'true'
     })['FieldVisitReadings']
    ) for loc in locationDescriptions)

CodePudding user response:

You can use DataFrame.drop_level() DataFrame.reset_index() to get your column:

import pandas as pd
from flatten_json import flatten

readings = {
    "000111": [
        {
            "Approval": {"ApprovalLevel": 650, "LevelDescription": "Working"},
            "FieldVisitIdentifier": "2a328f51-aac7-494d-9f08-69716a06b179",
            "Value": {"Unit": "ft", "Numeric": -1.53},
            "DatumConvertedValues": [
                {
                    "TargetDatum": "NAVD88",
                    "Unit": "ft",
                    "Numeric": 4.196010479999999,
                },
                {
                    "TargetDatum": "NGVD29",
                    "Unit": "ft",
                    "Numeric": 4.779999999999999,
                },
                {
                    "TargetDatum": "Local Assumed Datum",
                    "Unit": "ft",
                    "Numeric": -0.28,
                },
            ],
            "Parameter": "GWL",
            "MonitoringMethod": "Steel Tape",
            "Time": "2021-05-25T18:51:00.0000000 00:00",
            "Comments": "MP = hole in top of SS =  1.25 ft LSD",
            "Publish": True,
            "ReadingType": "Routine",
            "ReferencePointUniqueId": "13208d3d0fe846c08ba7d6777b040433",
            "UseLocationDatumAsReference": False,
        },
        {
            "Approval": {"ApprovalLevel": 1200, "LevelDescription": "Working"},
            "FieldVisitIdentifier": "xxx",
            "Value": {"Unit": "ft", "Numeric": -1.53},
            "DatumConvertedValues": [
                {
                    "TargetDatum": "NAVD88",
                    "Unit": "ft",
                    "Numeric": 4.196010479999999,
                },
                {
                    "TargetDatum": "NGVD29",
                    "Unit": "ft",
                    "Numeric": 4.779999999999999,
                },
                {
                    "TargetDatum": "Local Assumed Datum",
                    "Unit": "ft",
                    "Numeric": -0.28,
                },
            ],
            "Parameter": "GWL",
            "MonitoringMethod": "Steel Tape",
            "Time": "2021-05-25T18:51:00.0000000 00:00",
            "Comments": "MP = hole in top of SS =  1.25 ft LSD",
            "Publish": True,
            "ReadingType": "Routine",
            "ReferencePointUniqueId": "13208d3d0fe846c08ba7d6777b040433",
            "UseLocationDatumAsReference": False,
        },
    ]
}

data = {
    (k, i): flatten(item)
    for k, v in readings.items()
    for i, item in enumerate(v)
}
df = pd.DataFrame.from_dict(data, orient="index")
df = df.droplevel(level=1).reset_index().rename(columns={"index": "Keys"})
print(df)
df.to_csv("data.csv", index=False)

Prints:

     Keys  Approval_ApprovalLevel Approval_LevelDescription                  FieldVisitIdentifier Value_Unit  Value_Numeric DatumConvertedValues_0_TargetDatum DatumConvertedValues_0_Unit  DatumConvertedValues_0_Numeric DatumConvertedValues_1_TargetDatum DatumConvertedValues_1_Unit  DatumConvertedValues_1_Numeric DatumConvertedValues_2_TargetDatum DatumConvertedValues_2_Unit  DatumConvertedValues_2_Numeric Parameter MonitoringMethod                               Time                               Comments  Publish ReadingType            ReferencePointUniqueId  UseLocationDatumAsReference
0  000111                     650                   Working  2a328f51-aac7-494d-9f08-69716a06b179         ft          -1.53                             NAVD88                          ft                         4.19601                             NGVD29                          ft                            4.78                Local Assumed Datum                          ft                           -0.28       GWL       Steel Tape  2021-05-25T18:51:00.0000000 00:00  MP = hole in top of SS =  1.25 ft LSD     True     Routine  13208d3d0fe846c08ba7d6777b040433                        False
1  000111                    1200                   Working                                   xxx         ft          -1.53                             NAVD88                          ft                         4.19601                             NGVD29                          ft                            4.78                Local Assumed Datum                          ft                           -0.28       GWL       Steel Tape  2021-05-25T18:51:00.0000000 00:00  MP = hole in top of SS =  1.25 ft LSD     True     Routine  13208d3d0fe846c08ba7d6777b040433                        False

and saves data.csv:

enter image description here

  • Related