Home > Back-end >  Problem reading a nested json with Pandas Dataframe
Problem reading a nested json with Pandas Dataframe

Time:07-09

I have the Json file:

enter code here

[
   {
      "id":"62af9e713fb3db3e96905790",
      "idModel":"62458249eed0bf2860e472a1",
      "modelType":"board",
      "fieldGroup":"b0cddfb41bc9462347387c37077a2de5338a50c797e04f3cf0adb6ae2344e83c",
      "display":{
         "cardFront":true
      },
      "name":"Priority",
      "pos":16384,
      "options":[
         {
            "id":"62af9e713fb3db3e96905791",
            "idCustomField":"62af9e713fb3db3e96905790",
            "value":{
               "text":"Highest"
            },
            "color":"red",
            "pos":16384
         },
         {
            "id":"62af9e713fb3db3e96905792",
            "idCustomField":"62af9e713fb3db3e96905790",
            "value":{
               "text":"High"
            },
            "color":"orange",
            "pos":32768
         },
         {
            "id":"62af9e713fb3db3e96905793",
            "idCustomField":"62af9e713fb3db3e96905790",
            "value":{
               "text":"Medium"
            },
            "color":"yellow",
            "pos":49152
         },
         {
            "id":"62af9e713fb3db3e96905794",
            "idCustomField":"62af9e713fb3db3e96905790",
            "value":{
               "text":"Low"
            },
            "color":"sky",
            "pos":65536
         },
         {
            "id":"62af9e713fb3db3e96905795",
            "idCustomField":"62af9e713fb3db3e96905790",
            "value":{
               "text":"Lowest"
            },
            "color":"blue",
            "pos":81920
         },
         {
            "id":"62af9e713fb3db3e96905796",
            "idCustomField":"62af9e713fb3db3e96905790",
            "value":{
               "text":"Not sure"
            },
            "color":"none",
            "pos":98304
         }
      ],
      "type":"list",
      "isSuggestedField":false
   },
   {
      "id":"62af9e79b90b617ab506e07f",
      "idModel":"62458249eed0bf2860e472a1",
      "modelType":"board",
      "fieldGroup":"e8620327b22fc5e743a35720260ad7f03a0edaef6e64e3c47f37f04f09ce4747",
      "display":{
         "cardFront":true
      },
      "name":"Status",
      "pos":32768,
      "options":[
         {
            "id":"62af9e79b90b617ab506e080",
            "idCustomField":"62af9e79b90b617ab506e07f",
            "value":{
               "text":"To do"
            },
            "color":"orange",
            "pos":16384
         },
         {
            "id":"62af9e79b90b617ab506e081",
            "idCustomField":"62af9e79b90b617ab506e07f",
            "value":{
               "text":"In progress"
            },
            "color":"sky",
            "pos":32768
         },
         {
            "id":"62af9e79b90b617ab506e082",
            "idCustomField":"62af9e79b90b617ab506e07f",
            "value":{
               "text":"Done"
            },
            "color":"blue",
            "pos":49152
         },
         {
            "id":"62af9e79b90b617ab506e083",
            "idCustomField":"62af9e79b90b617ab506e07f",
            "value":{
               "text":"In review"
            },
            "color":"pink",
            "pos":65536
         },
         {
            "id":"62af9e79b90b617ab506e084",
            "idCustomField":"62af9e79b90b617ab506e07f",
            "value":{
               "text":"Approved"
            },
            "color":"purple",
            "pos":81920
         },
         {
            "id":"62af9e79b90b617ab506e085",
            "idCustomField":"62af9e79b90b617ab506e07f",
            "value":{
               "text":"Not sure"
            },
            "color":"none",
            "pos":98304
         }
      ],
      "type":"list",
      "isSuggestedField":false
   },
   {
      "id":"62af9e9aec64591850a2affc",
      "idModel":"62458249eed0bf2860e472a1",
      "modelType":"board",
      "fieldGroup":"6a3e7236de9b7cad94edc4eca26ef6d0950c7ce7c7de9eda04b8639f32e81756",
      "display":{
         "cardFront":true
      },
      "name":"Risk2",
      "pos":49152,
      "options":[
         {
            "id":"62af9e9aec64591850a2affd",
            "idCustomField":"62af9e9aec64591850a2affc",
            "value":{
               "text":"Vermelho"
            },
            "color":"red",
            "pos":16384
         },
         {
            "id":"62af9e9aec64591850a2affe",
            "idCustomField":"62af9e9aec64591850a2affc",
            "value":{
               "text":"Laranja"
            },
            "color":"orange",
            "pos":32768
         },
         {
            "id":"62af9e9aec64591850a2afff",
            "idCustomField":"62af9e9aec64591850a2affc",
            "value":{
               "text":"Amarelo"
            },
            "color":"yellow",
            "pos":49152
         },
         {
            "id":"62af9e9aec64591850a2b000",
            "idCustomField":"62af9e9aec64591850a2affc",
            "value":{
               "text":"Cian"
            },
            "color":"sky",
            "pos":65536
         },
         {
            "id":"62af9e9aec64591850a2b001",
            "idCustomField":"62af9e9aec64591850a2affc",
            "value":{
               "text":"Azul"
            },
            "color":"blue",
            "pos":81920
         },
         {
            "id":"62af9e9aec64591850a2b002",
            "idCustomField":"62af9e9aec64591850a2affc",
            "value":{
               "text":"Cinza"
            },
            "color":"none",
            "pos":98304
         }
      ],
      "type":"list",
      "isSuggestedField":false
   },
   {
      "id":"62af9e9b3d330332322d0b51",
      "idModel":"62458249eed0bf2860e472a1",
      "modelType":"board",
      "fieldGroup":"cbe95065aa1ceca2e0fe02597b0603b64ada5a3290a3fd182c67713972f62fa5",
      "display":{
         "cardFront":true
      },
      "name":"Effort",
      "pos":65536,
      "type":"number",
      "isSuggestedField":false
   },
   {
      "id":"62af9ea5d702f48beebb20a1",
      "idModel":"62458249eed0bf2860e472a1",
      "modelType":"board",
      "fieldGroup":"32e07b2bfb57633e77f8d8ce19d227a98965b1147d6732b76dd7a6a4521efdf6",
      "display":{
         "cardFront":true
      },
      "name":"Date",
      "pos":81920,
      "type":"date",
      "isSuggestedField":false
   },
   {
      "id":"62af9eae84137552f0b773b7",
      "idModel":"62458249eed0bf2860e472a1",
      "modelType":"board",
      "fieldGroup":"e5863f32271bf8a895f8d87c14223f0721b5bedfe41faed6040aa8c772593a8f",
      "display":{
         "cardFront":true
      },
      "name":"Number",
      "pos":98304,
      "type":"number",
      "isSuggestedField":false
   },
   {
      "id":"62af9eb55a33596436b9c7fa",
      "idModel":"62458249eed0bf2860e472a1",
      "modelType":"board",
      "fieldGroup":"516f920dcdac8fe5232e2b6c243fd54fde96b21f5540e4711c6a8c69e4a3a0f1",
      "display":{
         "cardFront":true
      },
      "name":"Text",
      "pos":114688,
      "type":"text",
      "isSuggestedField":false
   },
   {
      "id":"62b4365ca419438696c67670",
      "idModel":"62458249eed0bf2860e472a1",
      "modelType":"board",
      "fieldGroup":"e6ada93711c6506e4654e91f01a6bd417226703a20f7da551c66086d52e36579",
      "display":{
         "cardFront":true
      },
      "name":"Checkbox_field",
      "pos":131072,
      "type":"checkbox",
      "isSuggestedField":false
   },
   {
      "id":"62c22641b4b30a7a955300f4",
      "idModel":"62458249eed0bf2860e472a1",
      "modelType":"board",
      "fieldGroup":"a20f7a7e943e379714813aba0b53e2093e77c5736f18c61d08c193176bbbeafc",
      "display":{
         "cardFront":true
      },
      "name":"Risk",
      "pos":147456,
      "options":[
         {
            "id":"62c22641b4b30a7a955300f5",
            "idCustomField":"62c22641b4b30a7a955300f4",
            "value":{
               "text":"Highest"
            },
            "color":"red",
            "pos":16384
         },
         {
            "id":"62c22641b4b30a7a955300f6",
            "idCustomField":"62c22641b4b30a7a955300f4",
            "value":{
               "text":"High"
            },
            "color":"orange",
            "pos":32768
         },
         {
            "id":"62c22641b4b30a7a955300f7",
            "idCustomField":"62c22641b4b30a7a955300f4",
            "value":{
               "text":"Medium"
            },
            "color":"yellow",
            "pos":49152
         },
         {
            "id":"62c22641b4b30a7a955300f8",
            "idCustomField":"62c22641b4b30a7a955300f4",
            "value":{
               "text":"Low"
            },
            "color":"sky",
            "pos":65536
         },
         {
            "id":"62c22641b4b30a7a955300f9",
            "idCustomField":"62c22641b4b30a7a955300f4",
            "value":{
               "text":"Lowest"
            },
            "color":"blue",
            "pos":81920
         },
         {
            "id":"62c22641b4b30a7a955300fa",
            "idCustomField":"62c22641b4b30a7a955300f4",
            "value":{
               "text":"Not sure"
            },
            "color":"none",
            "pos":98304
         }
      ],
      "type":"list",
      "isSuggestedField":false
   },
   {
      "id":"62c227a0a9f6c20867a1f767",
      "idModel":"62458249eed0bf2860e472a1",
      "modelType":"board",
      "fieldGroup":"2a9bc4c06cef764be97bdd47ab397c597d2e1d8f4d54b48c84eafae36ed21e63",
      "display":{
         "cardFront":true
      },
      "name":"Risk3",
      "pos":163840,
      "options":[
         {
            "id":"62c227a0a9f6c20867a1f768",
            "idCustomField":"62c227a0a9f6c20867a1f767",
            "value":{
               "text":"Alto"
            },
            "color":"none",
            "pos":16384
         }
      ],
      "type":"list",
      "isSuggestedField":false
   },
   {
      "id":"62c2996148e38f2bf09e1088",
      "idModel":"62458249eed0bf2860e472a1",
      "modelType":"board",
      "fieldGroup":"dc10651a093834befed19308dd27fbdb6f552375b4e20e9b1e49fbd5dce97c2d",
      "display":{
         "cardFront":true
      },
      "name":"Dropdown",
      "pos":180224,
      "options":[
         {
            "id":"62c2996148e38f2bf09e1089",
            "idCustomField":"62c2996148e38f2bf09e1088",
            "value":{
               "text":"Jhon"
            },
            "color":"red",
            "pos":16384
         },
         {
            "id":"62c2996148e38f2bf09e108a",
            "idCustomField":"62c2996148e38f2bf09e1088",
            "value":{
               "text":"Paul"
            },
            "color":"green",
            "pos":32768
         },
         {
            "id":"62c2996148e38f2bf09e108b",
            "idCustomField":"62c2996148e38f2bf09e1088",
            "value":{
               "text":"Ringo"
            },
            "color":"sky",
            "pos":49152
         },
         {
            "id":"62c2996148e38f2bf09e108c",
            "idCustomField":"62c2996148e38f2bf09e1088",
            "value":{
               "text":"George"
            },
            "color":"none",
            "pos":65536
         }
      ],
      "type":"list",
      "isSuggestedField":false
   }
]

I would like to extract the nested data at the 'options' level. The problem is that I only manage to extract a part of that data. I attach the code :

import pandas as pd
import json

with open('response.json') as f:
    data = json.load(f)
    df = pd.DataFrame(data)
    df1 = pd.DataFrame(data[0]['options'])
    df2 = pd.DataFrame(data[1]['options'])
    df3 = pd.DataFrame(data[2]['options'])

    df_concat = pd.concat([df1, df2, df3])
    print(df_concat)

And I get the following result:

enter image description here

But there is a part of that information that I am not able to extract as Pandas DataFrame, specifically this part of the json file:

 {
      "id":"62c227a0a9f6c20867a1f767",
      "idModel":"62458249eed0bf2860e472a1",
      "modelType":"board",
      "fieldGroup":"2a9bc4c06cef764be97bdd47ab397c597d2e1d8f4d54b48c84eafae36ed21e63",
      "display":{
         "cardFront":true
      },
      "name":"Risk3",
      "pos":163840,
      "options":[
         {
            "id":"62c227a0a9f6c20867a1f768",
            "idCustomField":"62c227a0a9f6c20867a1f767",
            "value":{
               "text":"Alto"
            },
            "color":"none",
            "pos":16384
         }
      ],
      "type":"list",
      "isSuggestedField":false
   },
   {
      "id":"62c2996148e38f2bf09e1088",
      "idModel":"62458249eed0bf2860e472a1",
      "modelType":"board",
      "fieldGroup":"dc10651a093834befed19308dd27fbdb6f552375b4e20e9b1e49fbd5dce97c2d",
      "display":{
         "cardFront":true
      },
      "name":"Dropdown",
      "pos":180224,
      "options":[
         {
            "id":"62c2996148e38f2bf09e1089",
            "idCustomField":"62c2996148e38f2bf09e1088",
            "value":{
               "text":"Jhon"
            },
            "color":"red",
            "pos":16384
         },
         {
            "id":"62c2996148e38f2bf09e108a",
            "idCustomField":"62c2996148e38f2bf09e1088",
            "value":{
               "text":"Paul"
            },
            "color":"green",
            "pos":32768
         },
         {
            "id":"62c2996148e38f2bf09e108b",
            "idCustomField":"62c2996148e38f2bf09e1088",
            "value":{
               "text":"Ringo"
            },
            "color":"sky",
            "pos":49152
         },
         {
            "id":"62c2996148e38f2bf09e108c",
            "idCustomField":"62c2996148e38f2bf09e1088",
            "value":{
               "text":"George"
            },
            "color":"none",
            "pos":65536
         }
      ],
      "type":"list",
      "isSuggestedField":false
   }

Does anyone have any suggestions?

Thanks in advance.

CodePudding user response:

You should take a look at the json_normalize()) method, here is an example of its use

import json
def create_ais_file(self) -> pd.DataFrame:
    '''Creates a DataFrame of AIS for a given list of IMOS'''
    df = pd.DataFrame(columns=['VesselImo','VesselId', 'VesselName', 'VesselMmsi', 'MessageTimestamp', 'Latitude', 'Longitude', 'Speed', 'Cog', 'Rot', 'Heading', 'NearestPlace', 'NearestPlaceId', 'NearestCountry', 'Distance', 'Destination', 'Eta', 'Draft', 'Dimensions', 'NavStatus', 'TypeCargo', 'Source','ScrollId','TotalRecords'])
    for imos in self.imos_list:
        ais = self.get_position(imos)
        ais = pd.json_normalize(ais, 'AisPositions', ['ScrollId', 'TotalRecords'])
        df = pd.concat([df,ais])
    df.to_parquet('vessels_location.parquet')
    return df

CodePudding user response:

I try it, but no working. I have the same problem:

from pandas.io.json import json_normalize

df_json1 = json_normalize(data[0]['options'])
df_json2 = json_normalize(data[1]['options'])
df_json3 = json_normalize(data[2]['options'])
### concatenate all dataframes ###
df_json = pd.concat([df_json1, df_json2, df_json3])
### print dataframe ###
print(df_json)

Missing data from the 'options' column:

enter image description here

CodePudding user response:

I am beginning to suspect that it is because I have 'Nan' values in the 'options' column.

enter image description here

Any idea how to solve this?

  • Related