Home > OS >  How to get a specific key value from a nested dictionary in Pandas?
How to get a specific key value from a nested dictionary in Pandas?

Time:10-17

I have a nested JSON-file that looks like this:

[
   {
      "IsRecentlyVerified": true,
      "AddressInfo": {
          "Town": "Haarlem",
      },
      "Connections": [
          {
             "PowerKW": 17,
             "Quantity": 2
          }
       ],
       "NumberOfPoints": 1,
    },
    {
      "IsRecentlyVerified": true,
      "AddressInfo": {
          "Town": "Haarlem",
      },
      "Connections": [
          {
             "PowerKW": 17,
             "Quantity": 1
          },
          {
             "PowerKW": 17,
             "Quantity": 1
          },
          {
             "PowerKW": 17,
             "Quantity": 1
          }
       ],
       "NumberOfPoints": 1,
    }
]

As you can see, the list of this JSON-file consists of two dictionaries that each contains another list (= "Connections") that consists of at least one dictionary. In each dictionary of this JSON-file, I want to select all keys named "Quantity" to make a calculation with its value (so in the example code above, I want to calculate that there are 5 Quantities in total).

With the code below, I created a simple dataframe in Pandas to make this calculation :

import json
import pandas as pd

df = pd.read_json("chargingStations.json")

dfConnections = df["Connections"]
dfConnections = pd.json_normalize(dfConnections)

print(dfConnections)

Which results in: enter image description here

Ideally, I want to get the "Quantity" key from each dictionary, so that I can make a dataframe like this (where each item has its own row): enter image description here

However, I am not sure if this is the best way to make my calculation. I tried to get each value of the "Quantity" key by typing dfConnections = dfConnections.get("Quantity"), but that results in None. So: how can I get the value of each "Quantity" key in each dictionary to make my calculation?

CodePudding user response:

If data is parsed Json data from your question, you can do:

df = pd.DataFrame(
    [
        {
            i: sum(dd["Quantity"] for dd in d["Connections"])
            for i, d in enumerate(data)
        }
    ]
)
print(df)

Prints:

0 1
0 2 3

CodePudding user response:

you can use json_normalize():

import pandas as pd
true=True
a=[
   {
      "IsRecentlyVerified": true,
      "AddressInfo": {
          "Town": "Haarlem",
      },
      "Connections": [
          {
             "PowerKW": 17,
             "Quantity": 2
          }
       ],
       "NumberOfPoints": 1,
    },
      {
      "IsRecentlyVerified": true,
      "AddressInfo": {
          "Town": "Haarlem",
      },
      "Connections": [
          {
             "PowerKW": 17,
             "Quantity": 2
          }
       ],
       "NumberOfPoints": 1,
    },
         {
      "IsRecentlyVerified": true,
      "AddressInfo": {
          "Town": "Haarlem",
      },
      "Connections": [
          {
             "PowerKW": 17,
             "Quantity": 2
          }
       ],
       "NumberOfPoints": 1,
    },
    {
      "IsRecentlyVerified": true,
      "AddressInfo": {
          "Town": "Haarlem",
      },
      "Connections": [
          {
             "PowerKW": 17,
             "Quantity": 1
          },
          {
             "PowerKW": 17,
             "Quantity": 1
          },
          {
             "PowerKW": 17,
             "Quantity": 1
          },
          {
             "PowerKW": 17,
             "Quantity": 1
          },
          {
             "PowerKW": 17,
             "Quantity": 1
          }
       ],
       "NumberOfPoints": 1,
    }
]

After reading the data, we use a group by function to index numbers and get the sum of the quantity.

df=pd.json_normalize(a)
df=df.explode('Connections')
df=df.join(pd.json_normalize(df.pop('Connections')))
df=df.reset_index().groupby('index')['Quantity'].sum().to_frame()
print(df)
'''
    index   Quantity
0   0       2
1   1       3
'''
#or another format
df2=df.T
print(df2)
'''
            0   1
Quantity    2   3

'''

  • Related