Filter data based on a condition in JSON

Time:01-20

I am working on a requirement where I need to filter JSON data into a data frame in Python when a condition is satisfied. When I use the code below, I run into the following error. The condition here is arbitrary: I check whether the country code is US, and I need the data frame to be populated with all the records where the country is the USA.

Code:

import json

data = {
  "demographic": [
    {
      "id": 1,
      "country": {
        "code": "AU",
        "name": "Australia"
      },
      "state": {
        "name": "New South Wales"
      },
      "location": {
        "time_zone": {
          "name": "(UTC 10:00) Canberra, Melbourne, Sydney",
          "standard_name": "AUS Eastern Standard Time",
          "symbol": "AUS Eastern Standard Time"
        }
      },
      "address_info": {
        "address_1": "",
        "address_2": "",
        "city": "",
        "zip_code": ""
      }
    },
    {
      "id": 2,
      "country": {
        "code": "AU",
        "name": "Australia"
      },
      "state": {
        "name": "New South Wales"
      },
      "location": {
        "time_zone": {
          "name": "(UTC 10:00) Canberra, Melbourne, Sydney",
          "standard_name": "AUS Eastern Standard Time",
          "symbol": "AUS Eastern Standard Time"
        }
      },
      "address_info": {
        "address_1": "",
        "address_2": "",
        "city": "",
        "zip_code": ""
      }
    },
    {
      "id": 3,
      "country": {
        "code": "US",
        "name": "United States"
      },
      "state": {
        "name": "Illinois"
      },
      "location": {
        "time_zone": {
          "name": "(UTC-06:00) Central Time (US & Canada)",
          "standard_name": "Central Standard Time",
          "symbol": "Central Standard Time"
        }
      },
      "address_info": {
        "address_1": "",
        "address_2": "",
        "city": "",
        "zip_code": "60611"
      }
    }
  ]
}

jd = json.loads(data)
df = [cnt for cnt in jd["demographic"] if cnt["country"]["code"] == "US"]
print(df)

Error:

TypeError: string indices must be integers

Answer:

You don't need to parse a JSON string into a Python dict, because the data variable is already a Python dict!

Remove this line:

jd = json.loads(data)
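
As a quick check of why that call is unnecessary, here is a minimal sketch showing that json.loads expects JSON text (str, bytes or bytearray) and raises a TypeError when it is handed a dict. The names raw, parsed and rejected are illustrative and do not come from the question.

```python
import json

# JSON text (a str): this is the kind of input json.loads is meant to parse.
raw = '{"demographic": [{"id": 3, "country": {"code": "US"}}]}'
parsed = json.loads(raw)
print(type(parsed).__name__)  # dict

# An object that is already a dict cannot be parsed a second time.
try:
    json.loads(parsed)
    rejected = False
except TypeError as exc:
    rejected = True
    print("TypeError:", exc)
```

A "string indices must be integers" message usually points to the opposite situation, where a str is indexed with a string key, so it is worth checking type(data) before indexing.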

Your code then becomes:

df = [cnt for cnt in data["demographic"] if cnt["country"]["code"] == "US"]
print(df)
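
If the payload may arrive either as JSON text or as an already-parsed dict, one possible approach is to parse only when needed. The helper filter_by_country and the shortened sample data are hypothetical, written for illustration under that assumption.

```python
import json


def filter_by_country(payload, code):
    """Return the records under "demographic" whose country code equals `code`."""
    # Parse only when the payload is still JSON text; a dict is used as-is.
    if isinstance(payload, (str, bytes, bytearray)):
        payload = json.loads(payload)
    # .get() avoids a KeyError when a record has no "country" entry.
    return [
        rec for rec in payload.get("demographic", [])
        if rec.get("country", {}).get("code") == code
    ]


# Shortened, illustrative version of the data from the question.
sample = {
    "demographic": [
        {"id": 1, "country": {"code": "AU", "name": "Australia"}},
        {"id": 3, "country": {"code": "US", "name": "United States"}},
    ]
}

print(filter_by_country(sample, "US"))              # dict input
print(filter_by_country(json.dumps(sample), "AU"))  # JSON text input
```

The result is a plain list of dicts. If a pandas DataFrame is the actual goal, a list like this can be passed to pandas.DataFrame, although nested fields such as country stay as dict values unless they are flattened first.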