How to group a json by a nested key using Python?

Time:07-28

Let's say we have a JSON object in Python:

myJson = [
    {
        "id": "123",
        "name": "alex",
        "meta": {
                    "city": "boston"
        }
    },
    {
        "id": "234",
        "name": "mike",
        "meta": {
                    "city": "seattle"
        }
    },
    {
        "id": "345",
        "name": "jess",
        "meta": {
                    "city": "boston"
        }
    }
]

What is the most efficient way to group this data by city, so that we end up with a JSON structure like this:

myNewJson = [
    {
     "city": "boston",
     "people": [ ... ... ]
    },
    {
     "city": "seattle",
     "people": [ ... ]
    }
]

... where each person's data is included under the "people" key.

Thanks!

CodePudding user response:

A dictionary works well here. Use the city names as keys and a list of people as each value, which needs only a single pass over the data. Then, at the end, go through the dictionary and convert it to a list.

myJson = [
    {
        "id": "123",
        "name": "alex",
        "meta": {
                    "city": "boston"
        }
    },
    {
        "id": "234",
        "name": "mike",
        "meta": {
                    "city": "seattle"
        }
    },
    {
        "id": "345",
        "name": "jess",
        "meta": {
                    "city": "boston"
        }
    }
]

d = dict()  # maps each city to the list of names of people in that city
for e in myJson:
    city = e['meta']['city']
    if city not in d:
        d[city] = list()
    d[city].append(e['name'])

# convert the dictionary to a list of {'city': ..., 'people': ...} records
result = list()
for key, val in d.items():
    result.append({'city': key, 'people': val})

print(result)
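
The code above stores only each person's name. If the goal is to keep the full content of each person under "people", as the question seems to ask, a variant along these lines might work. This is only a sketch: the function name group_by_city and the variable name people are made up for illustration, and it assumes every record has a "meta" dictionary with a "city" key.

```python
from collections import defaultdict

# Sample data from the question (the variable name is illustrative).
people = [
    {"id": "123", "name": "alex", "meta": {"city": "boston"}},
    {"id": "234", "name": "mike", "meta": {"city": "seattle"}},
    {"id": "345", "name": "jess", "meta": {"city": "boston"}},
]


def group_by_city(records):
    """Group whole records by records[i]["meta"]["city"] in one pass."""
    grouped = defaultdict(list)  # missing cities start out as an empty list
    for record in records:
        # Assumes "meta" and "city" are always present; a missing key raises KeyError.
        grouped[record["meta"]["city"]].append(record)
    # Dictionaries preserve insertion order, so cities appear in first-seen order.
    return [{"city": city, "people": members} for city, members in grouped.items()]


result = group_by_city(people)
print(result)
```

Using defaultdict(list) removes the explicit "if city not in d" check while still visiting each record only once.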

CodePudding user response:

Try:

myJson = [
    {"id": "123", "name": "alex", "meta": {"city": "boston"}},
    {"id": "234", "name": "mike", "meta": {"city": "seattle"}},
    {"id": "345", "name": "jess", "meta": {"city": "boston"}},
]

out = {}
for d in myJson:
    out.setdefault(d["meta"]["city"], []).append(d["name"])

out = [{"city": k, "people": v} for k, v in out.items()]
print(out)

Prints (reformatted here for readability):

[
    {"city": "boston", "people": ["alex", "jess"]},
    {"city": "seattle", "people": ["mike"]},
]
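
If the data starts out as JSON text rather than a Python list, the same grouping can be wrapped in json.loads and json.dumps from the standard library. The sketch below assumes the input is a JSON string shaped like the example in the question; the variable names raw, records, grouped, and as_json are illustrative only.

```python
import json

# JSON text with the same shape as the question's data (assumed input).
raw = """
[
    {"id": "123", "name": "alex", "meta": {"city": "boston"}},
    {"id": "234", "name": "mike", "meta": {"city": "seattle"}},
    {"id": "345", "name": "jess", "meta": {"city": "boston"}}
]
"""

records = json.loads(raw)  # parse the JSON text into a list of dictionaries

grouped = {}
for record in records:
    # setdefault returns the existing list for the city, or inserts a new empty one
    grouped.setdefault(record["meta"]["city"], []).append(record["name"])

out = [{"city": city, "people": names} for city, names in grouped.items()]

as_json = json.dumps(out, indent=4)  # serialize back to JSON text
print(as_json)
```

Calling json.dumps produces double-quoted JSON text, whereas printing the Python list directly shows Python's own representation with single quotes.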