Here's the gist: I'm using Django to populate a PostgreSQL database with user data from a third-party API. I call the API from Django so that I can automate filling the DB. I have the models built for the fields that need to be stored.
Here's where I need some help. I've created a list from an API response, but I want to remove duplicate users and combine their lists, like this.
What I have now:
{
    "person_id": "1",
    "account": "5",
    "list": "c"
},
{
    "person_id": "1",
    "account": "5",
    "list": "b"
},
{
    "person_id": "1",
    "account": "5",
    "list": "a"
},
...
What I want:
{
    "person_id": "1",
    "account": "5",
    "list": ["a", "b", "c"]
},
{
    "person_id": "2",
    "account": "5",
    "list": ["a", "c"]
},
{
    "person_id": "3",
    "account": "5",
    "list": ["a", "b"]
},
...
One API call I'm making gets all users in a list, and it responds with:
API RESPONSE
{
    "records": [
        {
            "id": "asdafdgsdfhsdfh",
            "email": "[email protected]",
            "phone_number": " 1123123"
        },
        {
            "id": "asdafdgsdfhsdfh",
            "email": "[email protected]",
            "phone_number": " 1123123"
        },
        ...
    ],
    "marker": 342523452
}
From that response I am iterating over each record and creating a dict to add to a list.
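import requests

def personA(record):
    # Build one entry per (person, list) pair. `account` and `list` are set
    # by the surrounding per-list loop (`list` shadows the builtin here).
    return dict(
        person_id=record["id"],
        account=account,
        list=list,
    )

r = requests.get(link)
resdata = r.json()
# Iterate over the records only; looping over resdata itself would walk its
# keys ("records", "marker") and append every record twice.
for record in resdata["records"]:
    listC = personA(record)
    listData.append(listC)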
I am doing this for each list in the account, so some "person_id" values show up many times and some only once.
What would be the best way for me to create a list in the way that I'm going for?
CodePudding user response:
A dictionary will do here:
# Here, data is assumed to be the flat list of dicts built in the question (listData).
# Grab the account_id from the first element in data. I'm assuming that the account names are the same across all data points, but this is not a serious concern per the comment.
account_id = data[0]["account"]
# We maintain a dictionary that maps from a person_id to a list of strings appearing in the list field for a given person_id.
person_id_elements = {}
# Read each data point into our dictionary.
for data_point in data:
    person_id = data_point["person_id"]
    list_element = data_point["list"]
    if person_id not in person_id_elements:
        person_id_elements[person_id] = []
    person_id_elements[person_id].append(list_element)
# Transform person_id_elements into objects that observe the desired schema.
result = []
for person_id in person_id_elements:
    result.append({
        "person_id": person_id,
        "account": account_id,
        "list": sorted(person_id_elements[person_id])  # Sorted, as shown in the sample output.
    })
print(result)
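To check this against the sample in the question, here is a minimal self-contained sketch of the same idea. The function name `merge_lists` and the sample `data` are illustrative assumptions, not part of the original code, and `dict.setdefault` is used as a shorthand for the membership test above.

```python
def merge_lists(data):
    """Collapse repeated person_id entries into one entry with a sorted list.

    Assumes every entry shares the same account, as in the answer above.
    """
    if not data:
        return []
    account_id = data[0]["account"]
    person_id_elements = {}
    for data_point in data:
        # setdefault creates the empty list the first time a person_id is seen.
        person_id_elements.setdefault(data_point["person_id"], []).append(
            data_point["list"]
        )
    return [
        {"person_id": person_id, "account": account_id, "list": sorted(elements)}
        for person_id, elements in person_id_elements.items()
    ]


# Hypothetical sample modeled on the question's input.
data = [
    {"person_id": "1", "account": "5", "list": "c"},
    {"person_id": "1", "account": "5", "list": "b"},
    {"person_id": "1", "account": "5", "list": "a"},
    {"person_id": "2", "account": "5", "list": "a"},
]
merged = merge_lists(data)
print(merged)
```

Wrapping the logic in a function also handles an empty input, where `data[0]` would otherwise raise an `IndexError`.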
CodePudding user response:
You can create a mapping of the persons to the lists they appear in:
from collections import defaultdict

# Group the flat entries built in the question (listData) rather than the raw
# API records, since only those carry "person_id", "account" and "list".
records = defaultdict(list)
for record in listData:
    key = record["person_id"], record["account"]
    value = record["list"]
    records[key].append(value)

results = [
    {
        "person_id": person_id,
        "account": account,
        "list": lists
    } for (person_id, account), lists in records.items()
]
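If the per-list responses are all available, the intermediate flat list can be skipped and the grouping done while iterating over each list's records. This is only a sketch under assumptions: `records_by_list` is a hypothetical mapping from a list name to that list's "records" array (the shape shown in the API response), and `group_by_person` is an illustrative name. A set is used so that a person appearing twice in the same list is only recorded once.

```python
from collections import defaultdict


def group_by_person(records_by_list, account):
    """Map each person to the sorted names of the lists they appear in."""
    lists_by_person = defaultdict(set)
    for list_name, records in records_by_list.items():
        for record in records:
            # The API response identifies a person by its "id" field.
            lists_by_person[record["id"]].add(list_name)
    return [
        {"person_id": person_id, "account": account, "list": sorted(names)}
        for person_id, names in lists_by_person.items()
    ]


# Hypothetical data: list name -> the "records" array returned for that list.
records_by_list = {
    "a": [{"id": "1"}, {"id": "2"}, {"id": "3"}],
    "b": [{"id": "1"}, {"id": "3"}],
    "c": [{"id": "1"}, {"id": "2"}],
}
grouped = group_by_person(records_by_list, account="5")
print(grouped)
```

With this sample, each person ends up with the same lists as in the desired output from the question.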