Assume I have these two lists of dictionaries:
Dict1=[
{"name": "James","age": "21"},
{"name": "Evelyn","age": "28"},
{"name": "William","age": "31"}
]
Dict2=[
{"name": "James","age": "21","city":"NYC","Gender":"M"},
{"name": "William","age": "25","city":"NYC","Gender":"M"},
{"name": "Ella","age": "17","city":"NYC","Gender":"F"}
]
How can I compare them based on the name key only and get the whole entry from Dict2? In this case the result must be:
new_dict=[
{"name": "James","age": "21","city":"NYC","Gender":"M"},
{"name": "William","age": "25","city":"NYC","Gender":"M"},
]
I am looking for a way to make this comparison based on one or more selected keys, without using a for loop.
CodePudding user response:
What you want to do is impossible without looping. You can hide the looping behind a library or a more convoluted construct, but the simplest approach is to loop.
Now there is a bad way and a good way to loop. The bad one checks each element of Dict1 against each element of Dict2 (O(n*m) complexity).
The good one builds a set of the names in Dict1 and uses it to match the names in Dict2. Set lookups take constant time on average, so the whole comparison is O(n+m):
names = set(d['name'] for d in Dict1)
new_dict = [d for d in Dict2 if d['name'] in names]
output:
>>> new_dict
[{'name': 'James', 'age': '21', 'city': 'NYC', 'Gender': 'M'},
{'name': 'William', 'age': '25', 'city': 'NYC', 'Gender': 'M'}]
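The question also asks about matching on several keys. The same set idea extends to that case if each entry is reduced to a tuple of the selected values. This is only a sketch: match_on_keys and its parameter names are made up for illustration, and it assumes every dictionary contains all the selected keys and that the values are hashable.

```python
def match_on_keys(reference, candidates, keys):
    """Return the entries of `candidates` whose values for `keys` also occur in `reference`."""
    def key_of(entry):
        # Reduce an entry to a hashable tuple of the selected values
        return tuple(entry[k] for k in keys)

    # Build the lookup set once, then filter the candidates in a single pass
    seen = {key_of(entry) for entry in reference}
    return [entry for entry in candidates if key_of(entry) in seen]


Dict1 = [
    {"name": "James", "age": "21"},
    {"name": "Evelyn", "age": "28"},
    {"name": "William", "age": "31"},
]
Dict2 = [
    {"name": "James", "age": "21", "city": "NYC", "Gender": "M"},
    {"name": "William", "age": "25", "city": "NYC", "Gender": "M"},
    {"name": "Ella", "age": "17", "city": "NYC", "Gender": "F"},
]

by_name = match_on_keys(Dict1, Dict2, ["name"])
by_name_and_age = match_on_keys(Dict1, Dict2, ["name", "age"])
print(by_name)
print(by_name_and_age)
```

Matching on name alone returns James and William. Matching on name and age returns only James, because William's age differs between the two lists.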
CodePudding user response:
- Just for clarity, you named your lists "Dicts". This may get confusing in the future.
- Assuming that the names are unique, you could use dictionaries like so:
d1 = {person['name']: person for person in Dict1}
d2 = {person['name']: person for person in Dict2}
result = [person for name, person in d2.items() if name in d1]
# result is:
[
{'name': 'James', 'age': '21', 'city': 'NYC', 'Gender': 'M'},
{'name': 'William', 'age': '25', 'city': 'NYC', 'Gender': 'M'}
]
There are other ways to achieve this, some of them more efficient, but if you need readability, this may help.
Edit: This obviously uses loops, but what you request cannot be done without iterating over the collections. So I assume that you just want to avoid a for block.
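If the goal is only to avoid the for keyword, one possible sketch uses the built-in map and filter instead of comprehensions. They still iterate internally, so this changes the spelling, not the complexity. It assumes every entry has a "name" key.

```python
from operator import itemgetter

Dict1 = [
    {"name": "James", "age": "21"},
    {"name": "Evelyn", "age": "28"},
    {"name": "William", "age": "31"},
]
Dict2 = [
    {"name": "James", "age": "21", "city": "NYC", "Gender": "M"},
    {"name": "William", "age": "25", "city": "NYC", "Gender": "M"},
    {"name": "Ella", "age": "17", "city": "NYC", "Gender": "F"},
]

# map() collects the names from Dict1 into a set for fast membership tests
names = set(map(itemgetter("name"), Dict1))
# filter() keeps only the Dict2 entries whose name is in that set
new_dict = list(filter(lambda d: d["name"] in names, Dict2))
print(new_dict)
```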
CodePudding user response:
A little convoluted, but you can use pandas to avoid writing an explicit loop:
import pandas as pd
Dict1=[
{"name": "James","age": "21"},
{"name": "Evelyn","age": "28"},
{"name": "William","age": "31"}
]
Dict2=[
{"name": "James","age": "21","city":"NYC","Gender":"M"},
{"name": "William","age": "25","city":"NYC","Gender":"M"},
{"name": "Ella","age": "17","city":"NYC","Gender":"F"}
]
df1 = pd.DataFrame(Dict1)
df2 = pd.DataFrame(Dict2)
df = df2[df2['name'].isin(df1['name'])]
print(df.to_dict(orient="records"))
Output:
[{'name': 'James', 'age': '21', 'city': 'NYC', 'Gender': 'M'}, {'name': 'William', 'age': '25', 'city': 'NYC', 'Gender': 'M'}]
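For several keys, isin on a single column is no longer enough. One option is an inner merge on the selected columns, sketched below. The keys list is an assumed choice for illustration, and drop_duplicates is there so that repeated key combinations in Dict1 do not multiply the rows of the result.

```python
import pandas as pd

Dict1 = [
    {"name": "James", "age": "21"},
    {"name": "Evelyn", "age": "28"},
    {"name": "William", "age": "31"},
]
Dict2 = [
    {"name": "James", "age": "21", "city": "NYC", "Gender": "M"},
    {"name": "William", "age": "25", "city": "NYC", "Gender": "M"},
    {"name": "Ella", "age": "17", "city": "NYC", "Gender": "F"},
]

keys = ["name", "age"]  # assumed selection of keys to compare on

df1 = pd.DataFrame(Dict1)
df2 = pd.DataFrame(Dict2)

# Keep only the key columns of df1, one row per distinct combination
wanted = df1[keys].drop_duplicates()
# The inner merge keeps the df2 rows whose key values also appear in df1
matched = df2.merge(wanted, on=keys, how="inner")
records = matched.to_dict(orient="records")
print(records)
```

With name and age as keys, only the James entry is kept, since William's age is 31 in Dict1 and 25 in Dict2.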