I have a list of dictionaries where I want to get a new list of dictionaries with unique two keys: 1. City, 2. Country.
list = [
{ City: "Gujranwala", Country: "Pakistan", other_columns },
{ City: "Gujrwanala", Country: "India", other_columns },
{ City: "Gujranwala", Country: "Pakistan", other_columns }
]
The output should be:
list = [
{ City: "Gujranwala", Country: "Pakistan", other_columns },
{ City: "Gujrwanala", Country: "India", other_columns }
]
CodePudding user response:
You can first extract the key-value pairs from the dicts and then remove duplicates by using a set. So you can do something like this:
- Convert dicts into a list of dict_items:
dict_items = [tuple(d.items()) for d in lst] # they need to be tuples, otherwise you wouldn't be able to cast the list to a set
- Deduplicate:
deduplicated = set(dict_items)
- Convert the dict_items back to dicts:
back_to_dicts = [dict(i) for i in deduplicated]
CodePudding user response:
One way to do this reduction is to have a dictionary with a unique key for every city
, country
combination. In my case I've just concatenated both those properties for the key which is a simple working solution.
We are using a dictionary here as the lookup on a dictionary happens in constant time, so the whole algorithm will run in O(n)
.
lst = [
{"City": "Gujranwala", "Country": "Pakistan"},
{"City": "Gujrwanala", "Country": "India"},
{"City": "Gujranwala", "Country": "Pakistan"}
]
unique = dict()
for item in lst:
# concatenate key
key = f"{item['City']}{item['Country']}"
# only add the value to the dictionary if we do not already have an item with this key
if not key in unique:
unique[key] = item
# get the dictionary values (we don't care about the keys now)
result = list(unique.values())
print(result)
Expected ouput:
[{'City': 'Gujranwala', 'Country': 'Pakistan'}, {'City': 'Gujrwanala', 'Country': 'India'}]
CodePudding user response:
I'm sure there are many other and probably better approaches to this problem, but you can use:
l = [
{ "City": "Gujranwala", "Country": "Pakistan" },
{ "City": "Gujrwanala", "Country": "India" },
{ "City": "Gujranwala", "Country": "Pakistan" }
]
ll, v = [], []
for d in l:
k = d["City"] d["Country"]
if not k in v:
v.append(k)
ll.append(d)
print(ll)
# [{'City': 'Gujranwala', 'Country': 'Pakistan'}, {'City': 'Gujrwanala', 'Country': 'India'}]`
We basically create a list with unique values containing the city and country that we use to verify if both values are already present on the final list.