I am trying to filter a list of dictionaries based on some filter criteria.
Unfiltered Records:
records = [
{"category": "aircraft", "model": "F35", "type": "", "year": "2019", "sc": "*"},
{"category": "aircraft", "model": "airline", "type": "", "year": "2019", "sc": "*"},
{"category": "automobile", "model": "", "type": "F35", "year": "2019", "sc": "*"},
{"category": "aircraft", "model": "F35", "type": "", "year": "2022", "sc": "1"},
{"category": "aircraft", "model": "airline", "type": "", "year": "2022", "sc": "*"},
{"category": "automobile", "model": "", "type": "F35", "year": "2022", "sc": "*"},
{"category": "aircraft", "model": "F35", "type": "", "year": "2019", "sc": "23"},
]
filter criteria:
search_terms = {
"category": ["aircraft"],
"model": ["F35"],
"year": ["2019"],
"sc": ["23","1"]
}
expected output:
expected_output= [
{'category': 'aircraft', 'model': 'F35', 'type': '', 'year': '2019', 'sc': '23'}
]
If none of the search terms for a key match, we need to fall back to "*" for that particular key.
My Code:
records = [
{"category": "aircraft", "model": "F35", "type": "", "year": "2019", "sc": "*"},
{"category": "aircraft", "model": "airline", "type": "", "year": "2019", "sc": "*"},
{"category": "automobile", "model": "", "type": "F35", "year": "2019", "sc": "*"},
{"category": "aircraft", "model": "F35", "type": "", "year": "2022", "sc": "1"},
{"category": "aircraft", "model": "airline", "type": "", "year": "2022", "sc": "*"},
{"category": "automobile", "model": "", "type": "F35", "year": "2022", "sc": "*"},
{"category": "aircraft", "model": "F35", "type": "", "year": "2019", "sc": "23"},
]
search_terms = {
"category": ["aircraft"],
"model": ["F35"],
"year": ["2019"],
"sc": ["23","1"]
}
expected_output = list(filter(lambda item: (item["category"] in search_terms.get("category") or item["category"] == "*")
and (item["model"] in search_terms.get("model") or item["model"] == "*")
and (item["year"] in search_terms.get("year") or item["year"] == "*")
and (item["sc"] in search_terms.get("sc") or item["sc"] == "*"), records))
for output in expected_output:
print(output)
my output:
[{'category': 'aircraft', 'model': 'F35', 'type': '', 'year': '2019', 'sc': '*'}
{'category': 'aircraft', 'model': 'F35', 'type': '', 'year': '2019', 'sc': '23'}]
The output should not contain the dictionary with sc = "*", because "23" is already present. If "23" were not in the records, the output would be:
[{'category': 'aircraft', 'model': 'F35', 'type': '', 'year': '2019', 'sc': '*'}]
"*" has the lowest priority.
Can someone help me see what I'm missing here?
CodePudding user response:
Let's try it this way:
def filter_data():
    # Keep only records whose value for every search key is listed exactly.
    return [
        d for d in records
        if all(d[k] in v for k, v in search_terms.items())
    ]

print(filter_data())
#Output:
[{'category': 'aircraft', 'model': 'F35', 'type': '', 'year': '2019', 'sc': '23'}]
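Note that this comprehension only does exact matching, so it never falls back to "*". A quick check, assuming the "23" record is missing (the two-record list below is a trimmed subset made up for illustration, and filter_data takes its inputs as parameters here):

```python
# Trimmed, illustrative subset of the records: the "23" entry is absent.
records = [
    {"category": "aircraft", "model": "F35", "type": "", "year": "2019", "sc": "*"},
    {"category": "aircraft", "model": "F35", "type": "", "year": "2022", "sc": "1"},
]
search_terms = {
    "category": ["aircraft"],
    "model": ["F35"],
    "year": ["2019"],
    "sc": ["23", "1"],
}


def filter_data(records, search_terms):
    # Keep only records whose value for every search key is listed exactly.
    return [
        d for d in records
        if all(d[k] in v for k, v in search_terms.items())
    ]


exact = filter_data(records, search_terms)
print(exact)  # [] because "*" is not one of the listed "sc" values
```

So the "*" record that the question expects in this case is not returned; a fallback step is still needed.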
CodePudding user response:
Move the "or" to the outermost level, if I understand correctly: if exact matching gives an empty result, fall back to fuzzy matching with "*". You can also simplify the code, like this:
# extend search_terms with '*' for the fuzzy matching
ex_search_terms = { key: val + ['*'] for key, val in search_terms.items() }
filt_fn = lambda record, search_terms: all(record[key] in search_terms[key] for key in search_terms.keys())
expected_output = [ record for record in records if filt_fn(record, search_terms) ] or \
[ record for record in records if filt_fn(record, ex_search_terms) ]
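
The snippet above falls back to "*" for every key at once. If the fallback should instead apply per key, as the question describes, here is a minimal sketch. It assumes the keys are applied in the order they appear in search_terms, and filter_with_fallback is a hypothetical helper name, not something from the question:

```python
records = [
    {"category": "aircraft", "model": "F35", "type": "", "year": "2019", "sc": "*"},
    {"category": "aircraft", "model": "airline", "type": "", "year": "2019", "sc": "*"},
    {"category": "automobile", "model": "", "type": "F35", "year": "2019", "sc": "*"},
    {"category": "aircraft", "model": "F35", "type": "", "year": "2022", "sc": "1"},
    {"category": "aircraft", "model": "airline", "type": "", "year": "2022", "sc": "*"},
    {"category": "automobile", "model": "", "type": "F35", "year": "2022", "sc": "*"},
    {"category": "aircraft", "model": "F35", "type": "", "year": "2019", "sc": "23"},
]
search_terms = {
    "category": ["aircraft"],
    "model": ["F35"],
    "year": ["2019"],
    "sc": ["23", "1"],
}


def filter_with_fallback(records, search_terms, wildcard="*"):
    """Hypothetical helper: narrow the records key by key, exact values first."""
    candidates = list(records)
    for key, wanted in search_terms.items():
        # Records whose value for this key is one of the requested terms.
        exact = [r for r in candidates if r.get(key) in wanted]
        # Use the wildcard records only when no exact match exists for this key.
        candidates = exact or [r for r in candidates if r.get(key) == wildcard]
    return candidates


result = filter_with_fallback(records, search_terms)
print(result)

# Same search with the "23" record removed: the "*" record is used instead.
without_23 = [r for r in records if r["sc"] != "23"]
fallback = filter_with_fallback(without_23, search_terms)
print(fallback)
```

With the sample data, the first call returns only the sc = "23" record, and the second returns only the 2019 F35 record with sc = "*". Because each key narrows the candidates left by the previous one, the result can depend on the key order in search_terms.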