What is the best way to identify what matches the text based on list of dictionaries

Time:09-08

I want to know the best approach for finding which category best matches a piece of text, given a list of dictionaries. I tried using any(), but it didn't work. I also tried a plain loop that goes through the list and returns the text if value == text, but that didn't work as expected either.

data = [
    {
        "categories": "application update",
        "options": ["application", "update computer application", "update computer application", "update application computer"],
    },
    {
        "categories": "computer information",
        "options": ["computer information", "computer information", "computer properties"],
    },
    {
        "categories": "operating system",
        "options": ["operating system", "update computer operating system", "computer operating system"],
    },
    {
        "categories": "adobe software",
        "options": ["adobe software issue", "application adobe issue", "fix my adobe application"],
    },
]
text = "computer update"

What I managed to do is this, but it didn't work as expected:

for i in data:
  for j in i['options']:
    if text== j:
      print(i['categories'])

The data variable above is a list of dictionaries. I receive some text and try to match it to the most suitable category. For example, if text = "update operating system", then the category should be only operating system.

As another example, if text = "fix adobe software", then the category should be only adobe software.

What I'm trying to achieve is to determine which category the text relates to. For example, "update computer" should map to the operating system category, and "adobe software application issue" should map to adobe software.

Thank you

CodePudding user response:

Change

if text == j:

to:

if text in j:

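Using == requires the search text to be an exact match for an option, and none of your options matches exactly. The in operator instead tests whether the search text appears as a contiguous substring of the option, so "update computer" matches "update computer application", but "computer update" (same words, different order) does not.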
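As a quick illustration of how == and in differ on this data, here is a minimal sketch using two option strings copied from the question (the variable name options is only for this example):

```python
# Two option strings taken from the question's data.
options = ["update computer application", "computer operating system"]

# == only succeeds when the whole string is identical.
print("update computer" == options[0])   # False

# in succeeds when the text occurs as one contiguous substring.
print("update computer" in options[0])   # True
print("computer update" in options[0])   # False: same words, different order
print("computer" in options[1])          # True
```

Because word order matters for in, a search text such as "computer update" still finds nothing unless it is split into separate words first.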

EDIT: Or to find any of your search words in your dictionary strings use:

for i in data:
  for j in i['options']:
    if any(word in j for word in text.split()):
      print(i['categories'])
      break

Or, if every search word must be present in the same option, change any to all.
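To compare the two variants side by side, here is a small sketch. The helper name matching_categories and its combine parameter are made up for this example; the data is the list from the question.

```python
data = [
    {"categories": "application update",
     "options": ["application", "update computer application", "update application computer"]},
    {"categories": "computer information",
     "options": ["computer information", "computer properties"]},
    {"categories": "operating system",
     "options": ["operating system", "update computer operating system", "computer operating system"]},
    {"categories": "adobe software",
     "options": ["adobe software issue", "application adobe issue", "fix my adobe application"]},
]

def matching_categories(data, text, combine=any):
    """Return categories with an option containing any (or all) of the search words.

    combine is the built-in any or all; this helper is illustrative only.
    """
    words = text.split()  # split() with no argument ignores extra spaces
    return [
        entry["categories"]
        for entry in data
        if any(combine(word in option for word in words) for option in entry["options"])
    ]

print(matching_categories(data, "computer update", any))
# ['application update', 'computer information', 'operating system']
print(matching_categories(data, "computer update", all))
# ['application update', 'operating system']
```

With any, a single shared word is enough, so the result is broad. With all, both words must appear in the same option, which narrows the result but can still leave more than one category.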

CodePudding user response:

First off, what you are looking for is a partial match, but your code performs an exact match (with ==). Second, since you mention the any() function, my solution uses it:

for datum in data:
    if any("update computer" in option for option in datum["options"]):
        print(datum["categories"])

The output:

application update
operating system

Update

The original post was not clear that we need to match all the keywords. So here is my revised solution.

def options_match(options, keywords):
    """Return True if at least one option matches all the keywords."""
    keywords = keywords.split()    
    for option in options:
        if all(keyword in option for keyword in keywords):
            return True
    return False

for datum in data:
    if options_match(datum["options"], "computer update"):
        print(datum["categories"])

Second Update

Since the poster wants to locate just one category, not all, here is my code revision:

def options_match(options, keywords):
    """Return True if at least one option matches all the keywords."""
    keywords = keywords.split()    
    for option in options:
        if all(keyword in option for keyword in keywords):
            return True
    return False

category = "<NOT FOUND>"
for datum in data:
    if options_match(datum["options"], "computer update"):
        category = datum["categories"]
        break
print(category)
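If requiring every keyword turns out to be too strict (for example, no single option contains all three words of "fix adobe software"), one possible alternative is to score each category by how many whole words its best option shares with the text and keep the highest score. This is only a sketch of that idea, not something the question asked for by name: the function best_category and the word-overlap score are assumptions, and ties go to whichever category comes first in the list.

```python
data = [
    {"categories": "application update",
     "options": ["application", "update computer application", "update application computer"]},
    {"categories": "computer information",
     "options": ["computer information", "computer properties"]},
    {"categories": "operating system",
     "options": ["operating system", "update computer operating system", "computer operating system"]},
    {"categories": "adobe software",
     "options": ["adobe software issue", "application adobe issue", "fix my adobe application"]},
]

def best_category(data, text):
    """Return the category whose best option shares the most words with text.

    Hypothetical helper: returns None when no option shares any word.
    """
    words = set(text.lower().split())
    best_name, best_score = None, 0
    for entry in data:
        # Score of a category = largest word overlap among its options.
        score = max(
            (len(words & set(option.lower().split())) for option in entry["options"]),
            default=0,
        )
        if score > best_score:  # strict '>' keeps the earlier category on ties
            best_name, best_score = entry["categories"], score
    return best_name

print(best_category(data, "fix adobe software"))       # adobe software
print(best_category(data, "update operating system"))  # operating system
print(best_category(data, "computer update"))          # application update (tie, first wins)
```

Note that "computer update" ties between application update and operating system under this score, so the data itself would need more distinctive options for that text to resolve the way the question hopes.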