How to bring the closest value in a comparison in Python?

I have the below code:

Cars = ["Toyota Supra","Toyota","Nissan","Honda Civic","BMW","Opel Corsa","Toyota Trueno"]

for item in Cars:
    if "Toyota" in item:
        print(item)

The output for that code shows:

Toyota Supra
Toyota
Toyota Trueno

I would like to know if there is a way to return the most accurate match.

For example:

For Toyota, the result has to be 100%.

For Toyota Supra, the result has to be 50%.

For Toyota Trueno, the result has to be 50%.

Is there any library or other way to get the percentage by which two values match?

CodePudding user response：

There are many ways of comparing how similar two strings are. One such method is the Levenshtein distance, which measures how many single-character edits are needed to change one string into another. There is a Python library available for that: python-Levenshtein.
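
To make the edit-distance idea concrete, here is a minimal pure-Python sketch of the Levenshtein distance using the usual dynamic-programming approach. The names levenshtein and similarity_percent are my own and not part of any library, and turning the distance into a percentage by dividing by the longer string's length is just one possible convention. The python-Levenshtein package would be the faster choice for real workloads.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the number of single-character edits needed to turn a into b."""
    # previous[j] is the distance between the prefix of a handled so far
    # (initially the empty string) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning i characters into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def similarity_percent(a: str, b: str) -> float:
    """Hypothetical helper: scale the distance to a 0-100 score."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty strings are identical
    return 100.0 * (1 - levenshtein(a, b) / longest)


cars = ["Toyota Supra", "Toyota", "Nissan", "Honda Civic",
        "BMW", "Opel Corsa", "Toyota Trueno"]
for car in cars:
    print(f"{car}: distance {levenshtein('Toyota', car)}, "
          f"{similarity_percent('Toyota', car):.1f}% similar")
```

With this convention "Toyota" scores 100% against itself and 50% against "Toyota Supra", since six insertions are needed to reach a string of twelve characters.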

Another method is Ratcliff/Obershelp pattern recognition, which divides twice the number of matching characters by the total number of characters in both strings. For "Toyota" and "Toyota Supra" that is 2 * 6 / (6 + 12) = 0.6666... An implementation of that is included with Python:

from difflib import SequenceMatcher

SequenceMatcher(None, "Toyota", "Toyota Supra").ratio()
# returns 0.6666...

Using the latter, you can do for example:

sorted(Cars, key=lambda s: SequenceMatcher(None, s, "Toyota").ratio())
# last entry in list is the best match
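
If you also want the percentage next to each entry, the sorting above can be extended along these lines. This is only a sketch: match_percent is a helper name I made up, and the cutoff of 0.6 is simply the default of difflib.get_close_matches, which is part of the standard library and does the ranking and filtering in one call.

```python
from difflib import SequenceMatcher, get_close_matches

cars = ["Toyota Supra", "Toyota", "Nissan", "Honda Civic",
        "BMW", "Opel Corsa", "Toyota Trueno"]
query = "Toyota"


def match_percent(candidate: str, target: str) -> float:
    """Hypothetical helper: SequenceMatcher ratio scaled to 0-100."""
    return 100 * SequenceMatcher(None, candidate, target).ratio()


# Best match first, each entry printed with its score.
ranked = sorted(cars, key=lambda car: match_percent(car, query), reverse=True)
for car in ranked:
    print(f"{car}: {match_percent(car, query):.1f}%")

# Standard-library shortcut: keep at most 3 entries scoring at least 0.6.
best = get_close_matches(query, cars, n=3, cutoff=0.6)
print(best)
```

Raising the cutoff makes the filter stricter, so it is worth trying a few values on your own data.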

CodePudding user response：

I am not sure if this is what you meant, or if this is the best way to do it, but I calculated what percentage of each item's characters is made up by the search term.

Code:

SEARCH_TERM = "Toyota"
CARS = ["Toyota Supra","Toyota","Nissan","Honda Civic","BMW","Opel Corsa","Toyota Trueno"]

for item in CARS:
    if SEARCH_TERM in item:
        not_matching_chars = len(item.replace(SEARCH_TERM, ""))
        all_chars = len(item)
        percent = 100 - ((not_matching_chars / all_chars) * 100)
        print(f"{item}: {percent}% matching")

Output:

Toyota Supra: 50.0% matching
Toyota: 100.0% matching
Toyota Trueno: 46.15384615384615% matching
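
The 46.15% for Toyota Trueno in the output above comes from counting single characters, including the space. If the 50% from the question is meant per word (one of two words matches), a word-based comparison might be closer to what was asked. A rough sketch, assuming words are separated by whitespace and case should be ignored; word_match_percent is just a name I picked:

```python
def word_match_percent(search_term: str, item: str) -> float:
    """Percentage of the words in item that also occur in search_term."""
    search_words = set(search_term.lower().split())
    item_words = item.lower().split()
    if not item_words:
        return 0.0  # nothing to compare against
    matching = sum(1 for word in item_words if word in search_words)
    return 100 * matching / len(item_words)


SEARCH_TERM = "Toyota"
CARS = ["Toyota Supra", "Toyota", "Nissan", "Honda Civic",
        "BMW", "Opel Corsa", "Toyota Trueno"]

for item in CARS:
    percent = word_match_percent(SEARCH_TERM, item)
    if percent > 0:
        print(f"{item}: {percent}% matching")
```

This gives 100.0 for Toyota and 50.0 for both Toyota Supra and Toyota Trueno, but it only counts whole-word matches, so a typo like "Toyta" would score 0.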