How to get the closest match in a string comparison in Python?

Time:12-28

I have the below code:

Cars = ["Toyota Supra","Toyota","Nissan","Honda Civic","BMW","Opel Corsa","Toyota Trueno"]

for item in Cars:
    if "Toyota" in item:
        print(item)

The output for that code shows:

Toyota Supra
Toyota
Toyota Trueno

I would like to know if there is a way to return the most accurate match.

For example:

For Toyota, the result should be 100%

For Toyota Supra, the result should be 50%

For Toyota Trueno, the result should be 50%

Is there a library, or some other way, to get the percentage by which two values match?

CodePudding user response:

There are many ways of comparing how similar two strings are. One such method is the Levenshtein distance, which counts the minimum number of single-character edits (insertions, deletions or substitutions) needed to change one string into another. There is a Python library available for that: python-Levenshtein.
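To make the idea concrete, here is a minimal pure-Python sketch of the Levenshtein distance that does not depend on any third-party package. The function names `levenshtein` and `similarity_percent` are made up for this illustration, and scaling the distance by the length of the longer string is only one of several possible ways to turn it into a percentage.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    # previous[j] is the distance between the prefix of a seen so far and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def similarity_percent(a: str, b: str) -> float:
    """Hypothetical helper: scale the distance to a 0-100 score (an assumption)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein(a, b) / longest)


for car in ["Toyota", "Toyota Supra", "Toyota Trueno"]:
    distance = levenshtein("Toyota", car)
    print(f"{car}: distance {distance}, {similarity_percent('Toyota', car):.1f}% similar")
```

With this scaling, "Toyota Supra" is 6 edits away from "Toyota" and scores 50%, which happens to match the figure asked for in the question.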

Another method is Ratcliff/Obershelp pattern recognition, which divides twice the number of matching characters by the total number of characters in both strings. Python's standard library includes a similar algorithm in difflib:

from difflib import SequenceMatcher

SequenceMatcher(None, "Toyota", "Toyota Supra").ratio()
# returns 0.6666...

Using the latter, you can for example sort the list by similarity to the search term:

sorted(Cars, key=lambda s: SequenceMatcher(None, s, "Toyota").ratio())
# last entry in list is the best match
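If the scores themselves are wanted, and not just the ordering, the same idea can be extended as in the sketch below. The helper `match_percent` is a made-up name for illustration; `reverse=True` simply puts the best match first, and `difflib.get_close_matches` is a standard-library shortcut that drops candidates scoring below a cutoff.

```python
from difflib import SequenceMatcher, get_close_matches

cars = ["Toyota Supra", "Toyota", "Nissan", "Honda Civic", "BMW", "Opel Corsa", "Toyota Trueno"]
search_term = "Toyota"


def match_percent(candidate: str, term: str) -> float:
    """Hypothetical helper: SequenceMatcher ratio expressed as a percentage."""
    return 100 * SequenceMatcher(None, candidate, term).ratio()


# Sort so that the best match comes first instead of last
ranked = sorted(cars, key=lambda car: match_percent(car, search_term), reverse=True)
for car in ranked:
    print(f"{car}: {match_percent(car, search_term):.1f}%")

# Keep at most 3 candidates whose ratio is at least 0.6 (the library defaults)
close = get_close_matches(search_term, cars, n=3, cutoff=0.6)
print(close)
```

Note that these ratios will not be exactly 100%, 50% and 50%: the ratio counts matching characters against the combined length of both strings, so "Toyota Supra" scores about 66.7%.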

CodePudding user response:

I am not sure if this is what you meant, or if this is the best way to do it, but I simply calculated what percentage of each item's characters match the search term.

Code:

SEARCH_TERM = "Toyota"
CARS = ["Toyota Supra","Toyota","Nissan","Honda Civic","BMW","Opel Corsa","Toyota Trueno"]

for item in CARS:
    if SEARCH_TERM in item:
        not_matching_chars = len(item.replace(SEARCH_TERM, ""))
        all_chars = len(item)
        percent = 100 - ((not_matching_chars / all_chars) * 100)
        print(f"{item}: {percent}% matching")

Output:

Toyota Supra: 50.0% matching
Toyota: 100.0% matching
Toyota Trueno: 46.15384615384615% matching
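The question asks for 50% for both "Toyota Supra" and "Toyota Trueno", whereas counting characters gives 46.15% for the latter because "Trueno" is longer than "Supra". One possible variation, assuming that comparing whole words is acceptable, is to count matching words instead of characters. The helper name `word_match_percent` and the case-insensitive comparison are assumptions made for this sketch.

```python
def word_match_percent(item: str, term: str) -> float:
    """Hypothetical helper: share of the words in item that also occur in term."""
    item_words = item.lower().split()
    term_words = set(term.lower().split())
    if not item_words:
        return 0.0
    matching = sum(1 for word in item_words if word in term_words)
    return 100 * matching / len(item_words)


SEARCH_TERM = "Toyota"
CARS = ["Toyota Supra", "Toyota", "Nissan", "Honda Civic", "BMW", "Opel Corsa", "Toyota Trueno"]

for item in CARS:
    percent = word_match_percent(item, SEARCH_TERM)
    if percent > 0:
        print(f"{item}: {percent:.0f}% matching")
```

This treats each word as one unit, so a one-word match out of two words is always 50% regardless of how long the second word is. It will not catch partial or misspelled words, for which a character-based measure is the better fit.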