I have the following code:
Cars = ["Toyota Supra","Toyota","Nissan","Honda Civic","BMW","Opel Corsa","Toyota Trueno"]
for item in Cars:
    if "Toyota" in item:
        print(item)
The output for that code shows:
Toyota Supra
Toyota
Toyota Trueno
I would like to know if there is a way to return a more precise result, namely how closely each item matches the search term.
For example:
"Toyota" should give 100%
"Toyota Supra" should give 50%
"Toyota Trueno" should give 50%
Is there a library, or some other way, to get the percentage by which two strings match?
CodePudding user response:
There are many ways of comparing how similar two strings are. One such method is the Levenshtein distance, which counts how many single-character edits (insertions, deletions, or substitutions) are needed to change one string into another. There is a Python library available for that: python-Levenshtein.
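To make the Levenshtein idea concrete without installing anything, here is a minimal pure-Python sketch. The function names `levenshtein` and `similarity_percent` are made up for this illustration and are not the API of python-Levenshtein. Dividing by the length of the longer string is only one common way to turn the distance into a percentage.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions, or
    substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity_percent(a: str, b: str) -> float:
    """Hypothetical helper: scale the distance by the longer string's length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty strings are identical
    return 100.0 * (1 - levenshtein(a, b) / longest)


print(levenshtein("Toyota", "Toyota Supra"))         # 6
print(similarity_percent("Toyota", "Toyota Supra"))  # 50.0
```

With this particular normalization, "Toyota" against "Toyota Supra" comes out at 50%, because six insertions are needed and the longer string has twelve characters.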
Another method is Ratcliff/Obershelp pattern recognition, which divides twice the number of matching characters by the total number of characters in both strings. An implementation of a similar algorithm is included in Python's standard library as difflib.SequenceMatcher:
from difflib import SequenceMatcher
SequenceMatcher(None, "Toyota", "Toyota Supra").ratio()
# returns 0.6666...
Using the latter, you can, for example, sort the list by similarity to the search term:
sorted(Cars, key=lambda s: SequenceMatcher(None, s, "Toyota").ratio())
# last entry in list is the best match
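If you also want to see the score next to each entry, a small variation on the same idea could look like the sketch below. The helper name `rank_by_similarity` is invented for this example; it only wraps `SequenceMatcher.ratio()` and sorts best match first.

```python
from difflib import SequenceMatcher

cars = ["Toyota Supra", "Toyota", "Nissan", "Honda Civic", "BMW", "Opel Corsa", "Toyota Trueno"]


def rank_by_similarity(query, candidates):
    """Return (candidate, percent) pairs, best match first."""
    scored = [
        (name, round(SequenceMatcher(None, name, query).ratio() * 100, 1))
        for name in candidates
    ]
    # Sort by the percentage, highest first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_by_similarity("Toyota", cars)
for name, percent in ranking[:3]:
    print(f"{name}: {percent}%")
```

Note that these percentages follow the ratio described above (twice the matches over the combined length), so "Toyota Supra" scores about 66.7% rather than 50%.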
CodePudding user response:
I am not sure whether this is what you meant, or whether it is the best way to do it, but I calculated what percentage of each matching item is made up of the search term.
Code:
SEARCH_TERM = "Toyota"
CARS = ["Toyota Supra","Toyota","Nissan","Honda Civic","BMW","Opel Corsa","Toyota Trueno"]
for item in CARS:
    if SEARCH_TERM in item:
        not_matching_chars = len(item.replace(SEARCH_TERM, ""))
        all_chars = len(item)
        percent = 100 - ((not_matching_chars / all_chars) * 100)
        print(f"{item}: {percent}% matching")
Output:
Toyota Supra: 50.0% matching
Toyota: 100.0% matching
Toyota Trueno: 46.15384615384615% matching
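"Toyota Trueno" comes out at 46.15% rather than 50% because the calculation counts characters, including the space. If the 50% figures in the question are meant per word (one matching word out of two), a word-based variant might be closer to what was asked. This is only a sketch under that assumption; `word_match_percent` is a hypothetical helper, and it compares whole words case-insensitively.

```python
def word_match_percent(search_term, item):
    """Percentage of the words in item that also occur in search_term."""
    search_words = set(search_term.lower().split())
    item_words = item.lower().split()
    if not item_words:
        return 0.0  # nothing to compare
    matching = sum(1 for word in item_words if word in search_words)
    return 100.0 * matching / len(item_words)


SEARCH_TERM = "Toyota"
CARS = ["Toyota Supra", "Toyota", "Nissan", "Honda Civic", "BMW", "Opel Corsa", "Toyota Trueno"]

for item in CARS:
    percent = word_match_percent(SEARCH_TERM, item)
    if percent > 0:
        print(f"{item}: {percent}% matching")
```

Under this word-based reading, "Toyota Supra" and "Toyota Trueno" both score 50% and "Toyota" scores 100%.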