Consider the data below:
diction = {'A_B_D_E_F':0,
'B_C_E':0,
'A_D_E':0}
string = 'A_E_B_F'
I want to increment the count of the key in 'diction' that matches 'string' by the highest percentage. In this case the count of 'A_B_D_E_F' should be increased to 1: if we ignore 'E' in the string, it mostly matches 'A_B_D_E_F'. Note: the order of the items in the string should be the main factor.
I looked at String similarity in Python, but I am not sure whether those approaches take the order of the string's contents into account.
New example:
diction = {'HEADER_Switchprofileoptionclicked__HEADER_profilechangedto:PROD_CE_VIEWER__HEADER_profilechangedto:PROD_CE_ADMIN__HEADER_Switchprofilesubmitbuttonclicked__CDPR_PageLoad__HEADER_Navigatedto:ShortfallAutomationDashboard': 0,
'HEADER_Switchprofileoptionclicked__HEADER_profilechangedto:PROD_CE_VIEWER__HEADER_profilechangedto:PROD_CE_ADMIN__HEADER_Switchprofilepop-upclosed__CDPR_PageLoad__HEADER_Navigatedto:ShortfallAutomationDashboard': 0,
'HEADER_Switchprofileoptionclicked__HEADER_profilechangedto:PROD_CE_ADMIN__HEADER_Switchprofilesubmitbuttonclicked__CDPR_PageLoad__HEADER_Navigatedto:ShortfallAutomationDashboard': 0,
'HEADER_Switchprofileoptionclicked__HEADER_profilechangedto:PROD_CE_ADMIN__HEADER_Switchprofilepop-upclosed__CDPR_PageLoad__HEADER_Navigatedto:ShortfallAutomationDashboard': 0}
string = 'HEADER_Switchprofileoptionclicked___HEADER_profilechangedto:PROD_CE_VIEWER___HEADER_profilechangedto:PROD_CE_ADMIN___HEADER_Switchprofilesubmitbuttonclicked___HEADER_Switchprofilepop-upclosed___CDPR_PageLoad___HEADER_Switchprofileoptionclicked___HEADER_profilechangedto:PROD_CE_ADMIN___HEADER_Switchprofilepop-upclosed___HEADER_Switchprofilesubmitbuttonclicked___CDPR_PageLoad___HEADER_Navigatedto:ShortfallAutomationDashboard'
CodePudding user response:
Try a dictionary comprehension:
>>> {k: (v + 1 if all(x in k.split('_') for x in string.split('_')) else v) for k, v in diction.items()}
{'A_B_D_E_F': 1, 'B_C_E': 0, 'A_D_E': 0}
>>>
Or, better, split the string once and reuse the result:
>>> lst = string.split('_')
>>> {k: (v + 1 if all(x in k.split('_') for x in lst) else v) for k, v in diction.items()}
{'A_B_D_E_F': 1, 'B_C_E': 0, 'A_D_E': 0}
>>>
This adds 1 to a key's value if all of the tokens in string (A, E, B and F) appear in the key name; otherwise the value is left unchanged. Note that this is a membership test, so it does not take the order of the tokens into account.
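Since the question asks for order to be the main factor, one possible order-aware variant is to score each key by the longest common subsequence (LCS) of tokens it shares with string. This is only a sketch and not part of the answer above: the helper name lcs_length and the choice to normalise by the number of tokens in string are assumptions made for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


diction = {'A_B_D_E_F': 0, 'B_C_E': 0, 'A_D_E': 0}
string = 'A_E_B_F'
tokens = string.split('_')

# Assumption: score = share of the string's tokens found, in order, in the key.
scores = {k: lcs_length(tokens, k.split('_')) / len(tokens) for k in diction}
best = max(scores, key=scores.get)
diction[best] += 1
print(diction)
```

With the sample data, 'A_B_D_E_F' shares three tokens in order with string (for example A, B, F), more than either of the other keys, so its count is the one that is incremented.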
CodePudding user response:
You can use difflib to get the closest match:
import difflib
diction = { 'A_B_D_E_F':0,
'B_C_E':0,
'A_D_E':0
}
string = 'A_E_B_F'
diction[difflib.get_close_matches(string, diction.keys())[0]] += 1
print(diction)
This gives the output:
{'A_B_D_E_F': 1, 'B_C_E': 0, 'A_D_E': 0}
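Two caveats may be worth keeping in mind. get_close_matches returns an empty list when no candidate reaches its cutoff (0.6 by default), so indexing [0] can raise IndexError, and it compares the strings character by character. A minimal sketch of a token-level alternative, using difflib.SequenceMatcher on the lists of '_'-separated tokens, could look like this; the helper name token_ratio is hypothetical.

```python
import difflib

diction = {'A_B_D_E_F': 0, 'B_C_E': 0, 'A_D_E': 0}
string = 'A_E_B_F'


def token_ratio(a, b, sep='_'):
    """Similarity in [0, 1] between two strings, compared token by token."""
    return difflib.SequenceMatcher(None, a.split(sep), b.split(sep)).ratio()


# max() always returns a key for a non-empty dict, so there is no [0] to fail.
best = max(diction, key=lambda k: token_ratio(string, k))
diction[best] += 1
print(diction)
```

For the longer example in the question, where tokens themselves contain single underscores, splitting on '_' would break them apart; a different separator would be needed there.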
CodePudding user response:
As you said, you can use levenshtein_distance (from the jellyfish library), and yes, it does take the order of the characters into account.
You can do this:
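import jellyfish
# diction and string are defined as in the question.
# Start from infinity so that the first key always becomes a candidate.
min_dist = float('inf')
closest_string = None
for s in diction:
    dist = jellyfish.levenshtein_distance(string, s)
    if dist < min_dist:
        min_dist = dist
        closest_string = s
if closest_string is not None:
    diction[closest_string] += 1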
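If installing jellyfish is not an option, or if whole tokens rather than single characters should be compared, the same idea can be sketched with the standard library alone. The helpers tokenize, levenshtein and normalized_distance below are hypothetical names, and dividing by the longer token count is an assumption made so that keys of different lengths can be compared.

```python
import re


def tokenize(s, sep=r'_'):
    """Split s on the regex sep and drop empty tokens."""
    return [t for t in re.split(sep, s) if t]


def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # delete x
                            curr[j - 1] + 1,           # insert y
                            prev[j - 1] + (x != y)))   # substitute or match
        prev = curr
    return prev[-1]


def normalized_distance(s, key, sep=r'_'):
    """Token-level edit distance divided by the longer token count."""
    a, b = tokenize(s, sep), tokenize(key, sep)
    return levenshtein(a, b) / max(len(a), len(b), 1)


diction = {'A_B_D_E_F': 0, 'B_C_E': 0, 'A_D_E': 0}
string = 'A_E_B_F'

closest = min(diction, key=lambda k: normalized_distance(string, k))
diction[closest] += 1
print(diction)
```

For the longer example, where event names contain single underscores and are separated by two or three of them, passing sep=r'_{2,}' to normalized_distance might be a better fit, since it keeps each event name as one token.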