Merge two lists based on common parts of a string [closed]

Time:10-07

Suppose I have the following two lists:

list1=["Equipment ONLY - Bees Technologies","Bees Technologies","Chris Metal SA - Central Office","NSA Aerospace tech"]

list2=["Bees Tech, Inc.","Chris Metal, SA","NSA Arerospace"]

How can I merge the two lists to get the following?

final_list=["Equipment ONLY - Bees Technologies","Bees Technologies", "Bees Tech, Inc.", "Chris Metal SA - Central Office", "Chris Metal, SA","NSA Aerospace tech", "NSA Arerospace"]

CodePudding user response:

I think sorting the list is what you're looking for:

list1=["Equipment ONLY - Bees Technologies","Bees Technologies","Chris Metal SA - Central Office","NSA Aerospace tech"]
list2=["Bees Tech, Inc.","Chris Metal, SA","NSA Arerospace"]

list1.extend(list2)
list1.sort()

print(list1) # Result : ['Bees Tech, Inc.', 'Bees Technologies', 'Chris Metal SA - Central Office', 'Chris Metal, SA', 'Equipment ONLY - Bees Technologies', 'NSA Aerospace tech', 'NSA Arerospace']

EDIT

If you only want to merge them, it can be done with:

list1.extend(list2)

or with:

list1 = list1 + list2
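The two forms above differ in one way that can matter: extend changes the existing list in place, while + builds a new list. A minimal sketch with shortened data (the name alias is only for illustration):

```python
list1 = ["Bees Technologies", "NSA Aerospace tech"]
list2 = ["Bees Tech, Inc."]

alias = list1              # a second name for the same list object

combined = list1 + list2   # builds a new list; list1 itself is unchanged
print(len(list1))          # still 2 at this point

list1.extend(list2)        # mutates list1 in place, so alias sees the change too
print(alias is list1)      # True: both names still refer to the same object
print(combined is list1)   # False: combined is a separate list
```

If other code holds a reference to list1, extend will change what that code sees; the + form will not.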

CodePudding user response:

Just concatenate the lists and use sorted:

list1=["Equipment ONLY - Bees Technologies","Bees Technologies","Chris Metal SA - Central Office","NSA Aerospace tech"]

list2=["Bees Tech, Inc.","Chris Metal, SA","NSA Arerospace"]

final_list = list1 + list2

print(sorted(final_list))

This gives:

['Bees Tech, Inc.', 'Bees Technologies', 'Chris Metal SA - Central Office', 'Chris Metal, SA', 'Equipment ONLY - Bees Technologies', 'NSA Aerospace tech', 'NSA Arerospace']

CodePudding user response:

Try this:

final_list = sorted(list1 + list2)

CodePudding user response:

It's not entirely clear what you are looking for. If you want to group "matching" items from list1 and list2 together, you could find the best match for each item using difflib.SequenceMatcher. Assuming the (unique) elements in list2 are the prototypes that the elements of list1 should be grouped under:

list1=["Equipment ONLY - Bees Technologies","Bees Technologies","Chris Metal SA - Central Office","NSA Aerospace tech"]
list2=["Bees Tech, Inc.","Chris Metal, SA","NSA Arerospace"]

import difflib
groups = {a: [] for a in list2}
for b in list1:
    a = max(list2, key=lambda a: difflib.SequenceMatcher(a=a, b=b).ratio())
    groups[a].append(b)
# {'Bees Tech, Inc.': ['Equipment ONLY - Bees Technologies', 'Bees Technologies'],
#  'Chris Metal, SA': ['Chris Metal SA - Central Office'],
#  'NSA Arerospace': ['NSA Aerospace tech']}

res = [x for a in groups for x in (*groups[a], a)]
# ['Equipment ONLY - Bees Technologies', 'Bees Technologies', 'Bees Tech, Inc.', 'Chris Metal SA - Central Office', 'Chris Metal, SA', 'NSA Aerospace tech', 'NSA Arerospace']
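The same idea could be wrapped in a reusable function. This is only a sketch: the name merge_by_similarity is made up here, and it assumes the prototypes are unique, since they are used as dictionary keys.

```python
import difflib


def merge_by_similarity(items, prototypes):
    """Place each item directly before the prototype it resembles most.

    Hypothetical helper; assumes the strings in `prototypes` are unique.
    """
    # One bucket per prototype; dicts keep insertion order (Python 3.7+).
    groups = {p: [] for p in prototypes}
    for item in items:
        # Pick the prototype with the highest SequenceMatcher similarity ratio.
        best = max(
            prototypes,
            key=lambda p: difflib.SequenceMatcher(a=p, b=item).ratio(),
        )
        groups[best].append(item)

    # Flatten: the members of each group first, then the prototype itself.
    merged = []
    for prototype, members in groups.items():
        merged.extend(members)
        merged.append(prototype)
    return merged


list1 = ["Equipment ONLY - Bees Technologies", "Bees Technologies",
         "Chris Metal SA - Central Office", "NSA Aerospace tech"]
list2 = ["Bees Tech, Inc.", "Chris Metal, SA", "NSA Arerospace"]

final_list = merge_by_similarity(list1, list2)
print(final_list)
```

Note that every item is assigned to some prototype, even when the best ratio is low, so unrelated strings would still end up in a group.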