Check if list entry exists in other list

Time:01-25

I want to check whether an entry from one list occurs inside any string of another list. If it does, I want to delete the match from that string. Here is an example:

list1 = ["MyFirm Geographical Enablement Framework", "MyFirm Multiresource Scheduling",
         "MyFirm SuccessFactors Recruiting"]
list2 = ["Install MyFirm SuccessFactors Recruiting in the right way",
         "Get MyFirm Geographical Enablement Framework to work"]

From this I want to create a new list with all the words from list2, except the words that also appear in list1. Here is an example of the expected output:

new_list = ["Install in the right way", "Get to work"]

My first approach would be like this, but it doesn't work:

new_list = []

for element in list1:
    if element.split() in list2:
        result = list2.remove(element)
        new_list.append(result)

Does anyone have a solution to this?

CodePudding user response:

Create a set of words to exclude from the concatenation of all words in list1. Then filter words of sentences in list2 using that set:

excluded = set(" ".join(list1).split())
new_list = [" ".join(word for word in sentence.split() if word not in excluded)
                for sentence in list2]

print(new_list)  # -> ['Install in the right way', 'Get to work']

CodePudding user response:

Split each string to get its list of words, then re-join the words of each string in list2 while filtering out those that appear in the combined word set from list1:

>>> [' '.join(w for w in s.split() if w not in {w for s in list1 for w in s.split()}) for s in list2]
['Install in the right way', 'Get to work']
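One thing to note about the one-liner above: the set comprehension sits inside the filter condition, so it is rebuilt for every word that is tested. A minimal sketch that builds the set once is shown here; the helper name remove_words and the variable excluded_words are made up for illustration and do not come from the answer.

```python
list1 = ["MyFirm Geographical Enablement Framework", "MyFirm Multiresource Scheduling",
         "MyFirm SuccessFactors Recruiting"]
list2 = ["Install MyFirm SuccessFactors Recruiting in the right way",
         "Get MyFirm Geographical Enablement Framework to work"]


def remove_words(sentences, phrases):
    """Drop every word of `sentences` that occurs anywhere in `phrases`."""
    # Build the exclusion set a single time, outside the filtering loop.
    excluded_words = {w for p in phrases for w in p.split()}
    return [' '.join(w for w in s.split() if w not in excluded_words)
            for s in sentences]


new_list = remove_words(list2, list1)
print(new_list)  # ['Install in the right way', 'Get to work']
```

The result is the same as before; only the amount of repeated work changes, which may matter for long lists.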

CodePudding user response:

You need to check, for each item i of list2, whether any item j of list1 is contained in it. If so, find the index where j starts, add the length of j to get the point where it ends, append everything before and after the match to new_list, and then break out of the inner loop.

This is what you need:

new_list = []
for i in list2:
    for j in list1:
        if j in i:
            new_list.append(i[:i.index(j)] + i[i.index(j) + len(j):])
            break

print(new_list)

# ['Install  in the right way', 'Get  to work']

Notice the extra space in between? To get your desired result, strip the surrounding spaces from both parts and add a single space when concatenating. Change the append line to this:

new_list.append(i[:i.index(j)].rstrip() + " " + i[i.index(j) + len(j):].lstrip())
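Putting the pieces of this answer together, here is a minimal sketch of the whole loop. Two details are my own assumptions rather than part of the answer: a for/else clause keeps sentences that contain none of the phrases (the loop as posted would silently drop them), and the two parts are joined only when non-empty so a phrase at the very start or end leaves no stray space. The third sentence in list2 is an extra, made-up entry to exercise the no-match case.

```python
list1 = ["MyFirm Geographical Enablement Framework", "MyFirm Multiresource Scheduling",
         "MyFirm SuccessFactors Recruiting"]
list2 = ["Install MyFirm SuccessFactors Recruiting in the right way",
         "Get MyFirm Geographical Enablement Framework to work",
         "Nothing to remove here"]  # illustrative extra entry, not in the question

new_list = []
for sentence in list2:
    for phrase in list1:
        if phrase in sentence:
            start = sentence.index(phrase)
            before = sentence[:start].rstrip()
            after = sentence[start + len(phrase):].lstrip()
            # Join only the non-empty parts with a single space.
            new_list.append(' '.join(part for part in (before, after) if part))
            break
    else:
        # No phrase from list1 matched: keep the sentence unchanged.
        new_list.append(sentence)

print(new_list)
# ['Install in the right way', 'Get to work', 'Nothing to remove here']
```

Like the original loop, this removes only the first matching phrase per sentence because of the break.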

CodePudding user response:

Another way:

new_list = []
for it in list2:
    for r_st in list1:
        it = it.replace(r_st, '').strip()
    new_list.append(it.replace('  ', ' '))

print(new_list)

# ['Install in the right way', 'Get to work']

CodePudding user response:

I would first create a lookup list from the first list, and then iterate over the words in the second list to eliminate any word that occurs in the lookup list.

lookup = ' '.join(list1).split()
new_list = []
for sentence in list2:
    sentence = ' '.join([word for word in sentence.split() if word not in lookup])
    new_list.append(sentence)

print(new_list) 
# ['Install in the right way', 'Get to work']

CodePudding user response:

I think using sets is a great option (only if you are okay with unordered results):

list1 = ["MyFirm Geographical Enablement Framework", "MyFirm Multiresource Scheduling", "MyFirm SuccessFactors Recruiting"]
list2 = ["Install MyFirm SuccessFactors Recruiting in the right way", "Get MyFirm Geographical Enablement Framework to work"]

res = []

full_list1 = ' '.join(list1) # String
full_set1 = set(full_list1.split(' '))

for elem2 in list2:
    set2 = set(elem2.split(' '))
    tmp_diff = set2.difference(full_set1)

    res.append(' '.join(list(tmp_diff)))

res will then hold your result as a list, with the remaining words of each sentence in arbitrary order.
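Since sets have no defined order, the words joined in the loop above can come out shuffled, and repeated words collapse into one. If the original word order matters, one possible variation (a sketch only; the names ordered_res, words and remaining are my own) keeps the set difference for deciding what survives but walks the original word list when joining:

```python
list1 = ["MyFirm Geographical Enablement Framework", "MyFirm Multiresource Scheduling",
         "MyFirm SuccessFactors Recruiting"]
list2 = ["Install MyFirm SuccessFactors Recruiting in the right way",
         "Get MyFirm Geographical Enablement Framework to work"]

full_set1 = set(' '.join(list1).split())

ordered_res = []
for elem2 in list2:
    words = elem2.split()
    # The set difference decides which words survive ...
    remaining = set(words).difference(full_set1)
    # ... while iterating over `words` preserves their original order.
    ordered_res.append(' '.join(w for w in words if w in remaining))

print(ordered_res)  # ['Install in the right way', 'Get to work']
```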

CodePudding user response:

I'll round out the answers with a regex solution: build a pattern from the restricted phrases by joining them with '|', then substitute '' for every match in all the strings.

import re
a = ["MyFirm Geographical Enablement Framework",
     "MyFirm Multiresource Scheduling", "MyFirm SuccessFactors Recruiting"]
b = ["Install MyFirm SuccessFactors Recruiting in the right way",
     "Get MyFirm Geographical Enablement Framework to work"]

pattern = '|'.join('({})'.format(s) for s in a)
#print(pattern)
q = [re.sub(pattern,'',s) for s in b]
#print(q)
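Two caveats about the regex approach above: phrases containing regex metacharacters (such as '.' or '+') would be interpreted as pattern syntax, and removing a phrase from the middle of a string leaves two adjacent spaces. A hedged sketch that addresses both, using re.escape and a split/join pass to normalise whitespace (the name cleaned is mine):

```python
import re

a = ["MyFirm Geographical Enablement Framework",
     "MyFirm Multiresource Scheduling", "MyFirm SuccessFactors Recruiting"]
b = ["Install MyFirm SuccessFactors Recruiting in the right way",
     "Get MyFirm Geographical Enablement Framework to work"]

# Escape each phrase so it is matched literally, then join with '|'.
pattern = '|'.join(re.escape(s) for s in a)

# Remove the phrases, then collapse runs of whitespace via split/join.
cleaned = [' '.join(re.sub(pattern, '', s).split()) for s in b]
print(cleaned)  # ['Install in the right way', 'Get to work']
```

If one phrase is a prefix of another, sorting the phrases by length (longest first) before joining would be a reasonable extra step, since alternation tries the alternatives from left to right.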