Compare all the lists contained in a nested list to get only the strings that match

Time:10-21

I want to compare all the lists contained in a nested list and end up with the groups of words they have in common.

I can already do this with two separate lists, getting the strings that match between each pair of sublists, like this:

listA = [['Test1','Test2','Test3'], ['Test1','Test4','Test2']]
listB = [['Test1','Test2','Test5'], ['Test10','Test4','Test2']]

The result I get:

['Test1', 'Test2'] # Matched at [('Test1','Test2'),'Test3'] -> [('Test1','Test2'),'Test5']

['Test2'] # Matched at ['Test1','Test4',('Test2')] -> ['Test1',('Test2'),'Test5']

['Test4', 'Test2'] # Matched at ['Test1',('Test4','Test2')] -> ['Test10',('Test4','Test2')]

Note that in this example 'Test3', 'Test5' and 'Test10' are not in the result, because none of them matches anything in the other lists.

I would like to do it with a single nested list.

list = [['Test1','Test2','Test3'], ['Test1','Test4','Test2'], ['Test1','Test2','Test5'], ['Test5','Test4','Test2']]

Here is the code I use with two lists:

from collections import Counter

for x in listB:
    for y in listA:
        # Adding the counters gives a count of 2 to every item present in both lists
        counterA = Counter(x)
        counterB = Counter(y)
        count = counterA + counterB

        prompt_match = [k for k in count if count[k] == 2]
        print(prompt_match)

The code is not perfect with the 2 lists because I get duplicates.
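
The duplicates come from different pairs of sublists producing the same match. A minimal sketch of one way to filter them, keeping the Counter approach from the question: the helper name `counter_matches` and the `seen_keys` / `unique_matches` variables are hypothetical, and the sketch assumes no string is repeated inside a single sublist (otherwise a count of 2 would not mean "present in both").

```python
from collections import Counter

listA = [['Test1', 'Test2', 'Test3'], ['Test1', 'Test4', 'Test2']]
listB = [['Test1', 'Test2', 'Test5'], ['Test10', 'Test4', 'Test2']]

def counter_matches(first, second):
    """Items whose combined count is 2, i.e. present once in each list."""
    count = Counter(first) + Counter(second)
    return [k for k in count if count[k] == 2]

seen_keys = set()       # frozensets of matches already reported
unique_matches = []     # matches in the order they were first found

for x in listB:
    for y in listA:
        match = counter_matches(x, y)
        key = frozenset(match)  # compare matches regardless of item order
        if match and key not in seen_keys:
            seen_keys.add(key)
            unique_matches.append(match)

print(unique_matches)
```

Empty matches are skipped as well, so pairs with nothing in common do not add an empty list to the result.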

CodePudding user response:

You could use set intersection inside a generator expression over every pair of sublists:

from itertools import product
listA = [['Test1','Test2','Test3'], ['Test1','Test4','Test2']]
listB = [['Test1','Test2','Test5'], ['Test10','Test4','Test2']]
set(tuple(set(x)&set(y)) for x,y  in product(listA, listB))

# output (element order may vary between runs)
{('Test1', 'Test2'), ('Test2',), ('Test4', 'Test2')}

For a single nested list, use combinations to compare every pair of sublists:
from itertools import combinations
listA = [['Test1','Test2','Test3'], ['Test1','Test4','Test2'], ['Test1','Test2','Test5'], ['Test10','Test4','Test2']]
set(tuple(set(x)&set(y)) for x,y  in combinations(listA, 2))
# output (element order may vary between runs)
{('Test1', 'Test2'), ('Test2',), ('Test4', 'Test2')}
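
Because sets are unordered, the tuples above can come out in a different order from run to run. If a stable order matters, the following is a minimal sketch of a variant that keeps each match in the order of the first sublist of the pair and skips empty overlaps. The function name `common_items` is hypothetical and not part of the answer above.

```python
from itertools import combinations

def common_items(nested):
    """Return the distinct, non-empty overlaps between every pair of sublists."""
    seen = set()    # frozensets of overlaps already collected
    result = []
    for first, second in combinations(nested, 2):
        second_set = set(second)
        # dict.fromkeys drops repeats inside `first` while keeping its order
        shared = [item for item in dict.fromkeys(first) if item in second_set]
        key = frozenset(shared)
        if shared and key not in seen:
            seen.add(key)
            result.append(shared)
    return result

nested = [['Test1', 'Test2', 'Test3'], ['Test1', 'Test4', 'Test2'],
          ['Test1', 'Test2', 'Test5'], ['Test10', 'Test4', 'Test2']]
print(common_items(nested))
```

Deduplicating on a frozenset means two overlaps with the same items in a different order count as one match.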