I am trying to measure to what extent listA, listB, listC, ... are similar to the original list. How do I print the number of elements in listA that occur in the same sequence as they occur in the original list?
original_list = ['I', 'live', 'in', 'space', 'with', 'my', 'dog']
listA = ['my', 'name', 'my', 'dog', 'is', 'two', 'years', 'old']
listB = ['how', 'where', 'I', 'live', 'in', 'space', 'with']
listC = ['I', 'live', 'to', 'the', 'in', 'space', 'with', 'my', 'football', 'my', 'dog']
Output:
listA: Count = 2 #'my', 'dog'
listB: Count = 5 #'I', 'live', 'in', 'space', 'with'
listC: Count = 2,4,2 #'I', 'live'
#'in', 'space', 'with', 'my'
#'my', 'dog'
CodePudding user response:
I wrote a function that I think does the job. It might be more complex than necessary, but I can't see an easier way at the moment:
original = ['I', 'live', 'in', 'space', 'with', 'my', 'dog']
listA = ['my', 'name', 'my', 'dog', 'is', 'two', 'years', 'old']
listB = ['how', 'where', 'I', 'live', 'in', 'space', 'with']
listC = ['I', 'live', 'to', 'the', 'in', 'space', 'with', 'my', 'football', 'my', 'dog']
def get_sequence_lengths(original_list, comparative_list):
    # Every contiguous slice of the original list with at least two elements.
    original_options = []
    for i in range(len(original_list)):
        for j in range(i + 1, len(original_list)):
            original_options.append(original_list[i:j + 1])

    # The same for the list being compared.
    comparative_options = []
    for i in range(len(comparative_list)):
        for j in range(i + 1, len(comparative_list)):
            comparative_options.append(comparative_list[i:j + 1])

    # Try the longest candidates first.
    comparative_options.sort(key=len, reverse=True)

    matches = []
    while comparative_options:
        for option in comparative_options:
            if option in original_options:
                matches.append(option)
                # Drop the match and the shorter candidates it already covers.
                new_comparative_options = comparative_options.copy()
                for l in comparative_options:
                    counter = 0
                    for v in option:
                        counter = counter + 1 if v in l else 0
                        if counter == len(l):
                            new_comparative_options.remove(l)
                            break
                comparative_options = new_comparative_options
                break
        # Stop when nothing is left or a full pass found no match.
        if not comparative_options or option == comparative_options[-1]:
            break

    # Report the matches in the order they appear in the original list.
    matches = [option for option in original_options if option in matches]
    lengths = [len(option) for option in matches]
    print(lengths)
    print(matches)
    return lengths
If you call it with the original list and example lists, it prints the following.
get_sequence_lengths(original, listA) prints [2] and [['my', 'dog']].
get_sequence_lengths(original, listB) prints [5] and [['I', 'live', 'in', 'space', 'with']].
get_sequence_lengths(original, listC) prints [2, 4, 2] and [['I', 'live'], ['in', 'space', 'with', 'my'], ['my', 'dog']].
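The first step of the function above builds every contiguous slice of at least two elements from a list. As a minimal sketch of just that step (the helper name contiguous_slices is hypothetical and not part of the answer's code), it might look like this:

```python
def contiguous_slices(words):
    """Return every contiguous slice of `words` with at least two elements."""
    slices = []
    for i in range(len(words)):
        # j is the index of the last element in the slice, so it starts at i + 1
        for j in range(i + 1, len(words)):
            slices.append(words[i:j + 1])
    return slices


print(contiguous_slices(['a', 'b', 'c']))
# [['a', 'b'], ['a', 'b', 'c'], ['b', 'c']]
```

A list of n words yields n * (n - 1) / 2 such slices, so the seven-word original list produces 21 candidates.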
CodePudding user response:
Off the top of my head, I thought of using the in keyword.
Essentially, you could convert your original list into a string: list_str = ''.join(original_list)
Then iterate over the other lists, combining items from each list, such as items = listA[2] + listA[3]  # items = "mydog"
and check whether the combination occurs in the string: matched = items in list_str
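One way this idea could be turned into a working function is sketched below. It is only an illustration under a few assumptions: the name run_lengths, the min_len parameter, and the left-to-right greedy scan are my own choices, and the words are joined with a separator character (assumed never to occur inside a word) rather than with an empty string, so that 'myd' + 'og' is not mistaken for 'my' + 'dog'.

```python
SEP = "\x00"  # assumption: this character never appears inside a word


def run_lengths(original_list, other_list, min_len=2):
    """Lengths of runs in other_list that occur contiguously in original_list."""
    # Wrap with separators so only whole words can match.
    haystack = SEP + SEP.join(original_list) + SEP
    lengths = []
    i = 0
    while i < len(other_list):
        best = 0
        j = i + 1
        # Grow the window while the joined words still occur in the original.
        while j <= len(other_list):
            needle = SEP + SEP.join(other_list[i:j]) + SEP
            if needle not in haystack:
                break
            best = j - i
            j += 1
        if best >= min_len:
            lengths.append(best)
            i += best  # continue after the run that was just counted
        else:
            i += 1
    return lengths


original_list = ['I', 'live', 'in', 'space', 'with', 'my', 'dog']
listA = ['my', 'name', 'my', 'dog', 'is', 'two', 'years', 'old']
listB = ['how', 'where', 'I', 'live', 'in', 'space', 'with']
listC = ['I', 'live', 'to', 'the', 'in', 'space', 'with', 'my', 'football', 'my', 'dog']

print(run_lengths(original_list, listA))  # [2]
print(run_lengths(original_list, listB))  # [5]
print(run_lengths(original_list, listC))  # [2, 4, 2]
```

Because the scan is greedy from left to right, it may split runs differently from a longest-first search on inputs where candidate runs overlap.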
I am unaware of other solutions but I'm sure given enough time you would be able to find something better, or someone else would know of something.