How to compare two lists of strings whose elements partially match

Time:11-05

I have these two lists:

list_y= ['aaa/bbb/ccc/18_12_13_y_n', 'aaa/bbb/ccc/11_14_13_y_n', 'aaa/bbb/ccc/11_12_14_y_n', 'aaa/bbb/ccc/11_12_16_y_n', 'aaa/bbb/ccc/14_12_13_y_n', 'aaa/bbb/ccc/11_17_13_y_n', 'aaa/bbb/ccc/11_12_19_y_n', 'aaa/bbb/ccc/11_12_13_y_n', 'aaa/bbb/ccc/11_12_17_y_n', 'aaa/bbb/ccc/11_12_18_y_n', 'aaa/bbb/ccc/15_12_13_y_n', 'aaa/bbb/ccc/12_12_13_y_n', 'aaa/bbb/ccc/11_16_13_y_n', 'aaa/bbb/ccc/16_12_13_y_n', 'aaa/bbb/ccc/11_12_15_y_n', 'aaa/bbb/ccc/17_12_13_y_n', 'aaa/bbb/ccc/13_12_13_y_n', 'aaa/bbb/ccc/11_13_13_y_n', 'aaa/bbb/ccc/18_12_13_y_n', 'aaa/bbb/ccc/11_15_13_y_n']

list_x= ['aaa/bbb/ccc/11_12_13_x_n', 'aaa/bbb/ccc/11_13_13_x_n', 'aaa/bbb/ccc/11_14_13_x_n', 'aaa/bbb/ccc/11_17_13_x_n', 'aaa/bbb/ccc/14_12_13_x_n', 'aaa/bbb/ccc/11_12_19_x_n', 'aaa/bbb/ccc/12_12_13_x_n', 'aaa/bbb/ccc/11_12_14_x_n', 'aaa/bbb/ccc/11_16_13_x_n', 'aaa/bbb/ccc/11_12_18_x_n', 'aaa/bbb/ccc/17_12_13_x_n', 'aaa/bbb/ccc/11_12_15_x_n', 'aaa/bbb/ccc/11_12_17_x_n', 'aaa/bbb/ccc/11_15_13_x_n', 'aaa/bbb/ccc/11_12_16_x_n', 'aaa/bbb/ccc/16_12_13_x_n', 'aaa/bbb/ccc/18_12_13_x_n', 'aaa/bbb/ccc/15_12_13_x_n', 'aaa/bbb/ccc/13_12_13_x_n', 'aaa/bbb/ccc/18_12_13_x_n']

I want to check whether they contain the same strings (ignoring the x or y letter), and then whether each string in list_x is in the same position as its corresponding string in list_y.

This is what I've tried:

list_a.sort()
list_b.sort()
list_a[0][0:21] == list_b[0][0:21]

This returns True, because I'm comparing the first 21 characters of each string, and that's fine. The problem is that this only checks the first string in each list. How can I do it for the whole lists?

To summarize my doubts:

  • How can I compare two lists of strings that only partially match?
  • I use list_a[0][0:21] == list_b[0][0:21] to compare the first 21 characters of the strings, but is there a way to 'exclude' the x and y and compare the whole string?

Thank you.

CodePudding user response:

Use the zip function and a for loop:

for a_string, b_string in zip(list_a, list_b):
    if a_string[0:21] == b_string[0:21]:
        print("a_string and b_string are partially identical")
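Note that zip stops silently at the end of the shorter list, and the loop above does not say which positions matched. A minimal sketch that reports the matching indices, assuming both lists are meant to have the same length; the function name matching_positions and the prefix_len parameter are made up for this illustration:

```python
def matching_positions(list_a, list_b, prefix_len=21):
    """Return the indices where both strings share the same first prefix_len characters."""
    # Assumption: a length mismatch is an error rather than something to ignore.
    if len(list_a) != len(list_b):
        raise ValueError("lists differ in length")
    return [
        i
        for i, (a_string, b_string) in enumerate(zip(list_a, list_b))
        if a_string[:prefix_len] == b_string[:prefix_len]
    ]


list_a = ['aaa/bbb/ccc/11_12_13_x_n', 'aaa/bbb/ccc/11_13_13_x_n']
list_b = ['aaa/bbb/ccc/11_12_13_y_n', 'aaa/bbb/ccc/11_14_13_y_n']
print(matching_positions(list_a, list_b))  # [0]
```

If the returned list has as many entries as the input lists, every string is in the same position as its counterpart.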

CodePudding user response:

This tests both conditions and prints a message for each one that holds:

list_x.sort()
list_y.sort()

list_x_mod = []
list_y_mod = []

for i in list_x:
    list_x_mod.append(i[0:21])

for i in list_y:
    list_y_mod.append(i[0:21])

index = 0
while index < len(list_x_mod):
    if list_x_mod[index] in list_y_mod:
        print("Item", index, "in list_x exists in list_y")
    if list_x_mod[index] == list_y_mod[index]:
        print("Item", index, "is in the same position in both lists")
    index += 1

The first check succeeds even if the matching items are not in the same position.
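If the only question is whether both lists hold the same strings regardless of order, the truncated strings can also be compared as multisets. A rough sketch using collections.Counter, which keeps track of duplicates; the helper name same_contents and the small sample lists are only for illustration:

```python
from collections import Counter


def same_contents(list_x, list_y, prefix_len=21):
    """True if both lists hold the same truncated strings, ignoring order."""
    # Counter compares how often each prefix occurs, so duplicates count too.
    counts_x = Counter(s[:prefix_len] for s in list_x)
    counts_y = Counter(s[:prefix_len] for s in list_y)
    return counts_x == counts_y


sample_x = ['aaa/bbb/ccc/11_12_13_x_n', 'aaa/bbb/ccc/18_12_13_x_n']
sample_y = ['aaa/bbb/ccc/18_12_13_y_n', 'aaa/bbb/ccc/11_12_13_y_n']
print(same_contents(sample_x, sample_y))      # True
print(same_contents(sample_x, sample_y[:1]))  # False
```

Unlike the sort-based approach, this leaves the original lists untouched.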

CodePudding user response:

We assume that both lists have the same length (len(list_x) == len(list_y)) and that any sorting you want has already been done.

Variant 1: compare the strings while skipping the character at index 21

length_x = len(list_x)
for i in range(length_x):
    # Compare everything before index 21 and everything after it
    if list_x[i][0:21] == list_y[i][0:21] and list_x[i][22:] == list_y[i][22:]:
        print('Strings are equal at pos: %d' % i)  # positions are counted from 0

Variant 2: compare the strings with the x and y characters removed. A regular expression removes every x and y from the whole string

import re
length_x = len(list_x)

for i in range(length_x):
    # Strip every x and y before comparing
    if re.sub('[xy]', '', list_x[i]) == re.sub('[xy]', '', list_y[i]):
        print('Strings are equal at pos: %d' % i)  # positions are counted from 0

Results for the lists as posted (unsorted):

Strings are equal at pos: 4
Strings are equal at pos: 9
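One caveat with Variant 2: the pattern [xy] removes every x and y, including any that appear in the directory part of the path. A possible refinement, assuming the marker is always the trailing _x_n or _y_n as in the posted data, is to anchor the pattern at the end of the string. The helper name strip_marker and the sample paths are made up for this sketch:

```python
import re

# Assumption: the marker is always a single x or y in the trailing "_x_n" / "_y_n".
MARKER = re.compile(r'_[xy]_n$')


def strip_marker(path):
    """Drop the x/y marker from the end of path, leaving the rest untouched."""
    return MARKER.sub('_n', path)


list_x = ['aaa/xyz/ccc/11_12_13_x_n', 'aaa/xyz/ccc/11_13_13_x_n']
list_y = ['aaa/xyz/ccc/11_12_13_y_n', 'aaa/xyz/ccc/11_14_13_y_n']

for i, (x_string, y_string) in enumerate(zip(list_x, list_y)):
    if strip_marker(x_string) == strip_marker(y_string):
        print('Strings are equal at pos: %d' % i)  # prints only for pos 0
```

This also avoids hard-coding index 21, so it keeps working if the directory part changes length.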