find matching substring in a list with multiple condition

Time:12-28

I have multiple lists.

item1 = ["4bff652c-a589-4cb0-b28f-0745e199ae88-ppp.json",
"40e10f09-9d53-4891-a4d4-d2885e5492af-vvv.json",
"065aa522-a458-44d6-9894-7e928d422c35-a.json",
"5ba3fcb2-8fae-4847-a631-9d57acb6885c.json"]


item2 = ["fa28f1ba-5532-4ff8-945d-70f5b57a7733-ppp.json",
"ee65f5b5-1333-47f3-8eca-49b63fa35a62-a.json",
"9bc518d8-84b4-4032-9ed8-4bb78559a9a0.json",
"a20bc0c3-ff61-4df5-90c5-695c7614222e-b.json"]

item3 = ["6e1cb404-9494-4e2d-a4c7-16c62bf440ce-vvv.json",
"a3b3e94c-fe69-4304-8129-2137a6407479-a.json"]

I want to find out whether any of the lists above contains both an item ending with -ppp.json and an item ending with -vvv.json.

From the above example, the correct answer is item1.

I have tried:

for i in item1:
    if i.endswith("-ppp.json") and i.endswith("-vvv.json"):
        print(i)
        # do some operation

But the above code is not working.

thanks in advance

CodePudding user response:

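This is a straightforward solution. A single string cannot end with both -ppp.json and -vvv.json, so the condition in your loop is never true. Instead, first check whether the list contains an item with each ending. Then, if both are present, do your operation.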

item1 = ["4bff652c-a589-4cb0-b28f-0745e199ae88-ppp.json",
"40e10f09-9d53-4891-a4d4-d2885e5492af-vvv.json",
"065aa522-a458-44d6-9894-7e928d422c35-a.json",
"5ba3fcb2-8fae-4847-a631-9d57acb6885c.json"]


item1_contains_ppp = False
item1_contains_vvv = False

for i in item1:
    if i.endswith("-ppp.json"):
        item1_contains_ppp = True
        
    if i.endswith("-vvv.json"):
        item1_contains_vvv = True
        
if item1_contains_vvv and item1_contains_ppp:
    ...
    #do some operation
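The same check can be written more compactly with any() and all(), and run over all three lists from the question. This is only a sketch: the helper name has_both_endings and the lists dictionary are illustrative names, not part of the original answer.

```python
def has_both_endings(items, suffixes=("-ppp.json", "-vvv.json")):
    # For every suffix, at least one item in the list must end with it.
    return all(any(item.endswith(s) for item in items) for s in suffixes)

# The three lists from the question, keyed by name (the dict is an assumption).
lists = {
    "item1": ["4bff652c-a589-4cb0-b28f-0745e199ae88-ppp.json",
              "40e10f09-9d53-4891-a4d4-d2885e5492af-vvv.json",
              "065aa522-a458-44d6-9894-7e928d422c35-a.json",
              "5ba3fcb2-8fae-4847-a631-9d57acb6885c.json"],
    "item2": ["fa28f1ba-5532-4ff8-945d-70f5b57a7733-ppp.json",
              "ee65f5b5-1333-47f3-8eca-49b63fa35a62-a.json",
              "9bc518d8-84b4-4032-9ed8-4bb78559a9a0.json",
              "a20bc0c3-ff61-4df5-90c5-695c7614222e-b.json"],
    "item3": ["6e1cb404-9494-4e2d-a4c7-16c62bf440ce-vvv.json",
              "a3b3e94c-fe69-4304-8129-2137a6407479-a.json"],
}

matching = [name for name, items in lists.items() if has_both_endings(items)]
print(matching)  # ['item1']
```

Note that i.endswith(("-ppp.json", "-vvv.json")) with a tuple means "ends with either one", so it cannot replace the per-suffix check here.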

CodePudding user response:

For each list, you may try:

import re

ppp = 0
vvv = 0

for item in lst:
    if re.search(r'-ppp\.json$', item):
        ppp = 1
    if re.search(r'-vvv\.json$', item):
        vvv = 1

if ppp + vvv == 2:
    print("List matches")
else:
    print("List does not match")

This approach uses two separate variables to keep track of whether a list entry has been seen ending in either -ppp.json or -vvv.json. If both have been seen, the list is reported as a pass, otherwise it is reported as a failure.
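As a variation on the same idea, one regular expression with a capture group can record which endings were seen in a set. This is a sketch under assumptions: the names SUFFIX and seen_suffixes and the sample data are made up for illustration.

```python
import re

# One pattern for both endings; group 1 captures which one matched.
SUFFIX = re.compile(r"-(ppp|vvv)\.json$")

def seen_suffixes(lst):
    found = set()
    for item in lst:
        m = SUFFIX.search(item)
        if m:
            found.add(m.group(1))
    return found

lst = ["x-ppp.json", "y-vvv.json", "z-a.json"]  # made-up sample data
if seen_suffixes(lst) == {"ppp", "vvv"}:
    print("List matches")
else:
    print("List does not match")
```

Comparing the set against {"ppp", "vvv"} replaces the two counters, and adding another required ending only means extending the pattern and the expected set.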

CodePudding user response:

An approach like the following would work:

import re

def check_endings(items):
    return all(any(l) for l in zip(*[(bool(re.match(r".*-ppp\.json$", i)), bool(re.match(r".*-vvv\.json$", i))) for i in items]))

print(check_endings(item1))
print(check_endings(item2))
print(check_endings(item3))

OUTPUT

True
False
False

In essence:

  • You go through each element of the list with a list comprehension.
  • This creates a list of tuples, each holding two booleans. For example, for item1 you get
[(True, False), (False, True), (False, False), (False, False)]

given that the first element ends with -ppp.json and the second with -vvv.json.

  • Using zip(*...) you transpose the list of tuples, getting
[(True, False, False, False), (False, True, False, False)]

The first tuple tells you which items end with -ppp.json, and the second which items end with -vvv.json.

  • With any you check whether there is at least one True in each tuple; finally, with all you verify that both tuples contain a True value.
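The steps above can be followed one at a time by printing the intermediate values for item1. The variable names flags, columns and per_suffix are only illustrative.

```python
import re

item1 = ["4bff652c-a589-4cb0-b28f-0745e199ae88-ppp.json",
         "40e10f09-9d53-4891-a4d4-d2885e5492af-vvv.json",
         "065aa522-a458-44d6-9894-7e928d422c35-a.json",
         "5ba3fcb2-8fae-4847-a631-9d57acb6885c.json"]

# Step 1: one (ends_with_ppp, ends_with_vvv) tuple per item.
flags = [(bool(re.match(r".*-ppp\.json$", i)), bool(re.match(r".*-vvv\.json$", i)))
         for i in item1]
print(flags)
# [(True, False), (False, True), (False, False), (False, False)]

# Step 2: transpose, giving one tuple per ending.
columns = list(zip(*flags))
print(columns)
# [(True, False, False, False), (False, True, False, False)]

# Step 3: any() per ending, then all() across both endings.
per_suffix = [any(col) for col in columns]
print(per_suffix)       # [True, True]
print(all(per_suffix))  # True
```

One caveat: for an empty list, zip(*[]) yields no tuples at all, so all() returns True. If empty lists can occur, they probably need a separate check.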

Note that the bool cast in the list comprehension is not necessary (a Match object is truthy, and None is falsy), but I used it to simplify the explanation:

def check_endings(items):
    return all(any(l) for l in zip(*[(re.match(r".*-ppp\.json$", i), re.match(r".*-vvv\.json$", i)) for i in items]))

In addition, you may not need to check all the elements to conclude that a list satisfies your conditions: for example, you know that item1 is good after just 2 elements. In that case:

def check_endings(items):
    res = [False, False]
    for item in items:
        if re.match(r".*-ppp\.json$", item):
            res[0] = True
        elif re.match(r".*-vvv\.json$", item):
            res[1] = True
        if all(res):
            return True
    return False

This may make a difference for very large lists. Note that the if all(res) check is inside the loop, so the function returns as soon as both endings have been found.
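To see the early exit in action, here is a sketch of a variant that also reports how many items it examined. The function name check_endings_counted is hypothetical, and it uses str.endswith instead of re.match purely to keep the example short.

```python
def check_endings_counted(items):
    """Return (result, number of items examined before stopping)."""
    seen_ppp = seen_vvv = False
    for count, item in enumerate(items, start=1):
        seen_ppp = seen_ppp or item.endswith("-ppp.json")
        seen_vvv = seen_vvv or item.endswith("-vvv.json")
        if seen_ppp and seen_vvv:
            return True, count  # stop as soon as both endings were seen
    return False, len(items)

item1 = ["4bff652c-a589-4cb0-b28f-0745e199ae88-ppp.json",
         "40e10f09-9d53-4891-a4d4-d2885e5492af-vvv.json",
         "065aa522-a458-44d6-9894-7e928d422c35-a.json",
         "5ba3fcb2-8fae-4847-a631-9d57acb6885c.json"]
item3 = ["6e1cb404-9494-4e2d-a4c7-16c62bf440ce-vvv.json",
         "a3b3e94c-fe69-4304-8129-2137a6407479-a.json"]

print(check_endings_counted(item1))  # (True, 2)
print(check_endings_counted(item3))  # (False, 2)
```

For item1 the loop stops after the second element, while a list that does not match always has to be scanned to the end.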
