This question is the continuation of this post. I have the following list:
list_paths = ['imgs/foldeer/img_ABC_21389_1.tif.tif',
'imgs/foldeer/img_ABC_15431_10.tif.tif',
'imgs/foldeer/img_GHC_561321_2.tif.tif',
'imgs_foldeer/img_BCL_871125_21.tif.tif',
...]
I want to run a for loop that matches the string containing a specific number, where the number is the part between the third occurrence of "_" and ".tif.tif". For example, when the number is 1, the string to be matched is "imgs/foldeer/img_ABC_21389_1.tif.tif";
for number 2, the matched string is "imgs/foldeer/img_GHC_561321_2.tif.tif".
For that, I wanted to use a regex in a list comprehension. Based on this answer, I tested the following regex on Regex101:
number = 10
pattern = rf"^\S*?/(?:[^\s_/]+_){{3}}{number}\.tif\b[^\s/]*$"
indices = [x for x in data if re.search(pattern, x)]
But this doesn't match anything, and it also doesn't guarantee an exact number match: if number is 1, it might also select items with number 10.
My end goal is to match the items in the list that have the requested number between the third occurrence of "_" and the first occurrence of ".tif", using a regex. I am looking for help with the regex.
The output should be the whole path and not only the number.
Answer:
You can simplify your existing regex pattern a bit by matching the ending .tif.tif exactly:
import re
data=['imgs/foldeer/img_ABC_21389_1.tif.tif',
'imgs/foldeer/img_ABC_15431_10.tif.tif',
'imgs/foldeer/img_GHC_561321_2.tif.tif',
'imgs_foldeer/img_BCL_871125_21.tif.tif']
number = 2
pattern = rf"^\S*?/(?:[^\s_/]+_){{3}}{number}\.tif\.tif$"
print([x for x in data if re.search(pattern, x)])
Output:
['imgs/foldeer/img_GHC_561321_2.tif.tif']
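To see why this form only accepts the exact number, it may help to read the pattern piece by piece. The sketch below writes out an equivalent pattern with re.VERBOSE and inline comments; it assumes the character class is followed by the `+` quantifier, and the two sample paths are taken from the data list.

```python
import re

number = 2

# Equivalent pattern, annotated piece by piece (re.VERBOSE ignores the
# whitespace and the "#" comments inside the pattern string).
pattern = re.compile(
    rf"""
    ^\S*?/               # lazily skip the directory part, up to a "/"
    (?:[^\s_/]+_){{3}}   # exactly three "chunk_" groups, e.g. img_, GHC_, 561321_
    {number}             # the requested number, directly after the third "_"
    \.tif\.tif$          # literal ".tif.tif" right after it, then end of string
    """,
    re.VERBOSE,
)

print(bool(pattern.search('imgs/foldeer/img_GHC_561321_2.tif.tif')))   # True
print(bool(pattern.search('imgs_foldeer/img_BCL_871125_21.tif.tif')))  # False
```

Because the number is pinned between the third "_" and the literal ".tif.tif", a path ending in _21 cannot satisfy the pattern for number 2.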
number = 1
pattern = rf"^\S*?/(?:[^\s_/]+_){{3}}{number}\.tif\.tif$"
print([x for x in data if re.search(pattern, x)])
Output:
['imgs/foldeer/img_ABC_21389_1.tif.tif']
As you can see, when number is 1, only the path ending in _1 is matched, even though the data also contains a path ending in _10.
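Since the question mentions running this in a for loop, here is a minimal sketch that wraps the pattern in a helper and calls it for several numbers. The name `paths_for_number` is hypothetical (it does not come from the original post), and the use of re.escape is an extra precaution that only matters if the number is ever passed as an arbitrary string.

```python
import re

# Same sample data as in the answer.
data = [
    'imgs/foldeer/img_ABC_21389_1.tif.tif',
    'imgs/foldeer/img_ABC_15431_10.tif.tif',
    'imgs/foldeer/img_GHC_561321_2.tif.tif',
    'imgs_foldeer/img_BCL_871125_21.tif.tif',
]


def paths_for_number(paths, number):
    """Return the paths whose number after the third "_" equals `number`.

    Hypothetical helper for illustration, not part of the original answer.
    """
    # re.escape keeps the inserted value literal if it is ever a string
    # containing regex metacharacters; digits pass through unchanged.
    pattern = re.compile(
        rf"^\S*?/(?:[^\s_/]+_){{3}}{re.escape(str(number))}\.tif\.tif$"
    )
    return [p for p in paths if pattern.search(p)]


for number in (1, 2, 10, 21):
    print(number, paths_for_number(data, number))
```

Each call returns the whole matching path rather than just the number, and number 1 does not pick up the paths ending in _10 or _21.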