Programmatically identify a subset of a string

Time:10-15

I have some images in a folder that have a similar naming convention.

Example:

  • "Large_Blue_Ocean_Split_0_1.png"
  • "Large_Blue_Ocean_Split_0_2.png"
  • "Large_Blue_Ocean_Split_0_3.png"
  • "Large_Blue_Ocean_Split_1_1.png"
  • "Great_White_Shark_Split_0_1.png"
  • "Great_White_Shark_Split_0_2.png"
  • "Great_White_Shark_Split_0_3.png"

I loop through the folder looking for a given image, and I am trying to extract a substring of each file name (i.e., the image name) so that I end up with:

  • "Split_0_1.png"
  • "Split_0_2.png"
  • "Split_0_3.png"
  • "Split_1_1.png"

if the image is "Large_Blue_Ocean", and then put them all in a list.

I tried doing this manually, e.g. "Large_Blue_Ocean_Split_0_1.png"[-13:], and it works, but I think it would be better practice to do this without "magic numbers" (i.e., 13). My code is below:

from pathlib import Path

directory_in_str = "images/"
image_name = "Large_Blue_Ocean"
image_list = []
pathlist = Path(directory_in_str).glob(f'{image_name}*')
for path in pathlist:
    path_in_str = str(path)
    print(path_in_str)
    # Keep the last 13 characters, e.g. "Split_0_1.png"
    image_list.append(path_in_str[-13:])

Any help is much appreciated, thank you!

Also, the part of the image name I'm interested in always starts with either "Split..." or "split...", if that helps.

CodePudding user response:

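Try a regular expression that matches from "Split" or "split" to the end of the name (path_list is any iterable of paths, such as the result of glob):

import re
image_list = [re.findall("(?:split|Split).*", str(path))[0] for path in path_list]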

Output (for the seven example file names above):

['Split_0_1.png',
 'Split_0_2.png',
 'Split_0_3.png',
 'Split_1_1.png',
 'Split_0_1.png',
 'Split_0_2.png',
 'Split_0_3.png']
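
A slightly more defensive variant, offered as a sketch rather than a definitive solution: it matches against path.name instead of the full path (so a directory whose name contains "split" cannot interfere), uses re.search with re.IGNORECASE (which also accepts spellings such as "SPLIT", a broader match than the pattern above), and skips files without a match instead of raising IndexError. The helper names split_suffix and strip_prefix are made up for illustration; strip_prefix shows a regex-free alternative that derives the slice position from len(image_name) instead of a hard-coded 13, assuming the image name is always followed by an underscore.

```python
import re
from pathlib import Path

# Hypothetical helper names; not part of the original question or answer.
SPLIT_RE = re.compile(r"split.*", re.IGNORECASE)


def split_suffix(path):
    """Return the file name from 'Split'/'split' onward, or None if absent."""
    match = SPLIT_RE.search(Path(path).name)
    return match.group(0) if match else None


def strip_prefix(path, image_name):
    """Regex-free alternative: drop image_name plus the following underscore."""
    name = Path(path).name
    prefix = image_name + "_"
    return name[len(prefix):] if name.startswith(prefix) else None


# Example paths taken from the question; no files need to exist on disk.
paths = [
    Path("images/Large_Blue_Ocean_Split_0_1.png"),
    Path("images/Large_Blue_Ocean_Split_1_1.png"),
    Path("images/Great_White_Shark_Split_0_3.png"),
]

image_list = [s for s in map(split_suffix, paths) if s is not None]
print(image_list)  # ['Split_0_1.png', 'Split_1_1.png', 'Split_0_3.png']
```

With the glob from the question, paths would simply be Path(directory_in_str).glob(f'{image_name}*'), and strip_prefix(path, image_name) could replace split_suffix(path) if the "Split" marker is not guaranteed to be present.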