Extract the string inside a list if it contains specific pattern

Time:03-24

I have a list like this:

ques = ['Normal adult dogs have how many teeth?', 'What is the most common training command taught to dogs?', 'What is this?;{;i1.png;};', 'Which part of cats is as unique as human fingerprints?', 'What is a group of cats called??', 'What is this?;{;i2.png;};']

As we can see, ques[2] and ques[5] end with text that follows a specific pattern: ;{;*;};

That text is the name of an image file stored in the directory. I want to extract those file names, i.e. build a list that contains:

Img_name = ['i1.png','i2.png']

After that, I also want to update ques so that the pattern and the image file name are removed from it.

CodePudding user response:

Use a regular expression:

import re

image_names = []
pattern = re.compile(r';{;([\w.]+);};')  # group 1 captures the file name
for idx, item in enumerate(ques):
    result = re.search(pattern, item)
    if result:
        image_names.append(result.group(1))
        ques[idx] = re.sub(pattern, '', item)
        
print(ques)
print(image_names)

CodePudding user response:

If you are using Python 3.8 or later, you can use the walrus operator for a concise expression:

import re

pat = re.compile(r';{;(.*?);};')

img_names = [m.group(1) for s in ques if (m := re.search(pat, s))]
ques_clean = [re.sub(pat, '', s) for s in ques]

On your data:

>>> img_names
['i1.png', 'i2.png']

>>> ques_clean
['Normal adult dogs have how many teeth?',
 'What is the most common training command taught to dogs?',
 'What is this?',
 'Which part of cats is as unique as human fingerprints?',
 'What is a group of cats called??',
 'What is this?']
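
On Python versions older than 3.8 the walrus operator is not available. A minimal sketch of one way to get the same two lists without it is shown here; the helper name split_question is hypothetical and not part of the answer above, and the braces are escaped only for readability:

```python
import re

# Same marker as above; group 1 captures the file name.
PAT = re.compile(r';\{;(.*?);\};')

def split_question(text):
    """Return (clean_text, file_name); file_name is None if no marker is found.

    Hypothetical helper, introduced only for this illustration.
    """
    m = PAT.search(text)
    if m is None:
        return text, None
    return PAT.sub('', text), m.group(1)

ques = ['Normal adult dogs have how many teeth?',
        'What is this?;{;i1.png;};',
        'What is a group of cats called??',
        'What is this?;{;i2.png;};']

pairs = [split_question(q) for q in ques]
ques_clean = [text for text, _ in pairs]
img_names = [name for _, name in pairs if name is not None]

print(img_names)
print(ques_clean)
```

Each string is searched once for the file name and cleaned only when a marker is present, so strings without a marker are passed through unchanged.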

CodePudding user response:

Using regex substitution

import re

# Pattern for the whole marker, e.g. ;{;i1.png;};
file_pattern = re.compile(r";\{;\w+\.\w+;\};")

def replace(m):
    ' Record the matched text in files_found and return an empty string to remove it '
    files_found.append(m[0])
    return ""

ques = ['Normal adult dogs have how many teeth?', 'What is the most common training command taught to dogs?', 'What is this?;{;i1.png;};', 'Which part of cats is as unique as human fingerprints?', 'What is a group of cats called??', 'What is this?;{;i2.png;};']

# Initialize file list to empty list
files_found = []

# Use list comprehension to create list without files and update files found
new_ques = [file_pattern.sub(replace, q) for q in ques]

print(files_found)
print(new_ques)

Output

[';{;i1.png;};', ';{;i2.png;};']
['Normal adult dogs have how many teeth?', 'What is the most common training command taught to dogs?', 'What is this?', 'Which part of cats is as unique as human fingerprints?', 'What is a group of cats called??', 'What is this?']
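
Note that files_found above holds the full markers, not the bare file names the question asks for. A minimal sketch of a variant, assuming a capturing group is added around the file name and the callback stores m[1] instead of m[0] (this is a modification of the answer, not the answer's own code):

```python
import re

# Group 1 captures only the file name inside the ;{; ... ;}; marker.
file_pattern = re.compile(r";\{;(\w+\.\w+);\};")

files_found = []

def replace(m):
    """Record the captured file name and remove the whole marker."""
    files_found.append(m[1])
    return ""

ques = ['Normal adult dogs have how many teeth?',
        'What is this?;{;i1.png;};',
        'What is this?;{;i2.png;};']

new_ques = [file_pattern.sub(replace, q) for q in ques]

print(files_found)
print(new_ques)
```

With \w+\.\w+ the file name must be word characters, a dot, then word characters; names with hyphens, spaces, or several dots would need a looser group.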
