I have a list like this:
ques = ['Normal adult dogs have how many teeth?', 'What is the most common training command taught to dogs?', 'What is this?;{;i1.png;};', 'Which part of cats is as unique as human fingerprints?', 'What is a group of cats called??', 'What is this?;{;i2.png;};']
As we can see, ques[2] and ques[5] end with text that follows a specific pattern: ;{;*;};
That text is the name of an image file stored in the directory. I want to extract those file names, i.e. build a list that contains:
Img_name = ['i1.png','i2.png']
After that, I also want to update ques by removing the pattern and the image file name from it.
CodePudding user response:
Use a regular expression:
import re
image_names = []
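# The capture group holds the file name between the ;{; and ;}; delimiters
pattern = re.compile(r';{;([\w.]+);};')
for idx, item in enumerate(ques):
    result = re.search(pattern, item)
    if result:
        image_names.append(result.group(1))
        # Remove the marker from the question in place
        ques[idx] = re.sub(pattern, '', item)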
print(ques)
print(image_names)
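Note that re.search only finds the first marker in each string. If a question could carry more than one image (an assumption, since the sample data has at most one), a sketch along these lines collects all of them with findall. The helper name extract_images and the sample list are made up for illustration:

```python
import re

# Same marker pattern as above; the capture group holds the file name.
pattern = re.compile(r';{;([\w.]+);};')

def extract_images(questions):
    """Return (cleaned_questions, image_names) without modifying the input."""
    names = []
    cleaned = []
    for q in questions:
        # findall returns the captured group for every marker in q
        names.extend(pattern.findall(q))
        cleaned.append(pattern.sub('', q))
    return cleaned, names

# Hypothetical sample with two markers in one question
sample = ['What is this?;{;i1.png;};',
          'Compare these;{;a.png;};;{;b.png;};',
          'No image here']
cleaned, names = extract_images(sample)
print(names)
print(cleaned)
```

Unlike the loop above, this returns new lists instead of updating ques in place.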
CodePudding user response:
If you are using Python 3.8 or later, you can use the walrus operator for a concise expression:
import re
pat = re.compile(r';{;(.*?);};')
img_names = [m.group(1) for s in ques if (m := re.search(pat, s))]
ques_clean = [re.sub(pat, '', s) for s in ques]
On your data:
>>> img_names
['i1.png', 'i2.png']
>>> ques_clean
['Normal adult dogs have how many teeth?',
'What is the most common training command taught to dogs?',
'What is this?',
'Which part of cats is as unique as human fingerprints?',
'What is a group of cats called??',
'What is this?']
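On Python versions before 3.8, where := is not available, roughly the same result can be had by building the match objects first and filtering out the misses. A minimal sketch on the same data:

```python
import re

pat = re.compile(r';{;(.*?);};')

ques = ['Normal adult dogs have how many teeth?',
        'What is the most common training command taught to dogs?',
        'What is this?;{;i1.png;};',
        'Which part of cats is as unique as human fingerprints?',
        'What is a group of cats called??',
        'What is this?;{;i2.png;};']

# Generator of match objects (or None) for each question
matches = (pat.search(s) for s in ques)
# Keep the captured file name wherever a match was found
img_names = [m.group(1) for m in matches if m]
# Strip the marker from every question
ques_clean = [pat.sub('', s) for s in ques]

print(img_names)
print(ques_clean)
```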
CodePudding user response:
Using regex substitution with a replacement function:
import re
# File pattern
file_pattern = re.compile(r";\{;\w+\.\w+;\};")  # file name pattern
def replace(m):
    """Record the matched text in files_found and return an empty string to remove it."""
    files_found.append(m[0])  # m[0] is the whole match, delimiters included
    return ""
ques = ['Normal adult dogs have how many teeth?', 'What is the most common training command taught to dogs?', 'What is this?;{;i1.png;};', 'Which part of cats is as unique as human fingerprints?', 'What is a group of cats called??', 'What is this?;{;i2.png;};']
# Initialize file list to empty list
files_found = []
# Use list comprehension to create list without files and update files found
new_ques = [file_pattern.sub(replace, q) for q in ques]
print(files_found)
print(new_ques)
Output
[';{;i1.png;};', ';{;i2.png;};']
['Normal adult dogs have how many teeth?', 'What is the most common training command taught to dogs?', 'What is this?', 'Which part of cats is as unique as human fingerprints?', 'What is a group of cats called??', 'What is this?']
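The file list above still includes the ;{; and ;}; delimiters, because m[0] is the entire match. To get the bare file names the question asks for, one option is to add a capture group and record m[1] instead. A sketch under that assumption, with the hypothetical helper strip_markers wrapping the callback so that no global list is needed:

```python
import re

# Same pattern as above, with a capture group around the file name
file_pattern = re.compile(r";\{;(\w+\.\w+);\};")

def strip_markers(questions):
    """Return (cleaned_questions, file_names) for a list of question strings."""
    files_found = []

    def replace(m):
        # m[1] is the captured file name without the delimiters
        files_found.append(m[1])
        return ""

    cleaned = [file_pattern.sub(replace, q) for q in questions]
    return cleaned, files_found

ques = ['Normal adult dogs have how many teeth?',
        'What is this?;{;i1.png;};',
        'What is a group of cats called??',
        'What is this?;{;i2.png;};']
new_ques, files = strip_markers(ques)
print(files)
print(new_ques)
```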