Splitting a list of strings based on substring with variable character

Time:09-19

I have the following list of strings:

my_list = ['2022-09-18 1234 name O0A raw.txt',
'2022-09-18 1234 name O0P raw.txt',
'2022-09-18 1234 name O1A raw.txt',
'2022-09-18 1234 name O1P raw.txt',
'2022-09-18 1234 name O2A raw.txt',
'2022-09-18 1234 name O2P raw.txt',
'2022-09-18 1234 name O3A raw.txt',
'2022-09-18 1234 name O3P raw.txt',
'2022-09-18 1234 name O4A raw.txt',
'2022-09-18 1234 name O4P raw.txt',
'2022-09-18 1234 name O5A raw.txt',
'2022-09-18 1234 name O5P raw.txt',
'2022-09-18 1234 name M0A raw.txt',
'2022-09-18 1234 name M0P raw.txt',
...
'2022-09-18 1234 name M5P raw.txt']

I want to filter this into a new list containing, say, all the "O?A" entries (where "?" stands for any single character), so

my_list_split = ['2022-09-18 1234 name O0A raw.txt',
'2022-09-18 1234 name O1A raw.txt',
'2022-09-18 1234 name O2A raw.txt',
'2022-09-18 1234 name O3A raw.txt',
'2022-09-18 1234 name O4A raw.txt',
'2022-09-18 1234 name O5A raw.txt',]

Based on previous posts on string list substring splitting, it seems the fastest way to do this is by

[s for s in my_list if ' O?A raw' in s]

but this returns an empty list. I guess there is some syntax that I am missing?

Thank you.

CodePudding user response:

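It looks like you want a regular expression that matches ' O?A raw', where '?' stands for any single character. The in operator only tests for a literal substring, so it searches for an actual question mark and finds nothing. In a regular expression the any-character wildcard is '.', not '?'. Here's what you want to do: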

import re

# ... the lists ...

# "." matches any single character, so this keeps O0A, O1A, ..., O5A
lst = [s for s in my_list if re.search(r" O.A raw", s)]
print(lst)
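
If you would rather keep the '?' wildcard from the question, shell-style matching from the standard library's fnmatch module is another option. The sketch below is only an illustration on a shortened copy of the list; the names pattern, by_regex and by_glob are made up for the example. Note that fnmatchcase matches the whole string, so the pattern needs a leading and trailing '*'.

```python
import re
from fnmatch import fnmatchcase

# Shortened sample of the list from the question (assumed for illustration).
my_list = [
    '2022-09-18 1234 name O0A raw.txt',
    '2022-09-18 1234 name O0P raw.txt',
    '2022-09-18 1234 name O1A raw.txt',
    '2022-09-18 1234 name M0A raw.txt',
]

# Regex approach: "." matches any single character.
# Compiling once avoids re-parsing the pattern for every string.
pattern = re.compile(r" O.A raw")
by_regex = [s for s in my_list if pattern.search(s)]

# Shell-style approach: "?" matches exactly one character and "*" any run
# of characters. fnmatchcase is case-sensitive and matches the full string.
by_glob = [s for s in my_list if fnmatchcase(s, "* O?A raw*")]

print(by_regex)
print(by_glob)
```

Both comprehensions should select the same entries here; which one reads better is mostly a matter of taste, though the regex scales more easily to patterns such as "O followed by a digit" (r" O\dA raw").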