I have the following string:
bar = 'F9B2Z1F8B30Z4'
I have a function foo that splits the string on F, then adds back the F delimiter.
def foo(my_str):
    res = ['F' + elem for elem in my_str.split('F') if elem != '']
    return res
This works unless there are two "F"s back-to-back in the string. For example,
foo('FF9B2Z1F8B30Z4')
returns
['F9B2Z1', 'F8B30Z4']
(the double "F" at the start of the string is not processed)
I'd like the function to split on the first "F" and add it to the list, as follows:
['F', 'F9B2Z1', 'F8B30Z4']
If there is a double "F" in the middle of the string, then the desired behavior would be:
foo('F9B2Z1FF8B30Z4')
['F9B2Z1', 'F', 'F8B30Z4']
Any help would be greatly appreciated.
CodePudding user response:
Instead of filtering with if, use slicing, because an empty string is a problem only at the beginning:
def foo(my_str):
    res = ['F' + elem for elem in my_str.split('F')][1:]
    return res
Output:
>>> foo('FF9B2Z1F8B30Z4')
['F', 'F9B2Z1', 'F8B30Z4']
>>> foo('FF9B2Z1FF8B30Z4FF')
['F', 'F9B2Z1', 'F', 'F8B30Z4', 'F', 'F']
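One caveat with the [1:] slice: it assumes the string starts with F. If it does not, whatever comes before the first F is dropped. Below is a minimal sketch of a variant that keeps that leading text, assuming that is the behavior you want. The name split_keep_f and the delim parameter are made up for illustration and are not from the question.

```python
def split_keep_f(my_str, delim='F'):
    """Split on delim and put delim back in front of each piece after the first."""
    parts = my_str.split(delim)
    # parts[0] is the text before the first delimiter; it is '' when the
    # string starts with the delimiter (or is empty), so skip it in that case.
    head, rest = parts[0], parts[1:]
    res = [head] if head else []
    # Every remaining piece was preceded by a delimiter, so restore it.
    res += [delim + elem for elem in rest]
    return res


print(split_keep_f('FF9B2Z1F8B30Z4'))  # ['F', 'F9B2Z1', 'F8B30Z4']
print(split_keep_f('F9B2Z1FF8B30Z4'))  # ['F9B2Z1', 'F', 'F8B30Z4']
print(split_keep_f('abcF1'))           # ['abc', 'F1']
```

For strings that always start with F this gives the same result as the slicing version.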
CodePudding user response:
Using regex it can be done with
import re
pattern = r'^[^F]+|(?<=F)[^F]*'
The ^[^F]+ captures all characters at the beginning of strings that do not start with F. (?<=F)[^F]* captures anything following an F so long as it is not an F character, including empty matches.
>>> print(['F' + x for x in re.findall(pattern, 'abcFFFAFF')])
['Fabc', 'F', 'F', 'FA', 'F', 'F']
>>> print(['F' + x for x in re.findall(pattern, 'FFabcFA')])
['F', 'Fabc', 'FA']
>>> print(['F' + x for x in re.findall(pattern, 'abc')])
['Fabc']
Note that this returns nothing for empty strings. If empty strings need to return ['F'], then the pattern can be changed to pattern = r'^[^F]+|(?<=F)[^F]*|^$', adding ^$ to capture empty strings.
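To try this against the strings from the question, the regex approach can be wrapped in a function. This is only a sketch: split_on_f and PATTERN are hypothetical names, and it uses the ^$ variant so that an empty input gives ['F'].

```python
import re

# Three alternatives:
#   ^[^F]+       leading text in a string that does not start with F
#   (?<=F)[^F]*  whatever follows an F, up to the next F (may be empty)
#   ^$           the empty string
PATTERN = re.compile(r'^[^F]+|(?<=F)[^F]*|^$')


def split_on_f(my_str):
    """Prefix every regex match with 'F' (hypothetical helper)."""
    return ['F' + x for x in PATTERN.findall(my_str)]


print(split_on_f('FF9B2Z1F8B30Z4'))  # ['F', 'F9B2Z1', 'F8B30Z4']
print(split_on_f('F9B2Z1FF8B30Z4'))  # ['F9B2Z1', 'F', 'F8B30Z4']
print(split_on_f(''))                # ['F']
```

Keep in mind that a leading segment without an F still gets an F prepended ('abc' becomes 'Fabc'), which may or may not be what you want.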