I am trying to split a string in this specific pattern:
'ff19shh24c' -> ['f', 'f', '19s', 'h', 'h', '24c']
I managed to get this close:
import re
string = "ff19shh24c"
parts = re.findall(r'\D+|\d+[a-z]{1}', string)
print(parts)  # ['ff', '19s', 'hh', '24c']
But now I am a little bit stuck.
CodePudding user response:
Match anything (non-greedy) followed by a lowercase letter.
import re
string = "ff19shh24c"
parts = re.findall(r'.*?[a-z]', string)
print(parts)
This will give you ['f', 'f', '19s', 'h', 'h', '24c']
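To see why the non-greedy `?` matters, here is a small sketch that compares the lazy pattern with its greedy counterpart on the same input. The variable names `lazy` and `greedy` are only illustrative.

```python
import re

string = "ff19shh24c"

# Lazy: '.*?' consumes as little as possible, so each match stops
# at the first lowercase letter it reaches.
lazy = re.findall(r'.*?[a-z]', string)

# Greedy: '.*' consumes as much as possible, so a single match
# runs up to the last lowercase letter in the string.
greedy = re.findall(r'.*[a-z]', string)

print(lazy)    # ['f', 'f', '19s', 'h', 'h', '24c']
print(greedy)  # ['ff19shh24c']
```

Keep in mind that `[a-z]` covers only lowercase ASCII letters, so an uppercase letter would not end a token with this pattern.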
CodePudding user response:
One possibility: find zero or more digits, then a non-digit:
import re
string = 'ff19shh24c'
parts = re.findall(r'\d*\D', string)
print(parts)  # ['f', 'f', '19s', 'h', 'h', '24c']
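Note that `\d*\D` needs a non-digit to finish each match, so digits at the very end of the input are dropped. A possible variation, assuming trailing digits should be kept as their own token, adds `\d+` as a fallback alternative. The sample string below is made up for illustration.

```python
import re

string = 'ff19shh24c88'  # hypothetical input that ends in digits

# Original pattern: the trailing '88' has no non-digit after it and is lost.
dropped = re.findall(r'\d*\D', string)
print(dropped)
# ['f', 'f', '19s', 'h', 'h', '24c']

# Fallback alternative: if no non-digit follows, take the remaining digits.
parts = re.findall(r'\d*\D|\d+', string)
print(parts)
# ['f', 'f', '19s', 'h', 'h', '24c', '88']
```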
CodePudding user response:
Since the question is not tagged with regex or similar, here is a for-loop approach:
s = 'ff19shh24c'
out = []
tmp = ''
was_a_digit = False  # keep track of whether the previous character was a digit
for char in s:
    if char.isdigit():
        was_a_digit = True
        tmp += char  # accumulate consecutive digits
    else:
        if was_a_digit:
            tmp += char  # attach the character to the preceding digits
            out.append(tmp)
            tmp = ''
            was_a_digit = False
        else:
            out.append(char)
print(out)
# ['f', 'f', '19s', 'h', 'h', '24c']
For strings that end with digits, the code above will lose those characters, but a slight edit retrieves them.
Here is the approach that keeps all characters:
s = 'ff19shh24cX29ZZ88'
# ... same as above
# directly after the end of the for loop
if tmp:  # append leftover digits; the check avoids a trailing '' when there are none
    out.append(tmp)
print(out)
# ['f', 'f', '19s', 'h', 'h', '24c', 'X', '29Z', 'Z', '88']
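For reuse, the loop can be wrapped in a function. The following is only a sketch: the name `split_tokens` is hypothetical, and it drops the `was_a_digit` flag by relying on `tmp` being empty when no digits are pending, which should behave the same for the inputs shown above.

```python
def split_tokens(s):
    """Split s into single characters, keeping each run of digits attached
    to the character that follows it (hypothetical helper name)."""
    out = []
    tmp = ''  # digits collected so far for the current token
    for char in s:
        if char.isdigit():
            tmp += char
        else:
            # A non-digit closes the token, with or without leading digits.
            out.append(tmp + char)
            tmp = ''
    if tmp:  # leftover digits at the end of the string
        out.append(tmp)
    return out


print(split_tokens('ff19shh24c'))
# ['f', 'f', '19s', 'h', 'h', '24c']
print(split_tokens('ff19shh24cX29ZZ88'))
# ['f', 'f', '19s', 'h', 'h', '24c', 'X', '29Z', 'Z', '88']
```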