Python string split in a specific pattern

Time:07-05

I am trying to split a string in this specific pattern:

'ff19shh24c' -> ['f', 'f', '19s', 'h', 'h', '24c']

I managed to get this close:

import re

string = "ff19shh24c"

parts = re.findall(r'\D+|\d+[a-z]{1}', string)

print(parts)  # ['ff', '19s', 'hh', '24c']

But now I am a little bit stuck.

CodePudding user response:

Match as few characters as possible (non-greedy), followed by a lowercase letter.

import re

string = "ff19shh24c"
parts = re.findall(r'.*?[a-z]', string)
print(parts) 

This will give you ['f', 'f', '19s', 'h', 'h', '24c']
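
As a rough illustration of how this pattern behaves beyond the sample input, the sketch below also runs it on a second, made-up string (`'aB1c'` is an assumption, not from the question). Because `[a-z]` only accepts lowercase letters, anything else, including an uppercase letter, is absorbed into the lazy `.*?` prefix:

```python
import re

pattern = re.compile(r'.*?[a-z]')

# Input from the question: each piece ends at the first lowercase letter found.
question_parts = pattern.findall("ff19shh24c")
print(question_parts)  # ['f', 'f', '19s', 'h', 'h', '24c']

# Hypothetical input with an uppercase letter: 'B' is not in [a-z],
# so the lazy prefix swallows it until the next lowercase letter.
mixed_parts = pattern.findall("aB1c")
print(mixed_parts)  # ['a', 'B1c']
```

If uppercase letters should end a piece too, the character class would need widening, for example to `[a-zA-Z]`.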

CodePudding user response:

One possibility: match zero or more digits, followed by a single non-digit:

import re

string = 'ff19shh24c'
parts = re.findall(r'\d*\D', string)

output: ['f', 'f', '19s', 'h', 'h', '24c']
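
One possible caveat, sketched under the assumption that the input might end in digits (the string `'ff19shh24c88'` is made up for illustration): `\d*\D` needs a closing non-digit, so a trailing run of digits is silently dropped. Adding `|\d+` as a fallback alternative is one way to keep it:

```python
import re

# Hypothetical input that ends in digits.
text = "ff19shh24c88"

# Original pattern: the trailing '88' has no non-digit after it, so it is lost.
without_tail = re.findall(r'\d*\D', text)
print(without_tail)  # ['f', 'f', '19s', 'h', 'h', '24c']

# Variant with a fallback: a bare run of digits is accepted as its own piece.
with_tail = re.findall(r'\d*\D|\d+', text)
print(with_tail)  # ['f', 'f', '19s', 'h', 'h', '24c', '88']
```

The fallback only ever matches at the end of the string, because everywhere else the first alternative succeeds.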

CodePudding user response:

Since the question is not tagged with regex or similar, here is a for-loop approach:

s = 'ff19shh24c'

out = []
tmp = ''
was_a_digit = False # keep track if the previous character was a digit
for char in s:
    if char.isdigit():
        was_a_digit = True
        tmp += char
    else:
        if was_a_digit:
            tmp += char
            out.append(tmp)
            tmp = ''
            was_a_digit = False
        else:
            out.append(char)

print(out)
#['f', 'f', '19s', 'h', 'h', '24c']

If the string ends with digits, the code above will lose those characters, but a small change retrieves them.

Here is the version that preserves the trailing characters:

s = 'ff19shh24cX29ZZ88'

... same as above

# directly after the end of the for loop
if tmp:  # only leftover digits; avoids appending an empty string
    out.append(tmp)

print(out)
#['f', 'f', '19s', 'h', 'h', '24c', 'X', '29Z', 'Z', '88']
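
Since the snippet above elides the loop body, here is one way the complete version might look when wrapped in a function. The name `split_after_digits` is hypothetical, and the `was_a_digit` flag is dropped on the assumption that an empty `tmp` already means "no pending digits". The `if tmp` guard keeps an empty string out of the result when the input does not end in digits:

```python
def split_after_digits(s):
    """Split s so that each run of digits stays attached to the character after it."""
    out = []
    tmp = ''  # digits collected since the last non-digit
    for char in s:
        if char.isdigit():
            tmp += char
        else:
            # tmp is '' when no digits are pending, so this appends just char
            out.append(tmp + char)
            tmp = ''
    if tmp:  # trailing digits with nothing after them
        out.append(tmp)
    return out


print(split_after_digits('ff19shh24cX29ZZ88'))
# ['f', 'f', '19s', 'h', 'h', '24c', 'X', '29Z', 'Z', '88']
```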