How to split a string into runs of identical characters in Python

Time:06-10

I want to split a string into a list, breaking it wherever the character changes from the one before it. For example, if I have the string ccaaawq, I want my program to give me ['cc', 'aaa', 'w', 'q']. Since there is no single delimiter between the pieces, I'm wondering what the best approach to this problem is. Thanks in advance for your answers.

CodePudding user response:

You can use itertools.groupby:

from itertools import groupby

s = "ccaaawq"

out = ["".join(g) for _, g in groupby(s)]
print(out)

Prints:

['cc', 'aaa', 'w', 'q']
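
As a small sketch of why this works: groupby only groups consecutive equal items, so a character that reappears later starts a new run rather than joining the earlier one. The helper name split_runs_groupby is made up for illustration and is not part of itertools.

```python
from itertools import groupby


def split_runs_groupby(text):
    """Split text into runs of identical consecutive characters.

    Hypothetical helper name, used here only for illustration.
    """
    # groupby yields (key, group_iterator) pairs; each group holds the
    # consecutive items equal to key, so joining a group rebuilds one run.
    return ["".join(group) for _key, group in groupby(text)]


# The two runs of "a" are not merged because "b" separates them.
print(split_runs_groupby("aabaa"))  # ['aa', 'b', 'aa']
```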

CodePudding user response:

Here is an approach using re.findall:

import re

inp = "ccaaawq"
output = [x[0] for x in re.findall(r'((.)\2*)', inp)]
print(output)  # ['cc', 'aaa', 'w', 'q']

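The above works by matching any one character followed by that same character zero or more times. Because the pattern contains two capture groups, re.findall returns a list of tuples, one per match. The first element of each tuple is the outer group, which holds the whole run, so we take x[0] from each tuple.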
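
A possible variant, sketched here as an alternative rather than as part of the answer above: re.finditer yields match objects, so the whole run is available as group(0) and the outer capture group is not needed. The helper name split_runs_regex is hypothetical, and passing re.DOTALL is an added assumption so that runs of newline characters are also kept (by default "." does not match a newline).

```python
import re


def split_runs_regex(text):
    """Split text into runs of identical consecutive characters.

    Hypothetical helper name, used here only for illustration.
    """
    # (.) captures one character; \1* extends the match over its repeats.
    # re.DOTALL (an assumption, not in the original answer) lets "." match
    # newlines too, so no character is silently skipped.
    pattern = re.compile(r'(.)\1*', flags=re.DOTALL)
    # group(0) is the entire match, i.e. the full run.
    return [m.group(0) for m in pattern.finditer(text)]


print(split_runs_regex("ccaaawq"))  # ['cc', 'aaa', 'w', 'q']
```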
