Split string on Upper Case word

Time:05-14

I have a string containing two phrases separated by an upper-case word:

c="Text is here. TEST . More text here also"

I want to split the string into the two phrases and drop the upper-case word TEST, so that the output looks like:

["Text is here.","More text here also"]

What I did:

import re
c="Text is here. TEST . More text here also"
s=re.split(r'[A-Z][A-Z\d]+',c)
t=[re.sub(r'[^A-Za-z0-9]',' ',i) for i in s]

But I still get some unwanted spaces:

['Text is here  ', '   More text here also']

Is there a cleaner, more Pythonic way to generate t?

CodePudding user response:

>>> re.split(r'\s*[A-Z]{2,}[\s.]*', c)

['Text is here.', 'More text here also']

The pattern matches optional whitespace, then at least two consecutive uppercase letters, then any run of whitespace or dots. Everything it matches is discarded by the split.
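As a minimal sketch, the same idea can be wrapped in a small helper. The function name split_on_upper_word and the \b word boundaries are additions for illustration, not part of the answer above; the boundaries assume the separator is always a standalone word, so capitals inside a mixed-case word are left alone.

```python
import re

# Hypothetical helper (not from the original answer): split on a standalone
# all-caps word of 2+ letters, swallowing the whitespace around it and any
# dots that trail it.
_SEPARATOR = re.compile(r'\s*\b[A-Z]{2,}\b[\s.]*')

def split_on_upper_word(text):
    # Drop empty strings produced when the separator starts or ends the text.
    return [part for part in _SEPARATOR.split(text) if part]

c = "Text is here. TEST . More text here also"
print(split_on_upper_word(c))
```

Note that any other all-caps word in the text (an acronym, for instance) would also be treated as a separator, so this only fits input where the marker is the sole upper-case word.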

CodePudding user response:

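This works for the example, but it isn't that elegant: it hard-codes the separator TEST, and the replace call would also remove any '. ' inside a phrase.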

c="Text is here. TEST . More text here also"

In [20]: [i.strip().replace('. ','') for i in c.split('TEST')]

Out[20]: ['Text is here.', 'More text here also']
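If the separator word is known in advance, a slightly more careful variant might trim only the leftovers next to the marker instead of replacing '. ' everywhere. This is a sketch under that assumption; the helper name split_on_marker is hypothetical.

```python
def split_on_marker(text, marker):
    # Hypothetical helper: split on a known marker word, then trim only the
    # surrounding whitespace and the stray dots left right after the marker.
    parts = []
    for piece in text.split(marker):
        piece = piece.strip().lstrip('. ')
        if piece:
            parts.append(piece)
    return parts

c = "Text is here. TEST . More text here also"
print(split_on_marker(c, 'TEST'))
```

Because lstrip only touches the start of each piece, sentence breaks inside a phrase are preserved.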