All help is greatly appreciated, folks. I found a brilliant solution in "How to do CamelCase split in python":
re.sub('([A-Z][a-z]+)', r' \1', re.sub('([A-Z]+)', r' \1', string)).split()
However, I need it to stop if there is a space. Example:
t = 'YankeesMets'
>>>['Yankees', 'Mets']
tt = 'CubsWhite Sox'
>>>['Cubs', 'White']
(no more words after the first whitespace)
So, how do I change the regex to stop splitting CamelCase once it finds a space?
CodePudding user response:
You can get the part of the string from its beginning to the first whitespace and apply your solution to that part of the string:
re.sub('([A-Z][a-z]+)', r' \1', re.sub('([A-Z]+)', r' \1', text.split()[0])).split()
See the following demo:
import re

l = ['CubsWhite Sox', 'YankeesMets']
for s in l:
    print(f"Processing {s}")
    first = s.split()[0]  # keep only the part before the first whitespace
    result = re.sub('([A-Z][a-z]+)', r' \1', re.sub('([A-Z]+)', r' \1', first)).split()
    print(result)
Output:
Processing CubsWhite Sox
['Cubs', 'White']
Processing YankeesMets
['Yankees', 'Mets']
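One caveat: text.split()[0] raises an IndexError when the input is empty or contains only whitespace. If that can happen in your data, the same idea can be wrapped in a small helper with a guard. This is only a sketch; the function name camel_split_first_token is made up for illustration and is not part of the original solution.

```python
import re

def camel_split_first_token(text):
    """Split CamelCase words in the first whitespace-delimited token of `text`.

    Hypothetical helper: the name and the empty-input behaviour are assumptions.
    """
    tokens = text.split()
    if not tokens:
        # Empty or whitespace-only input: there is nothing to split.
        return []
    first = tokens[0]
    # Inner sub: insert a space before each run of capital letters.
    # Outer sub: insert a space before each capital followed by lowercase letters.
    spaced = re.sub('([A-Z][a-z]+)', r' \1', re.sub('([A-Z]+)', r' \1', first))
    return spaced.split()

print(camel_split_first_token('CubsWhite Sox'))  # ['Cubs', 'White']
print(camel_split_first_token('YankeesMets'))    # ['Yankees', 'Mets']
print(camel_split_first_token('   '))            # []
```

Returning an empty list for blank input is a design choice; raising a ValueError instead may suit you better if blank input indicates a bug upstream.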