Regex to split or to find the Dictionary 'Like' elements in text

Time:08-12

I have a string like this:

"Name: Abcde fghijk, College: so and so college, somewhere, on earth Department: I Dont Know, Designation: still to be decided"

and I need to produce output like this:

[ 'Name: Abcde fghijk,' , 
'College: so and so college, somewhere, on earth' , 
'Department: I Dont Know,' , 
'Designation: still to be decided' ]

I've been trying to formulate some kind of regex to find or split the elements, like this:

r"[^\s]*:.*?,"

With it I could get as far as this:

['Name: Abcde fghijk,','College: so and so college,','Department: I Dont Know,']

But it misses some parts:

 "somewhere, on earth" and "Designation: still to be decided"

Can someone help with this? I need something like: capture up to the word before the next ':', or to the end of the string.

CodePudding user response:

Here is an re.findall approach which seems to be working:

import re

inp = "Name: Abcde fghijk, College: so and so college, somewhere, on earth Department: I Dont Know, Designation: still to be decided"
# \w+: is a label; the lazy .*? runs until the next label or the end of the string
matches = re.findall(r'\w+: .*?\s*(?=\w+:|$)', inp)
print(matches)

This prints:

['Name: Abcde fghijk, ',
 'College: so and so college, somewhere, on earth ',
 'Department: I Dont Know, ',
 'Designation: still to be decided']
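Note that every match except the last keeps a trailing space, which the output requested in the question does not have. One way to drop it, as a minimal sketch that assumes stripping surrounding whitespace is all that is needed, is to call strip() on each match:

```python
import re

inp = "Name: Abcde fghijk, College: so and so college, somewhere, on earth Department: I Dont Know, Designation: still to be decided"

# Same pattern as above; strip() removes the whitespace left before each next label
matches = [m.strip() for m in re.findall(r'\w+: .*?\s*(?=\w+:|$)', inp)]
print(matches)
```

The trailing commas are kept, as in the output the question asks for.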

Explanation of regex:

  • \w+: matches the leading label followed by a colon
  • ' .*?' matches a space followed by any content, lazily, stopping before the next label
  • \s* matches optional whitespace after the content
  • (?=\w+:|$) asserts that what follows is another label with its colon, or the end of the input
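Since the elements are dictionary-like, the same idea could be extended to build an actual dict. The following is only a sketch: the capture groups, the name fields, and the removal of trailing commas are assumptions added here, not part of the answer above. It also assumes that no value contains a word directly followed by a colon, since that would be taken for a new label.

```python
import re

inp = "Name: Abcde fghijk, College: so and so college, somewhere, on earth Department: I Dont Know, Designation: still to be decided"

# Group 1 captures the label, group 2 the value up to the next label or the end
pairs = re.findall(r'(\w+):\s*(.*?)\s*(?=\w+:|$)', inp)

# 'fields' is a hypothetical name; rstrip(',') drops the separator comma, if any
fields = {label: value.rstrip(',') for label, value in pairs}
print(fields)
```

With two capture groups, re.findall returns a list of (label, value) tuples, which is why the result can be fed straight into a dict comprehension.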