I would like to extract the substring that starts after a leading number and ends before a double underscore. Below are two test strings, with the expected output after ->.
'1234514 TEST STRING__blabla3452b' -> 'TEST STRING'
'16275653 TEST_STRING__bl67abla3452b' -> 'TEST_STRING'
The regex I came up with so far: (?:^|\n)\d+ ([^__]+)
only returns the first but not the second output, as the second underscore is not recognised. I tried to escape the underscores, but that did not work. Any help would be very much appreciated.
Thanks.
CodePudding user response:
You can use
^\d+\s*(.*?)(?=__)
\d\s+(.*?)(?=__)
See the regex demo #1 and regex demo #2.
Details:
^ - start of string
\d+ - one or more digits
\s* - zero or more whitespaces
(.*?) - Group 1: any zero or more chars other than line break chars, as few as possible
(?=__) - a location immediately followed with __.
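The question does not say which regex flavor is in use, so here is a minimal sketch that assumes Python's re module. The names samples, extract, anchored, unanchored and original are made up for this illustration. It runs both patterns on the two test strings, along with the pattern from the question. The question's pattern stops early because [^__] is a character class: it means "any single char that is not an underscore" (listing the underscore twice changes nothing), so the match ends at the first single underscore in TEST_STRING.

```python
import re

# The two test strings from the question.
samples = [
    "1234514 TEST STRING__blabla3452b",
    "16275653 TEST_STRING__bl67abla3452b",
]

# Pattern 1: anchored at the start of the string.
anchored = re.compile(r"^\d+\s*(.*?)(?=__)")
# Pattern 2: unanchored, starts at a digit followed by whitespace.
unanchored = re.compile(r"\d\s+(.*?)(?=__)")
# The attempt from the question; [^__] behaves exactly like [^_].
original = re.compile(r"(?:^|\n)\d+ ([^__]+)")


def extract(pattern, text):
    """Return Group 1 of the first match, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None


results_anchored = [extract(anchored, s) for s in samples]
results_unanchored = [extract(unanchored, s) for s in samples]
results_original = [extract(original, s) for s in samples]

print(results_anchored)    # ['TEST STRING', 'TEST_STRING']
print(results_unanchored)  # ['TEST STRING', 'TEST_STRING']
print(results_original)    # ['TEST STRING', 'TEST']
```

If the input holds several lines and every line should be matched, passing re.MULTILINE to the anchored pattern (so that ^ matches at each line start) together with findall is one option under the same assumption.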