I need to extract from a string a set of characters (TEXT1 / TEXT2 / TEXT3 / TEXT4) that sits between two delimiters, without returning the delimiters themselves or any leading or trailing spaces.
My strings look like this:
- TEXT1 | characters
- digits.digits - TEXT2 | characters
- characters - digits digits-digits-digits - TEXT3 | characters
- characters - digits digits/digits/digits - TEXT4 | characters
Is it possible to write a **single** regex that would work to extract TEXT1 / TEXT2 / TEXT3 / TEXT4 for all strings above?
If not, how could I extract for each case?
I tried:
- (.*?)(?=|) - but I don't know how to leave out the space after TEXT1.
- (?<=-)(.*?)(?=|) - but I don't know how to leave out the space before and after TEXT2
CodePudding user response:
[^\d-]*(\w+)\s?(?=\|)
[^\d-]*
matches any run of characters other than digits and "-"
(\w+)
captures one or more word characters
\s?
matches an optional whitespace character
(?=\|)
lookahead asserting that "|" comes next
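The lookahead idea can be checked against the four line shapes from the question. The following is only a minimal sketch in Python (an assumption, since the thread does not name a regex flavour); the sample lines and the names `PIPE_LOOKAHEAD` and `extract_before_pipe` are made up for illustration. It keeps the capture group, the optional whitespace and the lookahead, and leaves out the greedy `[^\d-]*` prefix, because under `re.search` that prefix can consume most of the target word before the capture group gets a chance to match.

```python
import re

# Capture a word, allow one optional whitespace character, then require "|"
# through a lookahead so the delimiter itself is never part of the match.
PIPE_LOOKAHEAD = re.compile(r"(\w+)\s?(?=\|)")

# Hypothetical sample lines modelled on the four shapes in the question.
SAMPLES = [
    "TEXT1 | abc",
    "12.34 - TEXT2 | abc",
    "abc - 12 34-56-78 - TEXT3 | abc",
    "abc - 12 34/56/78 - TEXT4 | abc",
]


def extract_before_pipe(line):
    """Return the first word that is directly followed by '|', or None."""
    match = PIPE_LOOKAHEAD.search(line)
    return match.group(1) if match else None


if __name__ == "__main__":
    for sample in SAMPLES:
        print(extract_before_pipe(sample))
```

Reading `group(1)` rather than the whole match is what keeps the trailing space out of the result. Note that `\w+` only covers letters, digits and underscores, so a target containing spaces or hyphens would need a different character class.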
CodePudding user response:
You can use the following pattern; the wanted text ends up in capture group 1:
(?:-\h+)?(\w+)\h+\|
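For anyone trying this outside a PCRE-style engine: `\h` (horizontal whitespace) is not supported by Python's `re` module, so the sketch below substitutes `[ \t]` for it. That substitution, the sample lines and the names `DASH_THEN_WORD` and `extract_group_one` are assumptions made for illustration, not part of the answer above.

```python
import re

# Same shape as the pattern above, with [ \t] standing in for \h:
# an optional "- " prefix, the captured word, then spaces/tabs and a literal "|".
DASH_THEN_WORD = re.compile(r"(?:-[ \t]+)?(\w+)[ \t]+\|")


def extract_group_one(line):
    """Return capture group 1 of the first match, or None if nothing matches."""
    match = DASH_THEN_WORD.search(line)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical lines modelled on the four shapes in the question.
    for sample in (
        "TEXT1 | abc",
        "12.34 - TEXT2 | abc",
        "abc - 12 34-56-78 - TEXT3 | abc",
        "abc - 12 34/56/78 - TEXT4 | abc",
    ):
        print(extract_group_one(sample))
```

Because the delimiters are matched outside the parentheses, the full match still contains the dash and the pipe; only group 1 is free of them and of the surrounding spaces.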