I have text like this:
EXPRESS blood| muscle| testis| normal| tumor| fetus| adult
RESTR_EXPR soft tissue/muscle tissue tumor
Right now I want to extract only the last item in the EXPRESS line, which is adult.
My pattern is:
[|](.*?)\n
The match starts at the first pipe and captures muscle| testis| normal| tumor| fetus| adult instead of only the last item. Is there any way to solve this issue?
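The behaviour described above can be reproduced with a small sketch. It assumes Python's re module (the question does not name a language) and a two-line sample string built from the text shown. The regex engine reports the leftmost position where a match can begin, which is the first pipe; the lazy quantifier only keeps the match from running past the first newline.

```python
import re

# Sample input assumed from the question; the trailing newlines are an assumption.
text = (
    "EXPRESS blood| muscle| testis| normal| tumor| fetus| adult\n"
    "RESTR_EXPR soft tissue/muscle tissue tumor\n"
)

# The match begins at the first pipe, so group 1 spans every later item.
match = re.search(r"[|](.*?)\n", text)
captured = match.group(1)
print(repr(captured))
```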
CodePudding user response:
You can match a pipe char followed by optional spaces, then use a capture group that excludes pipe chars, and take the group 1 value.
If there has to be a newline at the end of the string:
\|[^\S\n]*([^|\n]*)\n
Explanation
\|          Match |
[^\S\n]*    Match optional whitespace chars without newlines
(           Capture group 1
[^|\n]*     Match optional chars except for | or a newline
)           Close group 1
\n          Match a newline
Or asserting the end of the string:
\|[^\S\n]*([^|\n]*)$
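Both patterns can be tried out with a short sketch, again assuming Python's re module and a sample string built from the question. One assumption worth flagging: because the sample has a second line, the $ variant is run with re.MULTILINE so that $ can match at the end of the EXPRESS line and not only at the end of the whole string.

```python
import re

# Sample input assumed from the question.
text = (
    "EXPRESS blood| muscle| testis| normal| tumor| fetus| adult\n"
    "RESTR_EXPR soft tissue/muscle tissue tumor\n"
)

# Variant 1: a newline must follow the last item.
m_newline = re.search(r"\|[^\S\n]*([^|\n]*)\n", text)
last_item = m_newline.group(1)

# Variant 2: assert the end of the line instead of consuming the newline.
# re.MULTILINE is an assumption needed here because the text has several lines.
m_eol = re.search(r"\|[^\S\n]*([^|\n]*)$", text, re.MULTILINE)
last_item_eol = m_eol.group(1)

print(last_item, last_item_eol)
```

Earlier pipes cannot produce a match because the text after them runs into another pipe before reaching a newline, so the first successful match is the one at the last pipe.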
CodePudding user response:
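You could use this one. It skips the whitespace before the item and handles the \r\n case. The quantifier sits inside the capture group so that group 1 holds the whole item and not just its last character:
\|[^\S\r\n]*([^|\r\n]*)\r?\n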
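The \r\n case can be checked with a sketch like the following, assuming Python's re module and a sample string with Windows line endings. It also shows why the position of the quantifier matters: when a quantifier is applied to the group itself, as in ([^\|])*?, the group keeps only its last repetition, which is a single character.

```python
import re

# Sample input assumed from the question, here with Windows line endings.
crlf_text = (
    "EXPRESS blood| muscle| testis| normal| tumor| fetus| adult\r\n"
    "RESTR_EXPR soft tissue/muscle tissue tumor\r\n"
)

# Quantifier inside the group: group 1 is the whole last item.
whole = re.search(r"\|[^\S\r\n]*([^|\r\n]*)\r?\n", crlf_text).group(1)

# Quantifier outside the group: group 1 is only the last repeated character.
last_char = re.search(r"\|\s*([^\|])*?\r?\n", crlf_text).group(1)

print(repr(whole), repr(last_char))
```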