I am trying to capture one or two groups from a line. If the line contains a dash, I want one group for the text before the dash and one for the text after it. If it does not, I would like a single group containing everything.
However, occasionally a line will start with 'Remove - ', which is a phrase I would like to ignore.
Example data:
| Strings |
| -------- |
| Remove - Precision Speed - Recap |
| Precision Speed - Recap |
| Remove - Precision Speed |
| Precision Speed |
The first two should each capture group 1: 'Precision Speed' and group 2: 'Recap', while the last two should capture only one group: 'Precision Speed'.
Right now I have ^(?:Remove - )?(.+)(?:\s*-\s*)(.*)
and it is working correctly for the first two (because there is a second dash in there I believe). For the 3rd one it is capturing 'Remove' and 'Precision Speed' and for the 4th one it isn't capturing anything.
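The behaviour described above can be reproduced with a short Python sketch. It assumes the quantifier lost from the pattern was `+` (that is, `(.+)`), and the helper name `capture` is only illustrative. Because the dash separator is mandatory, the regex engine gives up on the optional `Remove - ` prefix in the third line and treats its dash as the separator instead, while the fourth line has no dash at all and so cannot match.

```python
import re

# Pattern from the question, assuming the missing quantifier was `+`.
PATTERN = re.compile(r"^(?:Remove - )?(.+)(?:\s*-\s*)(.*)")

samples = [
    "Remove - Precision Speed - Recap",
    "Precision Speed - Recap",
    "Remove - Precision Speed",
    "Precision Speed",
]

def capture(line):
    """Return the captured groups for a line, or None if the pattern fails."""
    m = PATTERN.match(line)
    return m.groups() if m else None

# Map each sample line to whatever the pattern captured.
results = {line: capture(line) for line in samples}

for line, groups in results.items():
    print(f"{line!r} -> {groups}")
```

Note that group 1 also keeps the space before the dash, since `.+` is greedy and `\s*` is happy to match nothing.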
CodePudding user response:
You may use the following pattern:
^(?:Remove - )?([^-]+)(?: - ([^-]+))?$
And if you're dealing with a multiline text, simply add \r\n to the negated character class to avoid matches across multiple lines:
^(?:Remove - )?([^-\r\n]+)(?: - ([^-\r\n]+))?$
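As a rough check of the multiline variant, here is a minimal Python sketch run against the four sample strings from the question. The use of `re.MULTILINE` is an assumption about how the pattern would be applied to a block of text, since it is what makes `^` and `$` anchor at each line; the answer itself does not name a regex flavour.

```python
import re

# Multiline variant from the answer; re.MULTILINE lets ^ and $ anchor per line.
PATTERN = re.compile(
    r"^(?:Remove - )?([^-\r\n]+)(?: - ([^-\r\n]+))?$",
    re.MULTILINE,
)

text = "\n".join([
    "Remove - Precision Speed - Recap",
    "Precision Speed - Recap",
    "Remove - Precision Speed",
    "Precision Speed",
])

# One (group 1, group 2) tuple per line; group 2 is None when there is no second part.
matches = [m.groups() for m in PATTERN.finditer(text)]

for groups in matches:
    print(groups)
```

In Python an optional group that did not participate in the match is reported as `None`, so the last two lines yield a single populated group.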
CodePudding user response:
Make the second - and surrounding whitespace optional.
^(?:Remove - )?([^-]+)(?:\s*-\s*)?(.*)
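A small Python sketch of this pattern follows, with two caveats that show up when it is run: group 1 can keep the space before the dash (because `[^-]+` also matches whitespace), and group 2 is always present but may be an empty string. The wrapper `split_title` is a hypothetical helper, not part of the answer, that trims both groups and turns an empty second group into `None`.

```python
import re

# Pattern from the answer: the dash separator is optional, group 2 may be empty.
PATTERN = re.compile(r"^(?:Remove - )?([^-]+)(?:\s*-\s*)?(.*)")

def split_title(line):
    """Hypothetical helper: return (first, second_or_None) with whitespace trimmed."""
    m = PATTERN.match(line)
    if m is None:
        return None
    first, second = (g.strip() for g in m.groups())
    # An empty second group means the line had no second part.
    return first, (second or None)

for line in [
    "Remove - Precision Speed - Recap",
    "Precision Speed - Recap",
    "Remove - Precision Speed",
    "Precision Speed",
]:
    print(f"{line!r} -> {split_title(line)}")
```

If the trailing space or the empty second group matters, post-processing along these lines (or the stricter pattern with the literal ` - ` separator) is one way to handle it.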