Let's say I have a messy data pattern:
- TRND-0LL2134.SV
- TRN_RTXDFT.SV
- TRND_ZRSFTFF.SV
- DR3345.SV
I am trying to use a regex to split each string into its prefix, its separator ("-" or "_"), and the remainder, while also handling strings that have no prefix or separator.
Desired output:
column1      column2      column3
===========  ===========  ===========
TRND         -            0LL2134.SV
TRN          _            RTXDFT.SV
TRND         _            ZRSFTFF.SV
<blank>      <blank>      DR3345.SV
So far I have used
\-|\_
to match prefixes followed by "-" or "_", but I am having trouble handling strings that have no prefix ending in "-" or "_".
CodePudding user response:
Use [-_]?
to match the optional separator.
Use a non-greedy quantifier for the first word so that if there's no separator, it will be blank and the word will be associated with column 3.
^([A-Z0-9]*?)([-_]?)([A-Z0-9]+\.SV)$
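As a rough illustration, here is how that pattern could be applied in Python. The question does not name a language, so Python and the helper name `split_columns` are assumptions made for this sketch only.

```python
import re

# Lazy prefix, optional separator, then the remainder ending in ".SV".
PATTERN = re.compile(r"^([A-Z0-9]*?)([-_]?)([A-Z0-9]+\.SV)$")

def split_columns(name):
    """Return (column1, column2, column3), or None if the name does not match.

    split_columns is a hypothetical helper, not part of any library.
    """
    m = PATTERN.match(name)
    return m.groups() if m else None

for name in ["TRND-0LL2134.SV", "TRN_RTXDFT.SV", "TRND_ZRSFTFF.SV", "DR3345.SV"]:
    print(name, "->", split_columns(name))
```

Because the first group is non-greedy and the separator is optional, a name without a separator leaves groups 1 and 2 as empty strings rather than failing to match.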
CodePudding user response:
Use a ?
to make a group optional
^(([A-Z]+)[-_])?(\w+\.SV)$
If you want to match exactly the character sequence TRN with an optional D,
then:
^((TRND?)[-_])?(\w+\.SV)$
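A minimal sketch of the optional-group approach, assuming Python (the question does not specify an environment) and with the dot before SV escaped. In this pattern the separator is not captured on its own, so the hypothetical helper `split_optional` recovers it from the difference between groups 1 and 2.

```python
import re

# Group 1 (prefix plus separator) is optional; group 2 is the prefix alone.
PATTERN = re.compile(r"^(([A-Z]+)[-_])?(\w+\.SV)$")

def split_optional(name):
    """Return (prefix, separator, rest), or None if the name does not match.

    split_optional is a hypothetical helper written for this example.
    """
    m = PATTERN.match(name)
    if m is None:
        return None
    whole, prefix, rest = m.groups()
    # Python reports None for a group that did not take part in the match.
    separator = whole[len(prefix):] if whole else ""
    return (prefix or "", separator, rest)

for name in ["TRND-0LL2134.SV", "TRN_RTXDFT.SV", "DR3345.SV"]:
    print(name, "->", split_optional(name))
```

An alternative design would be to wrap `[-_]` in its own capture group, which would remove the need for the slicing step.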
CodePudding user response:
Your matching expression should contain 3 groups for data to be fed into each of the columns of your output and be anchored at the beginning of your test strings.
Specifically, try ...
/^((.*)([_-])(.*))|^([^_-]+)$/
Capture groups #2, #3, and #4 will contain the data for your columns 1, 2, and 3, respectively. If the test string contains neither separator character, the content of column 3 will be in capture group #5.
The way you've described your problem, this pattern will always match; the only difference is in how the test string is distributed into the column slots. If valid test strings need to end in .SV, use
/^((.*)([_-])(.*[.]SV))$|^([^_-]+[.]SV)$/
As you haven't indicated which programming environment you are working in, this answer cannot specify how to employ the regex.
Online examples are available on Regex101.com for Variant 1 and Variant 2.
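Purely for illustration, and assuming Python as the environment (which the question does not state), the second variant could be used as sketched here. The function name `split_alternation` is made up for this example; the surrounding slashes are dropped because Python takes the pattern as a plain string.

```python
import re

# First alternative: prefix, separator, remainder. Second: no separator at all.
PATTERN = re.compile(r"^((.*)([_-])(.*[.]SV))$|^([^_-]+[.]SV)$")

def split_alternation(name):
    """Return (column1, column2, column3), or None if the name does not match.

    split_alternation is a hypothetical helper, not part of any library.
    """
    m = PATTERN.match(name)
    if m is None:
        return None
    if m.group(3) is not None:
        # A separator was found: groups 2, 3, 4 hold the three columns.
        return (m.group(2), m.group(3), m.group(4))
    # No separator: the whole name is in group 5 and goes to column 3.
    return ("", "", m.group(5))

for name in ["TRND-0LL2134.SV", "TRN_RTXDFT.SV", "DR3345.SV"]:
    print(name, "->", split_alternation(name))
```

Note that the leading `(.*)` is greedy, so a name containing several separators is split at the last one.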