I need to write a regex to match some string formats. I'm not very good with regex, which is why I'm asking for your help.
I receive a set of strings that start with the following format:
Examples: "1ASF1-1-A42-A_4-214A-GarbageText", "DI-21f-112rf-A-214_124_12412A_GarbageText", "312c_12412_1241-12rf-001-GarbageText"
An alphanumeric run of 1 to 20 characters followed by - or _, repeated any number of times (I can't know how many), and then some garbage text.
How can I write a regex to check whether a string starts with the pattern I want? I think it would be something like:
[a-zA-Z0-9]{1,20}[_-]+
CodePudding user response:
What you've written means the string must end in one or more - or _ characters, rather than having just one of them followed by the "garbage text".
You need to group the repeated pattern.
Then, if "garbage text" follows, you can match it with .*
([a-zA-Z0-9]{1,20}[_-])+.*
But this will also match spaces and any other symbols rather than just [a-zA-Z0-9], so \w might be "safer" (note that \w also matches _).
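As a rough illustration of the grouping idea, here is a minimal Python sketch. It assumes the group is repeated with + (one or more "segment plus separator" units) and uses a non-capturing group; the helper name matched_prefix is made up for this example. It only checks the start of the string and returns the part that fits the pattern.

```python
import re

# Assumption: the grouped unit "1-20 alphanumerics + one separator"
# repeats one or more times at the start of the string.
PREFIX = re.compile(r"(?:[a-zA-Z0-9]{1,20}[_-])+")


def matched_prefix(text):
    """Return the leading part of text that fits the pattern, or None."""
    m = PREFIX.match(text)  # match() only tries at position 0
    return m.group(0) if m else None


samples = [
    "1ASF1-1-A42-A_4-214A-GarbageText",
    "DI-21f-112rf-A-214_124_12412A_GarbageText",
    "312c_12412_1241-12rf-001-GarbageText",
]
for s in samples:
    print(s, "->", matched_prefix(s))
```

Keep in mind that if the garbage text itself contains - or _ after an alphanumeric run, the matched prefix will extend into it, because the regex cannot tell the two apart.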
CodePudding user response:
If you want to match the whole string, you can start with a run of 1-20 alphanumeric characters, and then optionally repeat - or _ followed by another run of 1-20 characters.
[a-zA-Z0-9]{1,20}(?:[-_][a-zA-Z0-9]{1,20})*
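To try the whole-string variant, something like the following Python sketch could be used. It relies on re.fullmatch so that no explicit anchors are needed; the function name is_well_formed is only a placeholder for this example.

```python
import re

# Segments of 1-20 alphanumerics joined by single - or _ separators.
WHOLE = re.compile(r"[a-zA-Z0-9]{1,20}(?:[-_][a-zA-Z0-9]{1,20})*")


def is_well_formed(text):
    """True if the entire string consists of separator-joined segments."""
    return WHOLE.fullmatch(text) is not None


print(is_well_formed("312c_12412_1241-12rf-001"))  # single separators only
print(is_well_formed("312c__12412"))               # consecutive separators
print(is_well_formed("abc-"))                      # trailing separator
print(is_well_formed("a" * 21))                    # segment longer than 20
```

Note that a trailing word such as GarbageText is itself a valid segment, so the example strings from the question match in full with this pattern.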
If you want to match up to the last occurrence of - or _, assuming there are no consecutive dashes or underscores such as --:
[a-zA-Z0-9]{1,20}(?:[-_][a-zA-Z0-9]{1,20})*[_-]
If they can also be mixed:
[a-zA-Z0-9](?:[-_a-zA-Z0-9]*[-_])
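The difference between the last two patterns is easiest to see side by side. The Python sketch below is only an illustration; the names STRICT, MIXED and prefix are made up here. It applies both patterns from the start of the string and returns what each one consumes.

```python
import re

# Single separators only, each segment 1-20 alphanumerics.
STRICT = re.compile(r"[a-zA-Z0-9]{1,20}(?:[-_][a-zA-Z0-9]{1,20})*[_-]")
# Separators may be mixed or consecutive; no per-segment length limit.
MIXED = re.compile(r"[a-zA-Z0-9](?:[-_a-zA-Z0-9]*[-_])")


def prefix(pattern, text):
    """Return what pattern matches at the start of text, or None."""
    m = pattern.match(text)
    return m.group(0) if m else None


for s in ["DI-21f-112rf-A-214_124_12412A_GarbageText", "ab--cd-rest"]:
    print(s)
    print("  strict:", prefix(STRICT, s))
    print("  mixed: ", prefix(MIXED, s))
```

On a string with a doubled separator such as ab--cd-rest, the strict pattern stops at the first separator, while the mixed one runs up to the last separator. The mixed pattern also no longer enforces the 1-20 character limit per segment.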