Repeated regex pattern

Time:10-19

I need to write a regex to match some String formats. I'm not very good with regex, which is why I'm asking for your help.

I receive a set of Strings that start with the following format:

Examples: "1ASF1-1-A42-A_4-214A-GarbageText", "DI-21f-112rf-A-214_124_12412A_GarbageText", "312c_12412_1241-12rf-001-GarbageText"

An alphanumeric segment of 1 to 20 characters followed by a separator (- or _), repeated any number of times (I can't know in advance how many repetitions), and then some garbage text.

How can I write a regex to check whether the String starts with the pattern I want? I think it would be something like:

[a-zA-Z0-9]{1,20}[_-]+

CodePudding user response:

What you've written means a single alphanumeric run must be followed by one or more - or _ characters, rather than exactly one separator per run, and it does not account for the "garbage text".

You need to group the repeated pattern.

Then, if you have "garbage text" after it, you can use .+ to match it:

([a-zA-Z0-9]{1,20}[_-])+.+

But the dot will match spaces and any other symbol rather than just [a-zA-Z0-9], so \w might be "safer" (note that \w also matches the underscore).
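As a rough check of the grouped approach, here is a minimal Python sketch. It assumes the group is repeated with + and the remainder is matched with .+; the name GROUPED is made up for illustration, and the sample strings are the ones from the question.

```python
import re

# Assumed form of the grouped pattern: one or more "segment + separator"
# groups, followed by at least one arbitrary character (the garbage text).
GROUPED = re.compile(r"([a-zA-Z0-9]{1,20}[_-])+.+")

samples = [
    "1ASF1-1-A42-A_4-214A-GarbageText",
    "DI-21f-112rf-A-214_124_12412A_GarbageText",
    "312c_12412_1241-12rf-001-GarbageText",
]

for s in samples:
    # re.match only anchors at the start of the string, which is what
    # "starts with the pattern" needs.
    print(s, bool(GROUPED.match(s)))

# A string without any separator does not start with the pattern.
print(bool(GROUPED.match("GarbageText")))
```

Because re.match anchors only at the start, no leading ^ is needed here; in an engine that searches anywhere in the string, the pattern would likely need a ^ in front.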

CodePudding user response:

If you want to match the whole string, you can start with a run of 1-20 alphanumeric characters, and then optionally repeat - or _ followed by another run of 1-20 characters.

[a-zA-Z0-9]{1,20}(?:[-_][a-zA-Z0-9]{1,20})*
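A small Python sketch of how this pattern behaves when applied to the whole string. The helper name is_valid is hypothetical, and using fullmatch is an assumption about what "match the whole string" should mean here.

```python
import re

# Runs of 1-20 alphanumerics, joined by single - or _ separators.
WHOLE = re.compile(r"[a-zA-Z0-9]{1,20}(?:[-_][a-zA-Z0-9]{1,20})*")

def is_valid(s: str) -> bool:
    """Return True if the entire string consists of separated runs."""
    return WHOLE.fullmatch(s) is not None

print(is_valid("1ASF1-1-A42-A_4-214A-GarbageText"))  # every run is 1-20 chars
print(is_valid("AB--12"))    # consecutive separators are rejected
print(is_valid("A" * 21))    # a run longer than 20 characters is rejected
print(is_valid("AB-12 "))    # a trailing space is not part of the pattern
```

Note that "GarbageText" is itself a valid alphanumeric run, so this pattern cannot tell the garbage apart from the structured prefix on its own.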


If you want the match to extend to the last occurrence of - or _, assuming there are no consecutive dashes or underscores such as --:

[a-zA-Z0-9]{1,20}(?:[-_][a-zA-Z0-9]{1,20})*[_-]
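To see where this variant stops, here is a hedged Python sketch that splits a string into the matched prefix and whatever follows. The function name split_prefix is invented for this example.

```python
import re
from typing import Optional, Tuple

# Same runs as before, but the match must end on a separator.
PREFIX = re.compile(r"[a-zA-Z0-9]{1,20}(?:[-_][a-zA-Z0-9]{1,20})*[_-]")

def split_prefix(s: str) -> Optional[Tuple[str, str]]:
    """Return (prefix, rest) if s starts with the pattern, else None."""
    m = PREFIX.match(s)
    if m is None:
        return None
    return m.group(), s[m.end():]

print(split_prefix("1ASF1-1-A42-A_4-214A-GarbageText"))
print(split_prefix("312c_12412_1241-12rf-001-GarbageText"))
print(split_prefix("NoSeparatorHere"))
```

The regex is greedy, so the prefix runs up to the last separator that is still preceded by valid runs. If the garbage text itself contains - or _ between alphanumeric runs, those would be absorbed into the prefix as well.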


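If dashes and underscores can also be mixed or appear consecutively (note that this pattern no longer enforces the 1-20 length limit per run):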

[a-zA-Z0-9](?:[-_a-zA-Z0-9]*[-_])
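A brief Python sketch of this looser variant, showing that it accepts consecutive separators and runs of any length. The name MIXED is just a label for this example.

```python
import re

# One alphanumeric, then any mix of alphanumerics and separators,
# ending on a separator.
MIXED = re.compile(r"[a-zA-Z0-9](?:[-_a-zA-Z0-9]*[-_])")

m = MIXED.match("AB--12_x rest")
print(m.group() if m else None)   # stops at the last separator before the space

long_run = "A" * 30 + "-x"
m = MIXED.match(long_run)
print(len(m.group()) if m else None)  # the 30-character run is accepted

print(MIXED.match("abc"))         # no separator at all, so no match
```

If the 1-20 limit matters, this pattern alone does not check it, so a separate length check on each run would be needed.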

