I have the following regular expression that extracts everything after the first two letters:
^([A-Za-z]{2})(\w+)($) $2
Now I want to extract nothing if the data doesn't start with letters.
Example:
AA123 -> 123
123 -> ""
Can this be accomplished by regex?
CodePudding user response:
Introduce an alternative to match any one or more chars from start to end of string if your regex does not match:
^(?:([A-Za-z]{2})(\w+)|.+)$
See the regex demo. Details:
^ - start of string
(?: - start of a container non-capturing group:
([A-Za-z]{2})(\w+) - Group 1: two ASCII letters, Group 2: one or more word chars
| - or
.+ - one or more chars other than line break chars, as many as possible (use [\w\W] in place of . to match any chars including line break chars)
) - end of a container non-capturing group
$ - end of string.
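As a quick check of the alternation approach, here is a minimal sketch in Python. The thread does not say which regex flavor is in use, so Python's `re` module is an assumption, and `strip_prefix` is a hypothetical helper name. The replacement `\2` plays the role of `$2`; in Python 3.5 and later, a group that did not take part in the match expands to an empty string.

```python
import re

# Two ASCII letters followed by word chars, or else any non-empty line.
PATTERN = re.compile(r"^(?:([A-Za-z]{2})(\w+)|.+)$")

def strip_prefix(text: str) -> str:
    """Return what follows two leading ASCII letters, or "" otherwise."""
    # Group 2 is only set when the first alternative matched; an unset
    # group is replaced with "" (Python 3.5+).
    return PATTERN.sub(r"\2", text)

for sample in ("AA123", "123"):
    print(repr(sample), "->", repr(strip_prefix(sample)))
```

Because the second alternative consumes the whole line, the replacement always rewrites the full input rather than leaving non-matching text in place.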
CodePudding user response:
Your pattern already captures 1 or more word characters after matching 2 ASCII letters. The $
does not have to be in a group, and the $2
should not be in the pattern.
^([A-Za-z]{2})(\w+)$
See a regex demo.
Another option could be a pattern with a conditional, capturing data in group 2 only if group 1 exists.
^([A-Z]{2})?(?(1)(\w+)|.+)$
^ - Start of string
([A-Z]{2})? - Capture 2 uppercase chars in optional group 1
(? - Conditional
(1)(\w+) - If we have group 1, capture 1+ word chars in group 2
| - Or
.+ - Match the whole line with at least 1 char to not match an empty string
) - Close conditional
$ - End of string
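The conditional variant can be tried in Python as well, since its `re` module supports the `(?(1)yes|no)` syntax. This is only a sketch under that assumption about the flavor; `tail_after_letters` is a hypothetical name, and the pattern is the one from this answer, so it accepts uppercase letters only.

```python
import re

# If optional group 1 (two uppercase letters) matched, capture the word
# chars in group 2; otherwise consume the whole line with .+
PATTERN = re.compile(r"^([A-Z]{2})?(?(1)(\w+)|.+)$")

def tail_after_letters(text: str) -> str:
    """Return group 2 when it was captured, otherwise an empty string."""
    match = PATTERN.match(text)
    if match is None:
        return ""  # e.g. an empty input, which .+ cannot match
    return match.group(2) or ""

for sample in ("AA123", "123"):
    print(repr(sample), "->", repr(tail_after_letters(sample)))
```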
For a match only, you could use other variations, such as \K
like ^[A-Za-z]{2}\K\w+$ or a lookbehind assertion (?<=^[A-Za-z]{2})\w+$
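For the match-only variants, here is a small sketch of the lookbehind form in Python. Python's built-in `re` module does not support `\K`, so only the lookbehind is shown; the choice of Python and the name `match_only` are assumptions for illustration.

```python
import re
from typing import Optional

# The lookbehind asserts two ASCII letters at the start of the string
# without including them in the match itself.
LOOKBEHIND = re.compile(r"(?<=^[A-Za-z]{2})\w+$")

def match_only(text: str) -> Optional[str]:
    """Return the part after two leading letters, or None if no match."""
    found = LOOKBEHIND.search(text)
    return found.group(0) if found else None

for sample in ("AA123", "123"):
    print(repr(sample), "->", repr(match_only(sample)))
```

Unlike the replacement-based patterns, this returns no match at all for input such as 123, so the caller decides how to represent "nothing".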