Optional components in Java regex

Time:09-21

I've been writing regexes for a long time, and I have always covered the scenarios below with two separate regexes because I don't know whether a single one can handle them. Is there a way to write a single regex that captures both in one shot?

Suppose we have a standard in which each message starts with A and ends with Z, the field delimiter is a pipe |, and each field consists of components delimited by a caret ^.

  • Input1: A|1|1^^3^4^5|loongText|Z
  • Input2: A|13|^2^|loongText|Z

The regex should give the output below:

  • Output1 : captured groups 1,,3,4,5
  • Output2 : captured groups ,2,,,

My attempt: A\|.\d*\|(.*)\^(.*)\^(.*)\^(.*)\^(.*?)\|.+?\|Z works for the first input but not the second.
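A minimal sketch of why this attempt fails, assuming the text field before the final |Z is matched with .+? and that the whole input must match. The pattern contains four mandatory \^ separators, so any input with fewer than four carets cannot match. The class name FixedCaretAttempt and the helper matches are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;

class FixedCaretAttempt {
    // The attempt from the question: four carets are required, none are optional.
    // Assumption: the text field before "|Z" is matched by ".+?".
    static final Pattern ATTEMPT = Pattern.compile(
        "A\\|.\\d*\\|(.*)\\^(.*)\\^(.*)\\^(.*)\\^(.*?)\\|.+?\\|Z");

    // True only if the entire input matches the pattern.
    static boolean matches(String input) {
        return ATTEMPT.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Input1 has exactly four carets, so it matches.
        System.out.println(matches("A|1|1^^3^4^5|loongText|Z")); // true
        // Input2 has only two carets, so the match fails.
        System.out.println(matches("A|13|^2^|loongText|Z"));     // false
    }
}
```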

What regex matches both inputs and returns the groups in the correct order?

[UPDATE] Group order is important. For input 1, group 1 should be 1; group 2 should be empty for input 1 and 2 for input 2. Each position has a different meaning in the standard.

  • Input3: A|13|1^2^3|loongText|Z
  • Expected output: {"group1": 1, "group2": 2, "group3": 3}, so having each capture in the right group is also important.

CodePudding user response:

I'm sharing this on behalf of @MikeM, who originally answered this question.

A\|\d*\|(?:(\d*)\^?)?(?:(\d*)\^?)?(?:(\d*)\^?)?(?:(\d*)\^?)?(?:(\d*))\|.+?\|Z

This regex matches all 3 inputs and puts each capture in the right group. Thanks.
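As a rough check, the pattern can be run against the three inputs from the question. This sketch assumes the text field before the final |Z is matched with .+? and that the whole input must match. The class name OptionalComponents and the helper components are hypothetical, and mapping an unset group to an empty string is a choice made here, not part of the answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalComponents {
    // Each component is \d* followed by an optional caret, so missing
    // components still occupy their own capture group.
    // Assumption: the text field before "|Z" is matched by ".+?".
    static final Pattern FIELD = Pattern.compile(
        "A\\|\\d*\\|(?:(\\d*)\\^?)?(?:(\\d*)\\^?)?(?:(\\d*)\\^?)?(?:(\\d*)\\^?)?(?:(\\d*))\\|.+?\\|Z");

    // Returns the five components in positional order,
    // or an empty list if the input does not match at all.
    static List<String> components(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = FIELD.matcher(input);
        if (!m.matches()) {
            return out;
        }
        for (int i = 1; i <= m.groupCount(); i++) {
            String g = m.group(i);
            out.add(g == null ? "" : g); // treat an unset group as empty
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", components("A|1|1^^3^4^5|loongText|Z"))); // 1,,3,4,5
        System.out.println(String.join(",", components("A|13|^2^|loongText|Z")));     // ,2,,,
        System.out.println(String.join(",", components("A|13|1^2^3|loongText|Z")));   // 1,2,3,,
    }
}
```

Note that the components are restricted to digits (\d*); a standard that allows other characters in a component would need a different character class, such as [^|^]*.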
