Regex needs to capture two patterns in the same group
Sample data:
mixexecutor:check_atom_exists:740 - requested to check this machine : **ET_colBackDDW_Temp**
output_of_reports/PII/36478_**ABP_BAL_liquidpressure**-**20210831-123456**-**20210831-172355**.bat.yz
Both values belong to the same column, and I need to identify the highlighted values.
Expected output:
**ET_colBackDDW_Temp** --> group 1
**ABP_BAL_liquidpressure** --> group 1
20210831-123456 --> group 2
20210831-172355 --> group 3
I tried the pattern below. The regex should not depend on the specific words in the text.
^.+:(.+)$|(?:[0-9]{4,5}_|-)((?:[a-zA-Z0-9_]*)(?:-[0-9]{1,7})?)-([0-9]{8}-[0-9]{6})-([0-9]{8}-[0-9]{6})
With this regex, the two values end up in different groups. I am using PySpark.
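The different group numbers come from how capture groups are counted: they are numbered left to right across the whole pattern, so each branch of an alternation gets its own indices. The sketch below uses Python's `re` module and a simplified stand-in for the alternation (not the exact pattern from the question) to show the effect on the two sample lines.

```python
import re

# Simplified, hypothetical stand-in for the alternation in the question.
# Branch 1 owns group 1; branch 2 owns groups 2, 3 and 4.
alternation = re.compile(
    r"^.+:\s*(.+)$"
    r"|_([A-Z]\w*)-([0-9]{8}-[0-9]{6})-([0-9]{8}-[0-9]{6})"
)

line1 = "mixexecutor:check_atom_exists:740 - requested to check this machine : ET_colBackDDW_Temp"
line2 = "output_of_reports/PII/36478_ABP_BAL_liquidpressure-20210831-123456-20210831-172355.bat.yz"

# Groups belonging to the branch that did not match are None.
g1 = alternation.search(line1).groups()
g2 = alternation.search(line2).groups()

print(g1)  # name lands in group 1
print(g2)  # name lands in group 2, timestamps in groups 3 and 4
```

Because the name sits in group 1 for one line and group 2 for the other, a single group index cannot pick it out for both inputs.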
CodePudding user response:
To get the values in 1 or 3 groups using a single pattern, you might use:
^.*?([A-Z]\w*_\w+)(?:-([0-9]{8}-[0-9]{6})-([0-9]{8}-[0-9]{6}))?
The pattern matches:
^ --> Start of string
.*? --> Match as few chars as possible
( --> Capture group 1
[A-Z]\w*_ --> Match A-Z, optional word chars, and _
\w+ --> Match 1 or more word chars
) --> Close group 1
(?: --> Non-capture group
- --> Match - literally
([0-9]{8}-[0-9]{6}) --> Capture group 2
- --> Match -
([0-9]{8}-[0-9]{6}) --> Capture group 3
)? --> Close the non-capture group and make it optional
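As a quick check, here is a minimal sketch that runs the pattern against the two sample lines using Python's `re` module. It assumes the quantifier after the final `\w` is `+` (one or more word chars) and that the `**` markers in the question are highlighting only, not part of the data. The `extract` helper is a hypothetical name used for illustration.

```python
import re

# Pattern from the answer, assuming "\w+" for the last part of group 1.
PATTERN = re.compile(
    r"^.*?([A-Z]\w*_\w+)(?:-([0-9]{8}-[0-9]{6})-([0-9]{8}-[0-9]{6}))?"
)

samples = [
    "mixexecutor:check_atom_exists:740 - requested to check this machine : ET_colBackDDW_Temp",
    "output_of_reports/PII/36478_ABP_BAL_liquidpressure-20210831-123456-20210831-172355.bat.yz",
]


def extract(line):
    """Return (group 1, group 2, group 3); unmatched optional groups are None."""
    m = PATTERN.match(line)
    return m.groups() if m else (None, None, None)


results = [extract(s) for s in samples]
for r in results:
    print(r)
# ('ET_colBackDDW_Temp', None, None)
# ('ABP_BAL_liquidpressure', '20210831-123456', '20210831-172355')
```

In PySpark, `pyspark.sql.functions.regexp_extract(column, pattern, idx)` should accept the same pattern, since it relies only on constructs that Java regex also supports. Note that `regexp_extract` returns an empty string rather than null when a group does not participate in the match.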