I need to capture 4 groups from:
John.7200_24.6.txt.gz
Output:
Group1: John
Group2: 7200
Group3: 24
Group4: 6
Here is my regex: ([^.|_|data|gz]+)
It captures a single group with multiple matches. How can I fix it?
CodePudding user response:
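This pattern ([^.|_|data|gz]+)
can be written as ([^._datagz|]+)
because inside a character class | is a literal pipe and data and gz are read as individual letters. It is a negated character class that matches 1+ chars other than the single chars listed.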
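To see why the original pattern yields several matches rather than several groups, here is a small sketch in Python. It assumes the quantifier after each class is `+`; the choice of Python's `re` module and the variable names are only for illustration.

```python
import re

filename = "John.7200_24.6.txt.gz"

# Inside [...], "|" is a literal pipe and "data" / "gz" are just single letters,
# so both classes exclude the same characters: . _ | d a t g z
original = re.findall(r"([^.|_|data|gz]+)", filename)
simplified = re.findall(r"([^._datagz|]+)", filename)

print(original)
print(simplified)
```

Both calls return the same list, one item per match. Note the stray fifth item `x`: it is what remains of `txt` once the excluded letter `t` is skipped.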
You use a single capture group to split on. If you want 4 separate groups, you should create 4 groups and match instead of split:
^(\w+)\.(\d+)_(\d+)\.(\d+)
^
Start of string
(\w+)\.
Capture 1+ word chars in group 1 and match .
(\d+)_
Capture 1+ digits in group 2 and match _
(\d+)\.
Capture 1+ digits in group 3 and match .
(\d+)
Capture 1+ digits in group 4
Or matching the full example string:
^(\w+)\.(\d+)_(\d+)\.(\d+)\.\w+\.gz$
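As a quick check, both patterns can be tried against the example file name. This is a minimal sketch assuming Python's `re` module (the question does not name a language) and assuming each quantifier is `+`; the variable names are hypothetical.

```python
import re

filename = "John.7200_24.6.txt.gz"

# Prefix-only pattern: anchored at the start, ignores whatever follows group 4.
prefix_pattern = re.compile(r"^(\w+)\.(\d+)_(\d+)\.(\d+)")
# Full pattern: additionally requires ".<word chars>.gz" up to the end of the string.
full_pattern = re.compile(r"^(\w+)\.(\d+)_(\d+)\.(\d+)\.\w+\.gz$")

prefix_match = prefix_pattern.match(filename)
full_match = full_pattern.match(filename)

if prefix_match and full_match:
    # groups() returns the four captures as a tuple, in order.
    for number, value in enumerate(prefix_match.groups(), start=1):
        print(f"Group{number}: {value}")
```

The full pattern is the stricter of the two: a name such as `John.7200_24.6.txt`, without the `.gz` suffix, still matches the prefix-only pattern but not the full one.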