I am trying to extract parameter definitions from a Jenkins script and can't work out an appropriate regex (I'm working in Dyalog APL, which supports PCRE8).
Here's what the subject looks like:
pipeline {
    agent none
    parameters {
        string(name: 'foo', defaultValue: 'bar')
        string(name: 'goo', defaultValue: 'hoo')
    }
    stages {
        stage('action') {
            steps {
                echo "foo = ${params.foo}"
            }
        }
    }
}
I would like to get the individual param definitions captured in group 1 (in other words, I'm looking for a result that reports two matches: string(name: 'foo', defaultValue: 'bar') and string(name: 'goo', defaultValue: 'hoo')), but the matches are either too long or too short (depending on greediness).
My regex:
parameters\s*{(\s*\D*\(.*\)\s*)*}
(dot matches nl)
Parameter types may vary, so my best idea was to use \D* for those (any number of non-digits). I suspect that this captures more than I expected, but replacing it with \w did not help.
An alternative idea was
parameters\s*{(\s*(\w*)\(([^\)]*)\))*\s*}
which seemed more precise with respect to matching parameter types and also the content of the parens, but surprisingly that returned goo only and skipped foo.
What am I missing?
Answer:
Using PCRE, you can use this regex in MULTILINE mode:
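(?m)(?:^\h*parameters\h*{|(?!^)\G).*\R\h*\w+\(\w+:\h*'\K[^']+
RegEx Details:
(?m) : Enable MULTILINE mode
(?: : Start non-capture group
^\h*parameters\h*{ : Match a line that starts with parameters {
| : OR
(?!^)\G : \G asserts the position at the end of the previous match (or at the start of the string for the first match); (?!^) rules out the start of a line, so this branch only continues from a previous match
) : End non-capture group
.* : Match anything up to the end of the line
\R : Match a line break
\h* : Match 0 or more horizontal whitespace characters
\w+ : Match 1+ word chars
\( : Match a (
\w+ : Match 1+ word chars
: : Match a :
\h* : Match 0 or more horizontal whitespace characters
' : Match a '
\K : Reset all the matched info
[^']+ : Match 1+ of any char that is not ' (this is our parameter name)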
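As for why the second pattern in the question reported goo only: a capturing group placed under a quantifier keeps just the text of its last iteration, so every earlier definition is overwritten. The sketch below reproduces that outside APL and shows one possible way to get the whole definitions instead. It uses Python's re module purely for illustration; re does not support \G or \K, so the sketch takes two passes rather than porting the regex above. The names BLOCK, DEFINITION and extract_definitions are hypothetical, and it assumes the parameters block contains no nested braces and the definitions no nested parentheses.

```python
import re

# Sample input copied from the question.
JENKINSFILE = """\
pipeline {
    agent none
    parameters {
        string(name: 'foo', defaultValue: 'bar')
        string(name: 'goo', defaultValue: 'hoo')
    }
    stages {
        stage('action') {
            steps {
                echo "foo = ${params.foo}"
            }
        }
    }
}
"""

# The question's second pattern: the capturing groups sit under a `*` quantifier.
REPEATED = re.compile(r"parameters\s*{(\s*(\w*)\(([^\)]*)\))*\s*}")

# Pass 1 (assumption: no nested braces): isolate the body of the parameters block.
BLOCK = re.compile(r"parameters\s*\{([^{}]*)\}")
# Pass 2 (assumption: no nested parentheses): one definition per match.
DEFINITION = re.compile(r"\w+\([^()]*\)")


def extract_definitions(text):
    """Return every parameter definition found in the first parameters block."""
    block = BLOCK.search(text)
    if block is None:
        return []
    return DEFINITION.findall(block.group(1))


if __name__ == "__main__":
    m = REPEATED.search(JENKINSFILE)
    # Only the last iteration of the repeated group is retained.
    print(m.group(3))
    # Two passes yield one entry per definition.
    print(extract_definitions(JENKINSFILE))
```

The same two-step idea should carry over to any PCRE host: match the block once, then search its body for the individual definitions, instead of relying on a repeated group to return each one.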