I'm trying to write a regular expression to capture the run times of a Spring Batch job, but I am stuck. Below are a few examples of what the logged run time of a Spring Batch job could look like. I have also included what I currently have, but my regex gets confused when the job runs for over a minute. Any help here would be appreciated. The end result I'm trying to get to is a panel in Splunk that shows the average daily run time.
Also, before anyone asks, I have been using regex101 for the past couple of days and am still not getting good results. I figured the Stack community might be able to help!
Possible job time formats:
709ms
59s709ms
1m59s709ms
My current query only works for the first two examples above. Also, is there a way to write an expression where I don't need to put a number at the end of each capture group name?
Current Regex Query:
(?<jobRunTimeMs1>\d*)ms?|(?<jobRunTimeS2>\d*)s?(?<jobRunTimeMs2>\d*)ms?|(?<jobRunTimeM3>\d*)m?(?<jobRunTimeS3>\d*)s?(?<jobRunTimeMs3>\d*)ms?
CodePudding user response:
If you pull the extra question marks from your regex, it runs as expected:
| rex field=_raw "(?<jobRunTimeMs1>\d+)ms|(?<jobRunTimeS2>\d+)s(?<jobRunTimeMs2>\d+)ms|(?<jobRunTimeM3>\d+)m(?<jobRunTimeS3>\d+)s(?<jobRunTimeMs3>\d+)ms"
Append a couple of coalesce calls to bring them together, and drop the extraneous fields with the fields command:
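| eval ms=coalesce(jobRunTimeMs1,jobRunTimeMs2,jobRunTimeMs3), s=coalesce(jobRunTimeS2,jobRunTimeS3), m=jobRunTimeM3
| fields - jobRunTimeMs1 jobRunTimeMs2 jobRunTimeMs3 jobRunTimeS2 jobRunTimeS3 jobRunTimeM3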
However, I generally prefer to run sequential individual extractions (especially when the format may vary, as yours does), both for readability and to avoid needing the coalesce step afterwards:
| rex field=_raw "(?<minutes>\d+)m\d"
| rex field=_raw "m?(?<seconds>\d+)s"
| rex field=_raw "s?(?<milliseconds>\d+)ms"
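The same idea can be tried outside Splunk. The following is only a sketch and not part of the original answer: it assumes Python's re module rather than SPL, and extract_parts is a hypothetical helper name. Each unit gets its own small pattern, and a part that is absent is simply left as None, so no coalesce step is needed. On real log lines, other digit-then-letter sequences could also match, so the patterns may need more context around them.

```python
import re

# One small pattern per unit, mirroring the three rex commands above.
# Python spells named groups (?P<name>...) instead of (?<name>...).
PATTERNS = {
    "minutes": re.compile(r"(?P<minutes>\d+)m\d"),
    "seconds": re.compile(r"(?P<seconds>\d+)s"),
    "milliseconds": re.compile(r"(?P<milliseconds>\d+)ms"),
}


def extract_parts(text):
    """Run each pattern independently; missing units stay None."""
    parts = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        parts[name] = int(match.group(name)) if match else None
    return parts


for sample in ("709ms", "59s709ms", "1m59s709ms"):
    print(sample, extract_parts(sample))
```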
CodePudding user response:
I think you need
(?:(?<jobRunTimeM3>\d+)m)?(?:(?<jobRunTimeS2>\d+)s)?(?<jobRunTimeMs2>\d+)ms?
See the regex demo. Here, the regex matches
(?:(?<jobRunTimeM3>\d+)m)? - an optional sequence of one or more digits captured into the jobRunTimeM3 group, followed by an m char
(?:(?<jobRunTimeS2>\d+)s)? - an optional sequence of one or more digits captured into the jobRunTimeS2 group, followed by an s char
(?<jobRunTimeMs2>\d+) - one or more digits captured into the jobRunTimeMs2 group
ms? - m or ms.
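For anyone who wants to check this kind of pattern outside Splunk, here is a minimal Python sketch that is not part of the original answer. It makes several assumptions: it uses Python's re module (which spells named groups (?P<name>...) rather than (?<name>...)), shortens the group names to m, s and ms, anchors the match with fullmatch, and drops the trailing ? after ms so that a bare m is not accepted as the millisecond unit. The to_millis helper is a hypothetical name; it folds the three parts into a single number of milliseconds, which is the kind of value that could then be averaged per day.

```python
import re

# Same shape as the pattern above, rewritten for Python's named-group syntax.
# Group names (m, s, ms) are shortened for this sketch.
RUNTIME = re.compile(r"(?:(?P<m>\d+)m)?(?:(?P<s>\d+)s)?(?P<ms>\d+)ms")


def to_millis(text):
    """Return the run time in milliseconds, or None if text is not a run time."""
    match = RUNTIME.fullmatch(text)
    if match is None:
        return None
    # Optional groups that did not take part in the match are None; treat as 0.
    minutes = int(match.group("m") or 0)
    seconds = int(match.group("s") or 0)
    millis = int(match.group("ms"))
    return minutes * 60_000 + seconds * 1_000 + millis


for sample in ("709ms", "59s709ms", "1m59s709ms"):
    print(sample, to_millis(sample))
```

Because the minute and second groups are optional, one pattern covers all three formats and no numbered group names are needed.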