I have filenames in the format <pod-name>_<namespace-name>_<container-name>-<dockerid>.log
For example:
pod-name_namespace-name_container-name-7a1d0ed5675bdb365228d43f470fcee20af5c8bea84dd6d886b9bf837a9d358c.log
pod-name_namespace-name-1234567890_container-name-7a1d0ed5675bdb365228d43f470fcee20af5c8bea84dd6d886b9bf837a9d358c.log
These are Kubernetes (k8s) container log files.
The namespace-name may contain a numeric suffix that represents the automation system's run id (github.run_id, a 10-digit number).
I need to parse filenames with regex to extract pod name, namespace name without run id, run id, container name and docker id.
This is the regex, based on the default Fluent Bit Kubernetes parser, that I need to adapt for our use:
(?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)(-(?<run_id>\d{10,}))_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\.log$
https://rubular.com/r/CROBxpHHgX5UZx
The regex above correctly parses filenames whose namespace has a run id, but it fails on a namespace without a run id:
pod-name_namespace-name_container-name-7a1d0ed5675bdb365228d43f470fcee20af5c8bea84dd6d886b9bf837a9d358c.log
https://rubular.com/r/6MSQsnuGzrkVJG
In this case, run_id should be an empty string.
How can I fix the regex so that it matches both cases?
CodePudding user response:
You can use
(?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+?)(-(?<run_id>\d{10,}))?_(?<container_name>.+)-(?<docker_id>[a-z0-9]{64})\.log$
See the regex demo.
The main point is to make two changes in the (?<namespace_name>[^_]+)(-(?<run_id>\d{10,})) part:
- make the [^_]+ pattern lazy by adding a ? after it, so that it matches as few characters other than _ as possible
- make the (-(?<run_id>\d{10,})) part optional by adding a ? quantifier after the group.
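
To check both cases outside Rubular, here is a minimal Python sketch of the lazy namespace plus optional run id approach. It is an adaptation and not the Fluent Bit configuration itself: Python's re module writes named groups as (?P<name>...) instead of (?<name>...), and the names FILENAME_RE and parse_log_filename are made up for this illustration.

```python
import re
from typing import Dict, Optional

# Same pattern as in the answer, with (?<name>...) rewritten as (?P<name>...)
# because Python's re module uses that spelling for named groups.
FILENAME_RE = re.compile(
    r"(?P<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)"
    r"_(?P<namespace_name>[^_]+?)"   # lazy: take as few non-underscore chars as possible
    r"(-(?P<run_id>\d{10,}))?"       # optional: the run id suffix may be absent
    r"_(?P<container_name>.+)"
    r"-(?P<docker_id>[a-z0-9]{64})\.log$"
)


def parse_log_filename(filename: str) -> Optional[Dict[str, str]]:
    """Return the named parts of a log filename, or None if it does not match."""
    match = FILENAME_RE.search(filename)
    if match is None:
        return None
    # Python reports an unmatched optional group as None; default="" turns
    # a missing run_id into the empty string asked for in the question.
    return match.groupdict(default="")


if __name__ == "__main__":
    docker_id = "7a1d0ed5675bdb365228d43f470fcee20af5c8bea84dd6d886b9bf837a9d358c"
    for name in (
        f"pod-name_namespace-name_container-name-{docker_id}.log",
        f"pod-name_namespace-name-1234567890_container-name-{docker_id}.log",
    ):
        print(parse_log_filename(name))
```

With the lazy quantifier, the namespace group stops growing as soon as the rest of the pattern can match, so a trailing run of 10 or more digits is captured by run_id and left out of namespace_name.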