I am trying to extract the text between the two strings using the following regex.
(?s)Non-terminated Pods:.*?in total.\R(.*)(?=Allocated resources)
This regex looks fine in regex101 but somehow does not print the pod details when used with perl or grep -P. The command below produces empty output.
kubectl describe node |perl -le '/(?s)Non-terminated Pods:.*?in total.\R(.*)(?=Allocated resources)/m; printf "$1"'
Here is the sample input:
PodCIDRs: 10.233.65.0/24
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default foo 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
kube-system nginx-proxy-kube-worker-1 25m (1%) 0 (0%) 32M (1%) 0 (0%) 9m8s
kube-system nodelocaldns-xbjp8 100m (5%) 0 (0%) 70Mi (4%) 170Mi (10%) 7m4s
Allocated resources:
Question:
- How do I extract the info from the above output so that it looks like the listing below? What is wrong with the regex or the command that I am using?
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default foo 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
kube-system nginx-proxy-kube-worker-1 25m (1%) 0 (0%) 32M (1%) 0 (0%) 9m8s
kube-system nodelocaldns-xbjp8 100m (5%) 0 (0%) 70Mi (4%) 170Mi (10%) 7m4s
Question-2: What if I have two blocks of similar input? How do I extract the pod details from both? E.g.,
if the input is:
PodCIDRs: 10.233.65.0/24
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default foo 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
kube-system nginx-proxy-kube-worker-1 25m (1%) 0 (0%) 32M (1%) 0 (0%) 9m8s
kube-system nodelocaldns-xbjp8 100m (5%) 0 (0%) 70Mi (4%) 170Mi (10%) 7m4s
Allocated resources:
....some
.......random data...
PodCIDRs: 10.233.65.0/24
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default foo-1 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
kube-system nginx-proxy-kube-worker-2 25m (1%) 0 (0%) 32M (1%) 0 (0%) 9m8s
kube-system nodelocaldns-xbjp3-2 100m (5%) 0 (0%) 70Mi (4%) 170Mi (10%) 7m4s
Allocated resources:
CodePudding user response:
With some obvious assumptions, and keeping it close to the pattern in the question:
perl -0777 -wnE'
@pods = /Non-terminated\s+Pods:\s+\([0-9]+\s+in\s+total\)\n(.*?)\nAllocated resources:/gs;
say for @pods
' input-file
(note the modifiers at the end of this wide line: /gs)
It is not stated in the question how precisely that regex is "used with perl".
When I use the regex from the question verbatim, instead of the one used in this answer, it works (and without the /s modifier, as it should, since the pattern already sets (?s) inline). To work with multiple such blocks in a file, with other text in between, we need to change its (.*) to (.*?).
Explanation of the command-line program above:
- The -0777 switch makes it read the whole file into a string, available in the program in the variable $_, on which the regex is applied by default (the switch -g is available as an alias for -0777, starting with 5.36.0).
- We still need the -n switch so that the program iterates over the "lines" of input (STDIN or a file). In this case the input record separator has been undefined, so there is just one "line".
- The regex captures are assigned to the array @pods, for further processing.
CodePudding user response:
Using GNU grep you can use your regex with some tweaks:
kubectl describe node |
grep -zoP '(?s)Non-terminated Pods:.*?in total.\R\K(.*?)(?=Allocated resources)'
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default foo 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
kube-system nginx-proxy-kube-worker-1 25m (1%) 0 (0%) 32M (1%) 0 (0%) 9m8s
kube-system nodelocaldns-xbjp8 100m (5%) 0 (0%) 70Mi (4%) 170Mi (10%) 7m4s
- Used \K (match reset) after \R to remove that line from the output
- Used the -z option to treat input and output data as sequences of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline.
PS: The same regex works with the second input block as well, with the header line shown before each block.
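One practical consequence of -z is that each match is written out terminated by a NUL byte rather than a newline. The sketch below, which assumes GNU grep built with -P (PCRE) support, runs the same pattern over a two-block sample and pipes the result through tr to turn those NUL bytes into newlines. The variable names sample and pods and the trimmed sample rows are hypothetical.

```shell
#!/bin/sh
# Hypothetical two-block sample, trimmed from the input in the question.
sample='PodCIDRs: 10.233.65.0/24
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
default foo 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
Allocated resources:
....some random data...
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
default foo-1 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
Allocated resources:'

# -z: the whole input is one "line", so the pattern can span newlines.
# -o: print only the matched part; \K drops everything matched before it.
# tr: each match ends with a NUL byte under -z; convert it to a newline.
pods=$(printf '%s\n' "$sample" |
  grep -zoP '(?s)Non-terminated Pods:.*?in total.\R\K(.*?)(?=Allocated resources)' |
  tr '\0' '\n')

printf '%s\n' "$pods"
```

Because each match already ends with a newline, the converted NUL shows up as a blank line between the two blocks.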
Alternatively, you can use any version of sed for this job as well:
kubectl describe node |
sed -n '/Non-terminated Pods:.*in total.*/,/Allocated resources:/ {//!p;}'
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default foo 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
kube-system nginx-proxy-kube-worker-1 25m (1%) 0 (0%) 32M (1%) 0 (0%) 9m8s
kube-system nodelocaldns-xbjp8 100m (5%) 0 (0%) 70Mi (4%) 170Mi (10%) 7m4s
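The //!p part may look cryptic: an empty regex in sed reuses the most recently applied regex, so inside the range it matches the start line and the end line, and ! prints everything else. As a rough illustration, the sketch below compares that short form with a spelled-out version that deletes the two boundary lines explicitly. The variable names sample, short and long and the trimmed sample rows are hypothetical.

```shell
#!/bin/sh
# Hypothetical two-block sample, trimmed from the input in the question.
sample='PodCIDRs: 10.233.65.0/24
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
default foo 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
Allocated resources:
....some random data...
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
default foo-1 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
Allocated resources:'

# Short form: // reuses the last regex applied, i.e. the range boundary.
short=$(printf '%s\n' "$sample" |
  sed -n '/Non-terminated Pods:.*in total/,/Allocated resources:/ {//!p;}')

# Spelled-out form: drop both boundary lines by name, print the rest.
long=$(printf '%s\n' "$sample" |
  sed -n '/Non-terminated Pods:.*in total/,/Allocated resources:/ {
    /Non-terminated Pods:/d
    /Allocated resources:/d
    p
  }')

printf '%s\n' "$short"
```

Both variables should end up holding the same four lines (a header and a pod row per block), and since a sed range restarts whenever its start pattern matches again, the second block is picked up without any change to the command.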