I have several lines of text and want to extract the number that follows a specific word using awk.
My attempts so far have not worked.
First, create the test file with: vi test.text
It has 3 columns (the 3 fields are generated by some other pipeline commands using awk):
Index  AllocTres                             CPUTotal
1      cpu=1,mem=256G                        18
2      cpu=2,mem=1024M                       16
3                                            4
4      cpu=12,gres/gpu=3                     12
5                                            8
6                                            9
7      cpu=13,gres/gpu=4,gres/gpu:ret6000=2  20
8      mem=12G,gres/gpu=3,gres/gpu:1080ti=1  21
Please note that the AllocTres field is empty in several rows.
What I want is to keep only the number following the first gres/gpu= part and remove all cpu= and mem= parts, using a pipeline like: cat test.text | awk '{some_commands}'
The output should have 3 columns:
Index  AllocTres  CPUTotal
1                 18
2                 16
3                 4
4      3          12
5                 8
6                 9
7      4          20
8      3          21
CodePudding user response:
1st solution: With your shown samples, please try the following GNU awk code. It preserves the spacing between fields.
awk '
FNR==1{ print; next }
match($0,/[[:space:]]+/){
  space=substr($0,RSTART,RLENGTH-1)
}
{
  match($2,/gres\/gpu=([0-9]+)/,arr)
  match($0,/^[^[:space:]]+[[:space:]]+[^[:space:]]+([[:space:]]+)/,arr1)
  space1=sprintf("%"length($2)-length(arr[1])"s",OFS)
  if(NF>2){ sub(OFS,"",arr1[1]);$2=space arr[1] space1 arr1[1] }
}
1
' Input_file
Output will be as follows for above code with shown samples:
Index  AllocTres  CPUTotal
1                 18
2                 16
3                 4
4      3          12
5                 8
6                 9
7      4          20
8      3          21
2nd solution: If you don't care about preserving the spacing, try the following GNU awk code.
awk 'FNR==1{print;next} match($2,/gres\/gpu=([0-9]+)/,arr){$2=arr[1]} 1' Input_file
Explanation: A detailed explanation of the above code.
awk '  ##Start the awk program.
FNR==1{  ##If this is the first line (the header), do the following.
  print  ##Print the current line.
  next  ##Skip all further statements for this line.
}
match($2,/gres\/gpu=([0-9]+)/,arr){  ##Use match to find gres/gpu= followed by digits in the 2nd field, keeping the digits in a capturing group.
  $2=arr[1]  ##Assign the captured digits (arr[1]) to the 2nd field.
}
1  ##Print the current line, edited or not.
' Input_file  ##The input file name.
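The three-argument form of match() used above is a gawk extension. For other awk implementations, the same idea can be sketched with RSTART and RLENGTH, which every POSIX awk sets after a successful match(). This is only a sketch under a few assumptions: the function name keep_first_gpu and the printf column widths are made up for illustration, and fields are assumed to be whitespace-separated, so a row with an empty AllocTres has only two fields. Note that the one-liner above leaves rows without gres/gpu= unchanged, while this sketch also blanks AllocTres in that case.

```shell
# Hypothetical helper (name chosen for illustration); reads the table on stdin.
keep_first_gpu() {
  awk '
    NR == 1 { print; next }   # pass the header through unchanged
    {
      gpu = ""
      # Only inspect $2 when the row really has three fields.
      if (NF > 2 && match($2, /gres\/gpu=[0-9]+/)) {
        # "gres/gpu=" is 9 characters long; skip it and keep the digits.
        gpu = substr($2, RSTART + 9, RLENGTH - 9)
      }
      # Column widths are arbitrary; adjust them to taste.
      printf "%-6s %-10s %s\n", $1, gpu, $NF
    }
  '
}

# Example call on a few of the sample rows.
printf '%s\n' \
  'Index AllocTres CPUTotal' \
  '1 cpu=1,mem=256G 18' \
  '3 4' \
  '7 cpu=13,gres/gpu=4,gres/gpu:ret6000=2 20' | keep_first_gpu
```

Because match() returns the leftmost match, only the first gres/gpu= entry is used, and gres/gpu:ret6000=2 is ignored since it has a colon rather than an equals sign after gpu.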
CodePudding user response:
Using sed
$ sed 's~\( \+\)[^,]*,\(gres/gpu=\([0-9]\)\|[^ ]*\)[^ ]* \+~\1\3 \t\t\t\t ~' input_file
Index  AllocTres  CPUTotal
1                 18
2                 16
3                 4
4      3          12
5                 8
6                 9
7      4          20
8      3          21
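One caveat: the group \([0-9]\) in the command above captures a single digit, so a count such as gres/gpu=16 would come out as 1. A possible variant that captures all the digits is sketched below. It is an illustration only: the function name first_gpu_count is hypothetical, it assumes a sed that supports -E for extended regular expressions (GNU and BSD sed do), it assumes single-space or multi-space separated fields, and it only rewrites rows that actually contain gres/gpu=.

```shell
# Hypothetical helper (name chosen for illustration); reads rows on stdin.
first_gpu_count() {
  # \1 = index plus following spaces, \3 = all digits after the first gres/gpu=
  sed -E 's~^([^ ]+ +)([^ ]*,)?gres/gpu=([0-9]+)[^ ]*~\1\3~'
}

printf '%s\n' '4 cpu=12,gres/gpu=16 12' | first_gpu_count
# prints: 4 16 12
```

Rows without gres/gpu= pass through untouched here, so the cpu= and mem= parts would still need to be blanked separately.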