How to extract the number after a specific word using awk?

I have several lines of text. I want to extract the number after a specific word using awk.

I tried the following code but it does not work.

First, create the test file with vi test.text. It has 3 columns (the 3 fields are generated by other pipeline commands using awk).

Index  AllocTres                              CPUTotal
1      cpu=1,mem=256G                         18
2      cpu=2,mem=1024M                        16
3                                             4
4      cpu=12,gres/gpu=3                      12
5                                             8
6                                             9
7      cpu=13,gres/gpu=4,gres/gpu:ret6000=2   20
8      mem=12G,gres/gpu=3,gres/gpu:1080ti=1   21

Please note that several fields in this file are empty. What I want to achieve is to extract the number after the first gres/gpu= in each line (defaulting to 0 if gres/gpu= does not occur in the line) using a pipeline like cat test.text | awk '{some_commands}', producing 4 columns:

Index  AllocTres                              CPUTotal   GPUAllocated
1      cpu=1,mem=256G                         18         0
2      cpu=2,mem=1024M                        16         0
3                                             4          0
4      cpu=12,gres/gpu=3                      12         3
5                                             8          0
6                                             9          0
7      cpu=13,gres/gpu=4,gres/gpu:ret6000=2   20         4
8      mem=12G,gres/gpu=3,gres/gpu:1080ti=1   21         3

CodePudding user response：

First of all, awk does not need cat; it can read files on its own. Combining cat and awk is generally discouraged as a useless use of cat.
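To illustrate the point about cat, here is a minimal sketch comparing the two forms. The sample file and the variable names (sample, with_cat, without_cat) are made up for this illustration and are not part of the question.

```shell
# Hypothetical two-line sample file, created only for this comparison.
sample=$(mktemp)
printf 'Index CPUTotal\n1 18\n' > "$sample"

# Same awk program, fed through cat (one extra process) ...
with_cat=$(cat "$sample" | awk '{ print $2 }')
# ... and with awk opening the file itself.
without_cat=$(awk '{ print $2 }' "$sample")

rm -f "$sample"
```

Both variables should end up holding the same text, so the cat is not buying anything here.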

For this task I would use GNU AWK in the following way. Let the content of file.txt be

cpu=1,mem=256G
cpu=2,mem=1024M

cpu=12,gres/gpu=3


cpu=13,gres/gpu=4,gres/gpu:ret6000=2
mem=12G,gres/gpu=3,gres/gpu:1080ti=1

then

awk 'BEGIN{FS="gres/gpu="}{print $2+0}' file.txt

output

0
0
0
3
0
0
4
3

Explanation: I tell GNU AWK that the field separator (FS) is gres/gpu=, then for each line I print the 2nd field increased by zero. For lines without gres/gpu=, $2 is an empty string, which in an arithmetic context is treated as zero, so zero plus zero gives zero. For lines with at least one gres/gpu=, adding zero makes GNU AWK use the longest prefix that is a legal number, so 3 (4th line) becomes 3, 4,gres/gpu:ret6000=2 (7th line) becomes 4, and 3,gres/gpu:1080ti=1 (8th line) becomes 3.
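The string-to-number conversion described above can be checked in isolation. This is a small sketch, not part of the original answer; the function name coerce_demo is made up for illustration, and the strings are the second fields you would get from the sample lines.

```shell
# coerce_demo is a hypothetical helper name used only for this illustration.
coerce_demo() {
  awk 'BEGIN {
    print "3" + 0                      # plain number: 3
    print "4,gres/gpu:ret6000=2" + 0   # leading numeric prefix only: 4
    print "" + 0                       # empty string counts as zero: 0
    print "gpu=3" + 0                  # no numeric prefix at all: 0
  }'
}

coerce_demo
```

Only a numeric prefix counts, which is why the text after the first comma on lines 7 and 8 does not disturb the result.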

(tested in GNU Awk 5.0.1)

CodePudding user response：

Using sed

$ sed '1s/$/\tGPUAllocated/;s~.*gres/gpu=\([0-9][0-9]*\).*~& \t\1~;1!{\~gres/gpu=[0-9]~!s/$/ \t0/}' input_file
Index  AllocTres                              CPUTotal  GPUAllocated
1      cpu=1,mem=256G                         18        0
2      cpu=2,mem=1024M                        16        0
3                                             4         0
4      cpu=12,gres/gpu=3                      12        3
5                                             8         0
6                                             9         0
7      cpu=13,gres/gpu=4,gres/gpu:ret6000=2   20        4
8      mem=12G,gres/gpu=3,gres/gpu:1080ti=1   21        3

CodePudding user response：

With your shown samples, you can try the following code, written and tested in GNU awk. Simple explanation: use awk's match function with the regex gres\/gpu=([0-9]+) (escaping / here), whose one and only capturing group captures all digits coming after =. The three-argument form of match, which stores the captured groups in an array, is specific to GNU awk. Once a match is found, print the current line followed by arr[1]+0, the first element of the array arr plus zero, so that zero is printed for any line where no match is found.

awk '
FNR==1{
  print $0,"GPUAllocated"
  next
}
{
  match($0,/gres\/gpu=([0-9]+)/,arr)
  print $0,arr[1]+0
}
' Input_file
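If GNU awk is not available, roughly the same idea can be sketched with the two-argument match(), which sets RSTART and RLENGTH in any POSIX awk. This is an untested-on-your-data sketch; the shell function name gpu_column is made up for illustration, and the offset 9 is simply the length of the literal text gres/gpu=.

```shell
# gpu_column is a hypothetical wrapper name; it reads the named files or stdin.
gpu_column() {
  awk '
  NR == 1 { print $0, "GPUAllocated"; next }   # header line gets the new column name
  {
    n = 0                                      # default when gres/gpu= is absent
    if (match($0, /gres\/gpu=[0-9]+/))         # leftmost gres/gpu= followed by digits
      n = substr($0, RSTART + 9, RLENGTH - 9) + 0   # skip the 9 characters of the keyword
    print $0, n
  }' "$@"
}

# Usage: gpu_column test.text
```

Because match() reports the leftmost match, a later key such as gres/gpu:ret6000=2 is ignored, and multi-digit counts are kept whole.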

CodePudding user response：

awk '
    BEGIN{FS="\t"} 
    NR==1{
        $(NF+1)="GPUAllocated"
    }
    NR>1{
        $(NF+1)=FS 0
    }
    /gres\/gpu=/{
        split($0, a, "gres/gpu=")  # a[2] begins right after the first gres/gpu=
        gp=a[2]; gsub(/[ ,].*/, "", gp)
        $NF=FS gp
    }1' test.text 

Index  AllocTres                              CPUTotal GPUAllocated
1      cpu=1,mem=256G                         18        0
2      cpu=2,mem=1024M                        16        0
3                                             4         0
4      cpu=12,gres/gpu=3                      12        3
5                                             8         0
6                                             9         0
7      cpu=13,gres/gpu=4,gres/gpu:ret6000=2   20        4
8      mem=12G,gres/gpu=3,gres/gpu:1080ti=1   21        3