How to remove some words in specific field using awk?

Time:06-11

I have several lines of text, and I want to extract the number that follows a specific word using awk.

I tried the following code but it does not work.

First, create the test file with: vi test.text. It has 3 columns (the 3 fields are generated by other pipeline commands using awk).

Index  AllocTres                              CPUTotal
1      cpu=1,mem=256G                         18
2      cpu=2,mem=1024M                        16
3                                             4
4      cpu=12,gres/gpu=3                      12
5                                             8
6                                             9
7      cpu=13,gres/gpu=4,gres/gpu:ret6000=2   20
8      mem=12G,gres/gpu=3,gres/gpu:1080ti=1   21

Please note that several fields in this file are empty. What I want is to keep only the number following the first gres/gpu part and remove all cpu= and mem= parts, using a pipeline like: cat test.text | awk '{some_commands}' that outputs 3 columns:

Index  AllocTres                              CPUTotal
1                                             18
2                                             16
3                                             4
4      3                                      12
5                                             8
6                                             9
7      4                                      20
8      3                                      21

CodePudding user response:

1st solution: With your shown samples, please try the following GNU awk code. It preserves the spacing between fields.

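awk '
FNR==1{ print; next }                 ##Print the header line unchanged.
match($0,/[[:space:]]+/){             ##Find the first run of whitespace (between fields 1 and 2).
  space=substr($0,RSTART,RLENGTH-1)   ##Keep it minus one character, since OFS adds one back.
}
{
  match($2,/gres\/gpu=([0-9]+)/,arr)  ##Capture the digits after the first gres/gpu= into arr[1].
  match($0,/^[^[:space:]]+[[:space:]]+[^[:space:]]+([[:space:]]+)/,arr1)  ##Capture the whitespace after field 2.
  space1=sprintf("%"length($2)-length(arr[1])"s",OFS)  ##Padding that replaces the removed text.
  if(NF>2){ sub(OFS,"",arr1[1]);$2=space arr[1] space1 arr1[1] }
}
1
'   Input_file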

Output will be as follows for above code with shown samples:

Index  AllocTres                              CPUTotal
1                                             18
2                                             16
3                                             4
4      3                                      12
5                                             8
6                                             9
7      4                                      20
8      3                                      21
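
The code above depends on the three-argument match(), which is a GNU awk extension. As a minimal sketch for other awk implementations, assuming the input is strictly fixed-width and every value starts in the same column as its header, the column positions can be read from the header line instead. The function name gpu_only is hypothetical and not part of the original answer.

```shell
# gpu_only [FILE...]: hypothetical helper; assumes fixed-width columns
# whose values start under the "AllocTres" and "CPUTotal" headers.
gpu_only() {
  awk '
  NR == 1 {
    start = index($0, "AllocTres")    # column where the 2nd field begins
    end   = index($0, "CPUTotal")     # column where the 3rd field begins
    width = end - start
    fmt   = "%s%-" width "s%s\n"      # left-justify the GPU count in the same width
    print
    next
  }
  {
    cell = substr($0, start, width)   # raw text of the AllocTres column
    gpu  = ""
    # "gres/gpu=" is 9 characters long; keep only the digits that follow it
    if (match(cell, /gres\/gpu=[0-9]+/))
      gpu = substr(cell, RSTART + 9, RLENGTH - 9)
    printf fmt, substr($0, 1, start - 1), gpu, substr($0, end)
  }' "$@"
}
```

Called as gpu_only test.text, this should leave the header untouched and keep only the GPU count in the AllocTres column. It will misbehave if a value is wider than its column or the header names differ.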


2nd solution: If you don't care about preserving the spaces, try the following awk code.

awk 'FNR==1{print;next} match($2,/gres\/gpu=([0-9]+)/,arr){$2=arr[1]} 1' Input_file

Explanation: A detailed explanation of the above code.

awk '             ##Starting awk program from here.
FNR==1{           ##Checking condition if this is first line then do following.
  print           ##Printing current line.
  next            ##next will skip all further statements from here.
}
match($2,/gres\/gpu=([0-9]+)/,arr){  ##Using match function to match regex gres/gpu= followed by digits, keeping the digits in a capturing group.
  $2=arr[1]       ##Assigning 1st value of array arr to 2nd field itself.
}
1                 ##printing current edited/non-edited line here.
' Input_file      ##Mentioning Input_file name here.
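
Since the array argument to match() is specific to GNU awk, here is a hedged sketch of the same idea using only RSTART and RLENGTH, which other awk implementations also set. The function name gpu_field is hypothetical. Like the one-liner above, it only rewrites rows whose second field contains gres/gpu=, so rows such as cpu=1,mem=256G pass through unchanged, and any rewritten row is rebuilt with single spaces between fields.

```shell
# gpu_field [FILE...]: hypothetical helper; reads stdin when no file is given.
gpu_field() {
  awk '
  NR == 1 { print; next }             # header line passes through unchanged
  match($2, /gres\/gpu=[0-9]+/) {
    # skip the 9 characters of "gres/gpu=" and keep the digits;
    # assigning to $2 rebuilds the record using OFS (one space)
    $2 = substr($2, RSTART + 9, RLENGTH - 9)
  }
  { print }' "$@"
}
```

For example, a row reading 4, cpu=12,gres/gpu=3, 12 should come out as "4 3 12".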

CodePudding user response:

Using sed

$ sed 's~\( \+\)[^,]*,\(gres/gpu=\([0-9]\)\|[^ ]*\)[^ ]* \+~\1\3 \t\t\t\t      ~' input_file
Index  AllocTres                              CPUTotal
1                                             18
2                                             16
3                                             4
4      3                                      12
5                                             8
6                                             9
7      4                                      20
8      3                                      21