How do I print the highest/longest values in a file

I have an AV log file that records several values for each process scanned: Name, Path, Total files scanned, and Scan time. The file contains hundreds of these process entries (example below). For Total files scanned and Scan time, I'd like to sort the entries and print the highest (or longest) values so I can determine which processes are impacting the system. I've tried various approaches with grep, but I only get a list of values in numerical order with no process attached. What I really want is output that says Process id: 86, Scan time (ns): 12761174 is the highest, then Process id 25, and so on. I hope my explanation is clear enough.

Process id: 25
Name: wwww
Path: "/usr/libexec/wwww"
Total files scanned: 42
Scan time (ns): "62416"
Status: Active

Process id: 7
Name: xxxx
Path: "/usr/libexec/xxxx"
Total files scanned: 0
Scan time (ns): "0"
Status: Active

Process id: 86
Name: yyyy
Path: "/usr/libexec/yyyy"
Total files scanned: 2
Scan time (ns): "12761174"
Status: Active

I have tried:

grep -Eo | grep 'Scan time (ns)' '[0-9]+' file | sort

Which results in:

file:Scan time (ns): "9391986"
file:Scan time (ns): "9532119"
file:Scan time (ns): "9730650"
file:Scan time (ns): "9743828"
file:Scan time (ns): "9793469"
file:Scan time (ns): "9911768"

What I am wanting to achieve is something such as:

Process id 9, Scan time (ns): "34561"
Process id 86, Scan time (ns): "45630"
Process id 25, Scan time (ns): "1256822"
Process id 51, Scan time (ns): "52351290"
Process id 30, Scan time (ns): "90257651"
Process id 19, Scan time (ns): "178764794932"

CodePudding user response：

Use perl to read the records one at a time (in "paragraph mode", which treats a blank line as the record separator), extract the scan time from each, and sort the records by it in descending order:

$ perl -00 -lne 'm/Scan time \(ns\):\s+"(\d+)"/ && push @procs, [ $_, $1 ];
                 END { print $_->[0] for sort { $b->[1] <=> $a->[1] } @procs }' input.txt
Process id: 86
Name: yyyy
Path: "/usr/libexec/yyyy"
Total files scanned: 2
Scan time (ns): "12761174"
Status: Active

Process id: 25
Name: wwww
Path: "/usr/libexec/wwww"
Total files scanned: 42
Scan time (ns): "62416"
Status: Active

Process id: 7
Name: xxxx
Path: "/usr/libexec/xxxx"
Total files scanned: 0
Scan time (ns): "0"
Status: Active

CodePudding user response：

With awk and GNU sort.

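Set the record separator to two newlines (RS="\n\n") so that each blank-line-separated block becomes one record, then pipe the desired fields to the sort command. $3 is the process id and $(NF-2) is the third-from-last field of the record, which holds the quoted scan time. A multi-character RS is an extension (GNU awk and mawk accept it); RS="" is the portable way to get the same paragraph mode.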

awk 'BEGIN{RS="\n\n"} {print "Process id", $3 ", Scan time (ns):", $(NF-2)}' file | sort -t '"' -n -k 2

Output:

Process id 7, Scan time (ns): "0"
Process id 25, Scan time (ns): "62416"
Process id 86, Scan time (ns): "12761174"

See: 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR
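
Since the question asks for the longest scan times first, here is a minimal sketch of the same idea in portable paragraph mode (RS=""), with the sort reversed. It assumes every record has a "Process id:" line and a "Scan time (ns):" line, as in the sample; rank_scan_time is only a hypothetical helper name, not something from the answers above.

```shell
# Sketch: rank processes by scan time, longest first.
# rank_scan_time is a hypothetical helper; $1 is the log file.
rank_scan_time() {
  awk '
    # Paragraph mode: one record per blank-line-separated block, one field per line.
    BEGIN { RS = ""; FS = "\n" }
    {
      id = ""; ns = ""
      for (i = 1; i <= NF; i++) {
        # Look lines up by label instead of by position in the record.
        if ($i ~ /^Process id: /)       { id = $i; sub(/^Process id: /, "", id) }
        if ($i ~ /^Scan time \(ns\): /) { ns = $i; gsub(/[^0-9]/, "", ns) }
      }
      # Emit "time<TAB>id" so sort can work on a bare leading number.
      if (id != "" && ns != "") print ns "\t" id
    }
  ' "$1" |
  sort -rn |
  awk -F '\t' '{ printf "Process id %s, Scan time (ns): \"%s\"\n", $2, $1 }'
}
```

Calling rank_scan_time file on the three sample records prints process 86 first, then 25, then 7. Appending | head -n 10 would keep only the ten slowest entries. Matching lines by label rather than by field position should keep working if the log gains or reorders lines, though that is untested against a real AV log.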

CodePudding user response：

With your shown samples, please try the following awk code, written and tested in GNU awk.

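awk '
/^Process id: /{
  val=$NF                                  # remember the current process id
  next
}
/^Scan time \(ns\): "/{
  gsub(/"/,"",$NF)                         # strip the quotes so the value is numeric
  arr[val]=$NF+0
}
END{
  PROCINFO["sorted_in"]="@val_num_asc"     # GNU awk only: traverse by scan time, ascending
  for(i in arr){
    print "Process id " i ", Scan time (ns): \"" arr[i] "\""
  }
}
'  Input_file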

CodePudding user response：

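Try this (it relies on GNU grep and GNU sed for -P and -z). grep keeps only the Process and Scan time lines, sed -z joins each Scan time line onto the Process line before it, and sort orders the result numerically by the text after the first double quote:

grep -P '^Pr.*|^Sc.*' file.txt | sed -z 's/\nS/, S/g' | sort -t '"' -k2n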

file.txt content:

Process id: 25
Name: wwww
Path: "/usr/libexec/wwww"
Total files scanned: 42
Scan time (ns): "62416"
Status: Active

Process id: 7
Name: xxxx
Path: "/usr/libexec/xxxx"
Total files scanned: 0
Scan time (ns): "0"
Status: Active

Process id: 86
Name: yyyy
Path: "/usr/libexec/yyyy"
Total files scanned: 2
Scan time (ns): "12761174"
Status: Active

Output:

Process id: 7, Scan time (ns): "0"
Process id: 25, Scan time (ns): "62416"
Process id: 86, Scan time (ns): "12761174"
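
The question also asks about Total files scanned, which none of the commands above rank. As a rough sketch under the same assumptions about the log layout, the helper below takes the field label as an argument and prints the n largest values; top_by_field and its argument order are hypothetical, chosen only for illustration.

```shell
# Sketch: show the n largest values of one numeric field.
# top_by_field is a hypothetical helper:
#   $1 = log file, $2 = field label, $3 = number of rows (default 5).
top_by_field() {
  awk -v label="$2" '
    # Paragraph mode: one record per block, one field per line.
    BEGIN { RS = ""; FS = "\n"; prefix = label ": " }
    {
      id = ""; val = ""
      for (i = 1; i <= NF; i++) {
        # index() does a literal prefix match, so "(ns)" needs no escaping.
        if (index($i, "Process id: ") == 1) id = substr($i, 13)
        if (index($i, prefix) == 1) {
          val = substr($i, length(prefix) + 1)
          gsub(/"/, "", val)               # drop quotes around the number, if any
        }
      }
      if (id != "" && val != "") print val "\t" id
    }
  ' "$1" |
  sort -rn |
  head -n "${3:-5}" |
  awk -F '\t' -v label="$2" '{ printf "Process id %s, %s: %s\n", $2, label, $1 }'
}
```

For example, top_by_field file.txt "Total files scanned" 2 on the sample above lists process 25 (42 files) and then process 86 (2 files), while top_by_field file.txt "Scan time (ns)" 1 reports process 86. The values are printed without their original quotes.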