I am using a bash script to extract some information from log files located in a directory and save a summary to a separate file. At the bottom of each log file there is a table like:
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
----- ------------ ---------- ----------
1 -6.961 0 0
2 -6.797 2.908 4.673
3 -6.639 27.93 30.19
4 -6.204 2.949 6.422
5 -6.111 24.92 28.55
6 -6.058 2.836 7.608
7 -5.986 6.448 10.53
8 -5.95 19.32 23.99
9 -5.927 27.63 30.04
10 -5.916 27.17 31.29
11 -5.895 25.88 30.23
12 -5.835 26.24 30.36
From this I need to take only the value in the second column of the first row (-6.961) and write it, together with the name of the log, as one line in a new ranking_${output}.log:
log_name -6.961
So for 5 processed logs it should look something like:
# ranking_${output}.log
log_name1 -X.XXX
log_name2 -X.XXX
log_name3 -X.XXX
log_name4 -X.XXX
log_name5 -X.XXX
Here is a simple bash workflow that takes ALL THE LINES of the ranking table and saves them together with the name of the log file:
#!/bin/bash
home="$PWD"
# folder containing all *.log files
results="${home}/results"
# loop over each log file and take its name and the whole ranking table
for log in "${results}"/*.log; do
    log_name=$(basename "$log" .log)
    echo "$log_name" >> "${results}/ranking_${output}.log"
    tail -n 12 "$log" >> "${results}/ranking_${output}.log"
done
Could you suggest an AWK routine that would select only the top value, located on the first row of each table? This is an AWK example that I used for another format, which does not work here:
awk -F', *' 'FNR==2 {f=FILENAME;
sub(/.*\//,"",f);
sub(/_.*/ ,"",f);
printf("%s: %s\n", f, $5) }' ${results}/*.log >> ${results}/ranking_${output}.log
CodePudding user response:
With awk: if the first column contains 1, print the filename and the second column to the file output:
awk '$1=="1"{print FILENAME, $2}' *.log > output
Update to remove path and suffix (.log):
awk '$1=="1"{f=FILENAME; sub(/.*\//,"",f); sub(/\.log$/,"",f); print f, $2}' *.log > output
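The one-liner above matches any line whose first field is 1, so a log that happens to contain such a line outside the table would produce an extra row. A slightly more defensive sketch is shown below. It assumes that every log contains the dashed separator line (-----) directly above the data rows, as in the table from the question; the function name top_affinity is only illustrative.

```shell
#!/bin/bash
# top_affinity (hypothetical name): print "<log name> <affinity of mode 1>"
# for each log file passed as an argument.
# Assumption: the data rows follow a separator line that starts with "-----".
top_affinity() {
    awk '
        FNR == 1   { in_table = 0 }        # reset the state for every new file
        /^ *-----/ { in_table = 1; next }  # separator seen: data rows follow
        in_table && $1 == "1" {
            name = FILENAME
            sub(/.*\//, "", name)          # strip the directory part
            sub(/\.log$/, "", name)        # strip the .log suffix
            print name, $2
            in_table = 0                   # ignore the rest of this file
        }
    ' "$@"
}
```

With the variables from the question it could be called as top_affinity "${results}"/*.log > "${results}/ranking_${output}.log". Note that if a ranking_*.log file from an earlier run already sits in the same folder, the *.log glob will pick it up as well, so writing the summary to a different folder or with a different suffix may be safer.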