I am stuck on this homework: I need to find the most common commands in a text file that is available at a link.
In Linux, text processing and text analysis are essential. The following file contains the command history of one of IT Slovenia’s system administrators:
https://raw.githubusercontent.com/nuwanarti/Assignment1/main/data1
Please find the three commands that appear most frequently in the file and save the result in ~/result.txt.
Tasks:
- Process the text data.
- Write the result to ~/result.txt
- Ensure that results include the number of times and commands, such as "100 ls".
CodePudding user response:
Assumptions:
- OP is only interested in the initial command in any pipelined commands (e.g., ps -ef | grep rabbitmq leads to ps being counted once, and grep is ignored)
- OP has access to GNU awk (for the PROCINFO["sorted_in"] option)
- the history (from that GitHub link) is stored in file history.dat
One awk solution:
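awk '
    { counts[$2]++ }                             # count each command name (field 2)
END { PROCINFO["sorted_in"]="@val_num_desc"      # GNU awk: traverse by count, highest first
      for (i in counts) {
          print counts[i],i
          if (++pass >= 3)                       # stop after the top three
              break
      }
    }
' history.dat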
This generates:
199 tail
150 openstack
114 systemctl
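If GNU awk is not available, the same counting idea can be sketched with any POSIX awk by handing the ordering to sort instead of PROCINFO["sorted_in"]. This is only a sketch under the same assumptions as above: the function name top_commands is made up for illustration, and the command name is assumed to be the second field of each line.

```shell
# Hypothetical helper (not part of the original answer): print the N most
# frequent commands in a history file as "COUNT COMMAND" lines.
# Assumes the command name is field 2 of every line.
top_commands() {
    file=$1
    n=${2:-3}                       # default to the top three
    awk 'NF >= 2 { counts[$2]++ }   # skip blank or one-field lines
         END { for (cmd in counts) print counts[cmd], cmd }' "$file" |
        sort -k1,1nr -k2,2 |        # highest count first, ties ordered by name
        head -n "$n"
}
```

Calling top_commands history.dat > ~/result.txt would then write the result in the "100 ls" form the task asks for.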
CodePudding user response:
You can use history combined with awk to extract the command names, sort them and count the duplicates with sort | uniq -c, order the counts from highest to lowest with sort -rn, then use head -3 to get the top 3 results, and finally redirect the output to ~/result.txt:
history | awk '{print $2}' | awk 'BEGIN {FS="|"} {print $1}' \
| sort | uniq -c | sort -rn | head -3 > ~/result.txt
Of course, you need to switch from history to cat data1.txt if you are reading the results from a file.
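One detail worth noting is that uniq -c pads its counts with leading spaces, while the task asks for lines such as "100 ls". A possible file-based variant of the pipeline that normalizes the output is sketched below; the function name top3_uniq is hypothetical, and the input is again assumed to carry the command name in the second field.

```shell
# Hypothetical helper: top three commands via sort/uniq, printed as "COUNT COMMAND".
top3_uniq() {
    awk 'NF >= 2 { print $2 }' "$1" |   # second field: the command name
        cut -d'|' -f1 |                 # drop anything glued on with a pipe, e.g. "ls|wc"
        sort | uniq -c |                # count identical command names
        sort -k1,1nr -k2,2 |            # highest count first, ties ordered by name
        head -n 3 |
        awk '{ print $1, $2 }'          # strip the padding that uniq -c adds
}
```

With the file downloaded first, for example with curl -sSfL -o data1 https://raw.githubusercontent.com/nuwanarti/Assignment1/main/data1, running top3_uniq data1 > ~/result.txt should produce the three requested lines.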