Show the 20 most popular URLs (HTTP request method and requested path) and their counts. I would like my script to print:
185 GET /
81 GET /stations/dd
52 GET /stations/dsfds
47 GET /stations/asfcdsf
File:
37.139.53.70 - - [25/Nov/2022:15:49:35 +0200] "POST /index.php?option=com_users&task=registration.register HTTP/1.0" 303 246 "http://meteoclima.hua.gr/index.php?option=com_users&view=registration" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36"
37.139.53.70 - - [25/Nov/2022:15:49:36 +0200] "GET /index.php?option=com_users&view=registration&layout=complete HTTP/1.0" 200 33080 "http://meteoclima.hua.gr/index.php?option=com_users&view=registration&layout=complete" "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.93 Safari/537.36"
Commands: sort, uniq, grep
printf "%s\n" | sort file | awk '{ print $6 }' | uniq -c
1984 "GET
1 "-"
132 "GET
2 "POST
24 "GET
2 "-"
29 "GET
4 "-"
14 "GET
2 "-"
182 "GET
1 "-"
35 "GET
2 "-"
397 "GET
1 "-"
257 "GET
1 "-"
69 "GET
1 "-"
51 "GET
11 "POST
CodePudding user response:
With awk:
awk -F'[" ]' '{ sub(/\?.*/, "", $8); arr[$7" "$8]++ }
END { for (i in arr) print arr[i], i }' file | sort -nr | head -20
With grep:
grep -oP '(?:GET|POST|PUT|DELETE|HEAD|CONNECT|OPTIONS|TRACE|PATCH) /[^?\s]*' file |
sort | uniq -c | sort -nr | head -20
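If awk and grep -P are not available, roughly the same result can be had with cut and sed in front of the sort | uniq -c step. The sketch below is only one way to do it and makes a few assumptions: the function name top_urls is made up for illustration, the log is in the combined format shown in the question (so the request line is the first double-quoted field), and everything after a ? in the path should be discarded.

```shell
# Hypothetical helper (name chosen for illustration): top_urls LOGFILE [N]
# Prints the N most frequent "METHOD path" pairs with their counts.
top_urls() {
    logfile=$1
    n=${2:-20}                       # default to the top 20
    # Splitting on double quotes, field 2 is the request line,
    # e.g.: GET /stations/dd?x=1 HTTP/1.0
    cut -d'"' -f2 "$logfile" |
        grep -E '^[A-Z]+ /' |        # skip malformed requests such as "-"
        cut -d' ' -f1,2 |            # keep only method and path
        sed 's/?.*//' |              # drop the query string, if any
        sort | uniq -c | sort -rn | head -n "$n"
}
```

Called as top_urls file, or top_urls file 5 for a shorter list. The sort before uniq -c matters: uniq only merges adjacent identical lines, so counting unsorted input yields many small, repeated groups instead of one total per URL.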