I have a log file in this format:
2021-05-17 07:59:10.496 33821 ERROR bla bla bla
2021-05-17 08:03:10.957 33821 WARNING bla bla bla
2021-05-17 08:12:10.094 33821 ERROR bla bla bla
2021-05-17 08:40:10.592 33821 INFO bla bla bla
I need to count the messages of each level (ERROR, WARNING, INFO) separately, in time intervals of 4 hours. I can already count the messages of each level across the entire log file, but I don't know how to count them per 4-hour interval. I'd like to do this in a bash script using awk. This is what I have so far:
awk '($4 ~ /INFO/)' $file | awk '{print $4}' | uniq -c | sort -r
and similarly for ERROR and WARNING.
CodePudding user response:
Here's a bash script that uses awk to count the messages of each level (ERROR, WARNING, INFO) separately. Note that it only counts messages logged in the last 4 hours, not every 4-hour interval in the file:
#!/bin/bash
file="your_log_file.log"
# Get the current timestamp (seconds since the epoch)
current_timestamp=$(date +%s)
# Get the timestamp for 4 hours ago
four_hours_ago=$((current_timestamp - 14400))
# Format the cutoff like the log timestamps (the @SECONDS syntax requires GNU date)
cutoff=$(date --date="@$four_hours_ago" '+%Y-%m-%d %T')
# Get the number of INFO messages in the last 4 hours
# (count+0 prints 0 instead of an empty string when nothing matches)
info_count=$(awk -v d="$cutoff" '$0 > d && $4 == "INFO" {count++} END {print count+0}' "$file")
# Get the number of ERROR messages in the last 4 hours
error_count=$(awk -v d="$cutoff" '$0 > d && $4 == "ERROR" {count++} END {print count+0}' "$file")
# Get the number of WARNING messages in the last 4 hours
warning_count=$(awk -v d="$cutoff" '$0 > d && $4 == "WARNING" {count++} END {print count+0}' "$file")
# Print the counts
echo "INFO: $info_count"
echo "ERROR: $error_count"
echo "WARNING: $warning_count"
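If reading the file three times is a concern, the same cutoff idea can probably be done in one awk pass. This is only a sketch: `count_since` is a made-up helper name, and it assumes the level is always the fourth field and that the timestamps compare correctly as plain strings, which holds for the format in the question.

```shell
#!/bin/bash
# count_since CUTOFF FILE
# Hypothetical helper (not part of the script above): counts INFO, ERROR and
# WARNING lines whose timestamp is later than CUTOFF ("YYYY-MM-DD HH:MM:SS"),
# reading FILE only once.
count_since() {
    awk -v d="$1" '
        # Fields 1 and 2 hold the date and time. Plain string comparison is
        # enough because this timestamp format sorts lexicographically.
        ($1 " " $2) > d && ($4 == "INFO" || $4 == "ERROR" || $4 == "WARNING") {
            n[$4]++
        }
        END {
            # %d turns a level that never appeared into 0.
            printf "INFO: %d\nERROR: %d\nWARNING: %d\n", n["INFO"], n["ERROR"], n["WARNING"]
        }
    ' "$2"
}
```

With GNU date it could be called as count_since "$(date --date='4 hours ago' '+%Y-%m-%d %T')" your_log_file.log.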
CodePudding user response:
First get the start of the 4-hour interval that contains the event, i.e. the largest multiple of 4 that does not exceed the hour: int(substr($2,1,2)/4) * 4 (e.g. for 07:59:10 this returns 4). Then format it nicely, e.g. to print 04:00-07:59, and then sort everything and uniq -c as you are doing already:
awk '($4 ~ /INFO/)' $file |
awk '{
x = int(substr($2,1,2)/4) * 4;
printf "%s %02d:00-%02d:59 %s\n", $1, x, x + 3, $4
}' |
sort |
uniq -c
This will print all the counts sorted by 4-hour intervals, e.g. for your example (with all lines, i.e. cat $file instead of awk '($4 ~ /INFO/)' $file) it gives:
1 2021-05-17 04:00-07:59 ERROR
1 2021-05-17 08:00-11:59 ERROR
1 2021-05-17 08:00-11:59 INFO
1 2021-05-17 08:00-11:59 WARNING
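The same bucketing can also be done in a single awk process by keeping the counts in an associative array, so sort only has to order the final summary lines. This is a sketch under the same assumptions as above (time in field 2, level in field 4); `count_by_interval` is a hypothetical function name, and LC_ALL=C is only there to make the sort order predictable.

```shell
#!/bin/bash
# count_by_interval FILE
# Hypothetical helper: prints "COUNT DATE HH:00-HH:59 LEVEL" for every
# 4-hour interval and level that occurs in FILE.
count_by_interval() {
    awk '
        $4 == "ERROR" || $4 == "WARNING" || $4 == "INFO" {
            # Start hour of the 4-hour interval containing this line.
            start = int(substr($2, 1, 2) / 4) * 4
            key = sprintf("%s %02d:00-%02d:59 %s", $1, start, start + 3, $4)
            count[key]++
        }
        END {
            # awk iterates arrays in unspecified order; sort fixes that below.
            for (key in count) printf "%d %s\n", count[key], key
        }
    ' "$1" | LC_ALL=C sort -k2
}
```

Called as count_by_interval "$file", it should give the same four lines as above for the sample log, with one difference: the counts are printed without the leading padding that uniq -c adds.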