Search level-errors with time range in log file

I have a log file in this format:

2021-05-17 07:59:10.496 33821 ERROR bla bla bla

2021-05-17 08:03:10.957 33821 WARNING bla bla bla

2021-05-17 08:12:10.094 33821 ERROR bla bla bla

2021-05-17 08:40:10.592 33821 INFO bla bla bla

I need to count the messages of each level (ERROR, WARNING, INFO) separately, in time intervals of 4 hours. So far I can count the messages of each level across the entire log file, but I don't know how to count them per 4-hour interval. I'm writing the script in bash and filtering with awk:

awk '($4 ~ /INFO/)' $file | awk '{print $4}' | uniq -c | sort -r

and similarly for ERROR and WARNING.

CodePudding user response：

Here's a bash script that uses awk to count the messages of each level (ERROR, WARNING, INFO) separately for the last 4 hours:

#!/bin/bash

file="your_log_file.log"

# Timestamp of 4 hours ago, in the same layout as the log lines (GNU date syntax)
cutoff=$(date --date='4 hours ago' '+%Y-%m-%d %T')

# $0 > d is a plain string comparison; it works because the log
# timestamps sort lexicographically in chronological order.

# Get the number of INFO messages in the last 4 hours
info_count=$(awk -v d="$cutoff" '$0 > d && $4 == "INFO" {count++} END {print count+0}' "$file")

# Get the number of ERROR messages in the last 4 hours
error_count=$(awk -v d="$cutoff" '$0 > d && $4 == "ERROR" {count++} END {print count+0}' "$file")

# Get the number of WARNING messages in the last 4 hours
warning_count=$(awk -v d="$cutoff" '$0 > d && $4 == "WARNING" {count++} END {print count+0}' "$file")

# Print the counts
echo "INFO: $info_count"
echo "ERROR: $error_count"
echo "WARNING: $warning_count"
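
The three awk calls above each read the whole file. As a rough sketch, the same counts could be collected in a single pass by keying an awk array on the level field. The function name count_levels_since is made up for illustration, and the cutoff is assumed to use the log's own "YYYY-MM-DD HH:MM:SS" layout:

```shell
#!/bin/bash

# count_levels_since CUTOFF FILE   (hypothetical helper, not part of the answer above)
# Prints how many INFO, ERROR and WARNING lines have a timestamp later than CUTOFF.
# CUTOFF must look like "2021-05-17 08:00:00" so that a plain string
# comparison against the start of each log line is also chronological.
count_levels_since() {
    awk -v d="$1" '
        $0 > d { n[$4]++ }      # field 4 is the level
        END {
            # Levels that never occurred print as 0
            printf "INFO: %d\nERROR: %d\nWARNING: %d\n", n["INFO"], n["ERROR"], n["WARNING"]
        }' "$2"
}

# Usage for "the last 4 hours" (GNU date assumed):
#   count_levels_since "$(date --date='4 hours ago' '+%Y-%m-%d %T')" your_log_file.log
```

Taking the cutoff as an argument also makes it possible to check the function against a fixed log excerpt rather than the current clock.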

CodePudding user response：

First, round each event's hour down to a multiple of 4: int(substr($2,1,2)/4) * 4 (e.g. for 07:59:10 this returns 4). Then format it as a range, e.g. 04:00-07:59, and finally sort everything and pipe it through uniq -c, as you are doing already:

awk '($4 ~ /INFO/)' $file |
    awk '{
        x = int(substr($2,1,2)/4) * 4;
        printf "%s %02d:00-%02d:59 %s\n", $1, x, x+3, $4
    }' |
    sort |
    uniq -c

This prints the counts sorted by 4-hour interval. For your example, run over all lines (i.e. with cat $file in place of awk '($4 ~ /INFO/)' $file), it gives:

      1 2021-05-17 04:00-07:59 ERROR
      1 2021-05-17 08:00-11:59 ERROR
      1 2021-05-17 08:00-11:59 INFO
      1 2021-05-17 08:00-11:59 WARNING
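
As a possible variation, the filtering, bucketing and counting can all be done inside one awk program, leaving only the final ordering to sort. This is a minimal sketch, not a drop-in replacement: the function name bucket_counts is hypothetical, and it assumes the level is always field 4 and the time always starts with a two-digit hour.

```shell
#!/bin/bash

# bucket_counts FILE   (hypothetical helper name)
# Prints "COUNT DATE HH:00-HH:59 LEVEL" for each 4-hour window and level in FILE.
bucket_counts() {
    awk '
        $4 == "INFO" || $4 == "WARNING" || $4 == "ERROR" {
            # Round the hour down to a multiple of 4: 07 -> 4, 08 -> 8
            start = int(substr($2, 1, 2) / 4) * 4
            key = sprintf("%s %02d:00-%02d:59 %s", $1, start, start + 3, $4)
            count[key]++
        }
        END {
            # awk does not guarantee any iteration order, so sort afterwards
            for (key in count)
                printf "%7d %s\n", count[key], key
        }' "$1" | LC_ALL=C sort -k2
}

# Usage:
#   bucket_counts your_log_file.log
```

Counting in an array avoids the separate sort before uniq -c, which may matter for large logs; the trailing sort only has to order one line per interval and level.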