I am trying to write an automated script that takes the latest log entry and collects all the log entries from the two hours before it, regardless of whether any log entries exist during that time. The issue I keep running into in my research is that all the examples I find have a date attached to each entry, and mine do not. A sample of the log output is:
13:26:28.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
13:26:28.713687 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9287:9522, ack 13044, win 420, length 235
13:26:28.713766 IP term-IdeaPad-Flex.46364 > unn-37-19-198-173.datapacket.com.https: Flags [.], ack 9522, win 24576, length 0
13:26:28.840650 IP term-IdeaPad-Flex.46364 > unn-37-19-198-173.datapacket.com.https: Flags [.], seq 14286:15624, ack 9522, win 24576, length 1338
13:26:28.848949 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9522:9599, ack 14286, win 420, length 77
13:26:28.849002 IP term-IdeaPad-Flex.46364 > unn-37-19-198-173.datapacket.com.https: Flags [P.], seq 15624:15674, ack 9599, win 24576, length 50
13:26:28.849023 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9599:9743, ack 14286, win 420, length 144
13:26:28.849031 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9743:10269, ack 14286, win 420, length 526
So my time is at the front, with no date. The date command does not accept this: it gives me a date: invalid date ‘+%s’ response and does not output anything.
My current work is:
#!/bin/bash
truncate -s 0 twoHour.log
NEW=$(tail -n1 $1 | cut -d ":" -f1)
# echo $NEW
New=$(date -d "$NEW" +%s)
OLD=$(($NEW-2))
New=$(date -d "$OLD" +%s)
# echo $OLD
START=$(egrep "$NEW\:\d\d\:\d\d" $1 | tail | date -d +%s)
END=$(egrep "$OLD\:\d\d\:\d\d" $1 | head | date -d +%s)
while read line; do
# Extract the date for each line.
# First strip off everything up to the first "[".
# Then remove everything after the first "]".
# Finally, straighten up the format with the cleandate function
date="${date%%.*}"
date=$( cleandate "$date" )
# If the date falls between d1 and d2, print it
if [[ $date -ge $START && $date -le $END ]]; then
echo "$line"
fi
done
NEW and OLD are for the hours that are getting extracted. START and END are the boundaries where everything between the two gets outputted line by line. The $1 is for a log file.
I have been trying to modify bash/awk scripts and searching for any premade ones for several hours now, so I am at a loss on how to get this to work.
CodePudding user response:
sed can be used to extract a range of lines delimited by regexp addresses:
/^11:.*$/,/^13:26:28.849031 .*$/p
The first address can be refined further by taking the minute digits of the end time and adding them to the expression:
/^11:(2[6-9]|[3-5][0-9]).*$/,/^13:26:28.849031 .*$/p
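last_line=$(tail -n1 test.txt)
end_time=$(cut -d ' ' -f1 <<<"$last_line")    # e.g. 13:26:28.849031
end_hour="${end_time:0:2}"
min_msb="${end_time:3:1}"                     # tens digit of the minutes
min_lsb="${end_time:4:1}"                     # units digit of the minutes
# 10# forces base 10 so that "08" and "09" are not read as octal;
# %02d keeps the leading zero so that e.g. "08" still matches the log
start_hour=$(printf '%02d' $((10#$end_hour - 2)))
if [ "$min_msb" -lt 5 ]; then
    min_next=$((min_msb + 1))
else
    min_next=5
fi
sed -rn "/^$start_hour:($min_msb[$min_lsb-9]|[$min_next-5][0-9]).*$/,/^$end_time .*$/p" test.txt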
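To see how the address range behaves, here is a small sketch on a few made-up lines (the sample data and the variable names sample and result are assumptions for illustration, not taken from the real log):

```shell
#!/bin/bash
# Made-up sample lines; only the leading timestamp matters here.
sample='10:59:00.000001 A
11:30:00.000002 B
12:15:00.000003 C
13:26:28.849031 D
13:26:29.000000 E'

# -E is the portable spelling of GNU sed's -r (extended regular expressions).
# The range opens at the first line from 11:26 onwards within hour 11 and
# closes at the line carrying the end timestamp.
result=$(sed -En '/^11:(2[6-9]|[3-5][0-9])/,/^13:26:28\.849031 /p' <<<"$sample")
echo "$result"
```

This prints lines B, C and D. Note that sed only opens the range when some line actually matches the first address, so if the log has no entry between 11:26 and 11:59 nothing is printed at all.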
If the two-hour window crosses midnight, for example:
22:57:46.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
23:26:28.709883 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9136:9287, ack 13044, win 420, length 151
...
00:36:28.849031 IP unn-37-19-198-173.datapacket.com.https > term-IdeaPad-Flex.46364: Flags [P.], seq 9743:10269, ack 14286, win 420, length 526
Then
last_line=$(tail -n1 test.txt)
#end_time=$(cut -d ' ' -f1 <<<"$last_line")
end_time="${last_line:0:8}"                   # HH:MM:SS without the fraction
# let date do the subtraction so that the hour wraps around midnight
start_time="$(date -d "$(date -d "$end_time" --iso=seconds) -2 hour" '+%T')"
#echo "$start_time - $end_time"
end_hour=$((10#${end_time:0:2}))              # 10# avoids octal errors on 08 and 09
min_msb="$(printf "%d" ${end_time:3:1})"
min_lsb="$(printf "%d" ${end_time:4:1})"
start_hour="${start_time:0:2}"
if [ "$min_msb" -lt 5 ]; then
    min_next=$((min_msb + 1))
else
    min_next=5
fi
echo "sed expression: /^$start_hour:($min_msb[$min_lsb-9]|[$min_next-5][0-9]).*$/,/^$end_time.*$/p"
sed -rn "/^$start_hour:($min_msb[$min_lsb-9]|[$min_next-5][0-9]).*$/,/^$end_time.*$/p" test.txt
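If GNU date is not available, the start time can also be computed with plain bash arithmetic. The following is only a sketch: hours_back is a hypothetical helper name, it expects HH:MM:SS without the fractional part, and it assumes the offset is less than 24 hours.

```shell
#!/bin/bash
# Hypothetical helper (not part of the script above): subtract a number of
# hours from an HH:MM:SS time of day, wrapping around midnight.
hours_back() {
    local t=$1 back=$2 h m s total
    IFS=: read -r h m s <<<"$t"
    # 10# forces base 10 so "08" and "09" are not treated as octal;
    # adding 86400 before the modulo keeps the result non-negative
    total=$(( (10#$h * 3600 + 10#$m * 60 + 10#$s - back * 3600 + 86400) % 86400 ))
    printf '%02d:%02d:%02d\n' $((total / 3600)) $((total % 3600 / 60)) $((total % 60))
}

hours_back 13:26:28 2    # prints 11:26:28
hours_back 00:36:28 2    # prints 22:36:28
```

The first two characters of the result can then be used as start_hour in the sed expression.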
CodePudding user response:
Assumptions:
- the offset (2 hours in OP's example) is less than 24 hours
- every line starts with a timestamp of the format HH:MM:SS
- the log may span multiple days
Plan:
- convert offset (eg, 2 hrs) to seconds; we'll call this offset_secs
- grab time from last line of file; we'll call this last_time
- convert the timestamp to epoch/seconds; we'll call this last_epoch
- subtract offset_secs from last_epoch; we'll call this first_epoch
- convert first_epoch back into a HH:MM:SS string; we'll call this first_time
- to address the file's timestamps spanning multiple midnights we'll save the lines of interest in an array, resetting the array when we find we have another midnight to go
- during awk/END processing we print our array of lines to stdout
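The first five steps of the plan can be sketched on their own in bash, assuming GNU date is available. The function name window_start is made up for illustration, and TZ=UTC is only there so that a daylight-saving change cannot skew the subtraction:

```shell
#!/bin/bash
# Hypothetical helper: given last_time (HH:MM:SS) and offset_secs,
# print first_time. Requires GNU date for -d and the @epoch syntax.
window_start() {
    local last_time=$1 offset_secs=$2 last_epoch first_epoch
    last_epoch=$(TZ=UTC date -d "$last_time" +%s)    # today's date is assumed
    first_epoch=$((last_epoch - offset_secs))
    TZ=UTC date -d "@$first_epoch" +%T               # back to HH:MM:SS
}

window_start 11:51:50 $((2*60*60))    # prints 09:51:50
window_start 02:51:50 $((4*60*60))    # prints 22:51:50 (crossed midnight)
```

When the printed first_time sorts after last_time, as in the second call, the window spans midnight.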
One GNU awk idea:
$ cat log.awk
BEGIN { FS="." }                      # set input field delimiter to "."
# first line of input is last line of log file; grab time and calculate the offset/start time
NR==1 { last_time   = $1
        last_epoch  = mktime( strftime("%Y %m %d") " " gensub(/:/," ","g",last_time))
        first_epoch = last_epoch - offset_secs
        first_time  = strftime("%H:%M:%S", first_epoch)
        if (first_time > last_time)
           spans_midnight=1
        next
      }
# for the rest of the input lines determine if the time falls within the last "offset_secs"
      { curr_time = $1
        if ( (  spans_midnight && curr_time >= first_time) ||
             (  spans_midnight && curr_time <= last_time)  ||
             ( !spans_midnight && curr_time >= first_time && curr_time <= last_time) )
           lines[++cnt]=$0
        else {                        # outside the time range so ...
           delete lines               # delete anything saved up to this point and ...
           cnt=0                      # reset the array index
        }
      }
END   { for (i=1;i<=cnt;i++)          # print the lines that occurred within the last "offset_secs"
            print lines[i]
      }
NOTE: see GNU awk: Time Functions for more details on the mktime() and strftime() functions
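The three-way test in the main block works because zero-padded HH:MM:SS strings sort in time order, so plain string comparison is enough. A minimal sketch of just that test, wrapped in a hypothetical shell function in_window (the name and the wrapper are assumptions; it uses only portable awk features, no GNU extensions):

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit status 0) when time t lies inside the
# window first..last, where the window may wrap past midnight.
in_window() {
    awk -v t="$1" -v first="$2" -v last="$3" 'BEGIN {
        if (first > last)                      # window spans midnight
            ok = (t >= first || t <= last)
        else
            ok = (t >= first && t <= last)
        exit !ok
    }'
}

in_window 10:24:37 09:51:50 11:51:50 && echo "keep"      # inside a same-day window
in_window 00:51:50 22:51:50 02:51:50 && echo "keep"      # inside a window that wraps
in_window 09:23:00 22:51:50 02:51:50 || echo "ignore"    # outside the wrapped window
```

The sample times are taken from the two tests below.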
Test #1: last 2 hours; does not span midnight; file spans midnight
$ cat sample.log
22:22:00.896232 IP 104.16.42.63.https ignore this line
06:22:00.896232 IP 104.16.42.63.https ignore this line; crossed midnight
07:22:00.896232 IP 104.16.42.63.https ignore this line
09:23:00.896232 IP 104.16.42.63.https ignore this line
09:51:49.896232 IP 104.16.42.63.https ignore this line
09:51:50.896232 IP 104.16.42.63.https keep this line
10:24:37.896232 IP 104.16.42.63.https keep this line
11:51:50.896232 IP 104.16.42.63.https keep this line
$ offset_secs=$((2*60*60)) # 2 hours
$ awk -v offset_secs="${offset_secs}" -f log.awk <(tail -1 sample.log) sample.log
09:51:50.896232 IP 104.16.42.63.https keep this line
10:24:37.896232 IP 104.16.42.63.https keep this line
11:51:50.896232 IP 104.16.42.63.https keep this line
Test #2: last 4 hours; spans midnight; file spans multiple midnights
$ cat sample.log
20:22:00.896232 IP 104.16.42.63.https ignore this line
23:22:00.896232 IP 104.16.42.63.https ignore this line
01:22:00.896232 IP 104.16.42.63.https ignore this line; crossed midnight
23:22:00.896232 IP 104.16.42.63.https ignore this line
01:22:00.896232 IP 104.16.42.63.https ignore this line; crossed midnight
06:22:00.896232 IP 104.16.42.63.https ignore this line
07:22:00.896232 IP 104.16.42.63.https ignore this line
09:23:00.896232 IP 104.16.42.63.https ignore this line
22:51:49.896232 IP 104.16.42.63.https ignore this line
22:51:50.896232 IP 104.16.42.63.https keep this line
23:07:37.896232 IP 104.16.42.63.https keep this line
00:51:50.896232 IP 104.16.42.63.https keep this line; crossed midnight
01:24:37.896232 IP 104.16.42.63.https keep this line
02:51:50.896232 IP 104.16.42.63.https keep this line
$ offset_secs=$((4*60*60)) # 4 hours
$ awk -v offset_secs="${offset_secs}" -f log.awk <(tail -1 sample.log) sample.log
22:51:50.896232 IP 104.16.42.63.https keep this line
23:07:37.896232 IP 104.16.42.63.https keep this line
00:51:50.896232 IP 104.16.42.63.https keep this line; crossed midnight
01:24:37.896232 IP 104.16.42.63.https keep this line
02:51:50.896232 IP 104.16.42.63.https keep this line