Grep sorting dates

Time:04-12

Each day I have to manually identify each stuck call by looking for calls with a date older than the current day. I have managed to grep the required fields to identify the calls in question.

grep -e start -e instance stuckcdr.txt


CDR instance 153 [] > :
                start                    [12:04:2022][10:07:09:968]
CDR instance 200 [] > :
                start                    [12:04:2022][10:05:56:991]
CDR instance 209 [] > :
                start                    [12:04:2022][09:55:55:358]
CDR instance 216 [] > :
                start                    [12:04:2022][10:05:40:443]
CDR instance 218 [] > :
                start                    [12:04:2022][10:07:44:084]
CDR instance 221 [] > :
                start                    [12:04:2022][10:08:11:690]
CDR instance 222 [] > :
                start                    [12:04:2022][09:52:47:846]
CDR instance 223 [] > :
                start                    [07:04:2022][12:28:03:858]
CDR instance 225 [] > :
                start                    [12:04:2022][10:02:40:345]
CDR instance 226 [] > :
                start                    [12:04:2022][10:07:58:530]
CDR instance 227 [] > :
                start                    [03:04:2022][17:53:16:771]
CDR instance 231 [] > :
                start                    [12:04:2022][10:06:19:830]
CDR instance 234 [] > :
                start                    [12:04:2022][10:06:06:937]
CDR instance 237 [] > :
                start                    [04:04:2022][08:55:03:575]
CDR instance 238 [] > :
                start                    [07:04:2022][12:28:15:537]
CDR instance 242 [] > :
                start                    [12:04:2022][10:05:18:753]
CDR instance 243 [] > :
                start                    [07:04:2022][12:23:38:303]
CDR instance 244 [] > :
                start                    [12:04:2022][10:01:40:195]
CDR instance 245 [] > :
                start                    [12:04:2022][10:08:33:821]
CDR instance 246 [] > :
                start                    [12:04:2022][09:53:03:281]
CDR instance 247 [] > :
                start                    [12:04:2022][09:42:06:561]
CDR instance 248 [] > :
                start                    [12:04:2022][10:04:49:953]
CDR instance 249 [] > :
                start                    [12:04:2022][10:07:29:250]
CDR instance 250 [] > :
                start                    [12:04:2022][10:01:33:905]
CDR instance 253 [] > :
                start                    [12:04:2022][09:55:48:996]
CDR instance 254 [] > :
                start                    [07:04:2022][12:27:55:402]
CDR instance 255 [] > :
                start                    [12:04:2022][10:04:38:088]
CDR instance 256 [] > :
                start                    [12:04:2022][09:42:47:932]
CDR instance 258 [] > :
                start                    [12:04:2022][09:57:16:372]
CDR instance 259 [] > :
                start                    [12:04:2022][09:46:35:323]
CDR instance 260 [] > :
                start                    [12:04:2022][10:05:19:144]
CDR instance 262 [] > :
                start                    [12:04:2022][09:52:56:531]
CDR instance 263 [] > :
                start                    [12:04:2022][10:07:50:331]

Can I also filter the data by start date older than the current date?

It would massively speed up the process :)

Thanks

CodePudding user response:

You can get closer by asking grep to exclude lines that match today's date; this assumes you don't have any records for future timestamps!

grep -e start -e instance stuckcdr.txt | grep -v "$(date +'%d:%m:%Y')"

This leaves today's "instance" header lines in the output, but the historic dates and data should be much easier to spot.
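If you also want today's "instance" header lines gone, one possible extension is to keep only the remaining start lines together with the line before each of them. This is just a sketch: drop_today is a made-up helper name, the optional second argument exists only to make testing easier, and -B (lines of leading context) is a GNU/BSD grep option rather than POSIX.

```shell
#!/bin/sh
# drop_today FILE [DD:MM:YYYY]
# Hypothetical helper: prints instance/start pairs whose start date is not today.
# The second argument overrides today's date (an assumption, handy for testing).
drop_today() {
    today=${2:-$(date +'%d:%m:%Y')}
    grep -e start -e instance "$1" |
        grep -v -F "[$today]" |   # drop start lines stamped with today's date
        grep -B1 start |          # keep each remaining start line plus the line before it
        grep -v -e '^--$'         # remove the group separators that -B inserts
}
```

Usage would be drop_today stuckcdr.txt. Like the one-liner above, it assumes there are no future timestamps and that each start line directly follows its instance line.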

CodePudding user response:

Using awk:

LC_ALL=C awk -v date="$(date '+[%d:%m:%Y][00:00:00:000]')" '
$2=="instance"{instance=$0}
$1=="start" && $2<date {$1=$1; print instance,$0}' stuckcdr.txt

This prints entries from before the start of the current day.

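We compare the date strings lexically. This works because every field (day, month, hour, etc.) has a fixed width, but note that the format is DD:MM:YYYY, so the comparison is only chronological for entries from the same month and year as the cutoff. LC_ALL=C guarantees consistent behaviour in different locales.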

To print entries older than 'now' (when the script is run, to the nearest second, as opposed to older than 'today'), use this date syntax: date="$(date '+[%d:%m:%Y][%H:%M:%S:000]')".

$1=$1 just trims the whitespace. I also print instance number and date on the same line, for readability.

Example output:

CDR instance 223 [] > : start [07:04:2022][12:28:03:858]
CDR instance 227 [] > : start [03:04:2022][17:53:16:771]
CDR instance 237 [] > : start [04:04:2022][08:55:03:575]
CDR instance 238 [] > : start [07:04:2022][12:28:15:537]
CDR instance 243 [] > : start [07:04:2022][12:23:38:303]
CDR instance 254 [] > : start [07:04:2022][12:27:55:402]

You can also sort the output by piping it to sort -k 8,8 (by date string, which is chronological only within a single month because the day comes first) or sort -n -k 3,3 (by instance number).
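If stuck calls can span a month or year boundary, a variant that compares on a rearranged YYYYMMDD key may be safer. The following is a minimal sketch rather than a drop-in replacement: older_than_day is a hypothetical helper name, the optional cutoff argument is my own addition, and it assumes the start field always looks like [DD:MM:YYYY][HH:MM:SS:mmm].

```shell
#!/bin/sh
# older_than_day FILE [YYYYMMDD]
# Hypothetical helper: prints instances that started before the cutoff day,
# oldest first. The cutoff defaults to today.
older_than_day() {
    file=$1
    cutoff=${2:-$(date +%Y%m%d)}
    LC_ALL=C awk -v cutoff="$cutoff" '
        $2 == "instance" { instance = $0 }   # remember the header line
        $1 == "start" {
            # $2 looks like [DD:MM:YYYY][HH:MM:SS:mmm]; build a YYYYMMDD key
            key = substr($2, 8, 4) substr($2, 5, 2) substr($2, 2, 2)
            if (key < cutoff) {
                $1 = $1                      # squeeze the whitespace
                print key, instance, $0
            }
        }' "$file" |
        LC_ALL=C sort -k1,1 |                # chronological, since the year comes first
        cut -d' ' -f2-                       # strip the helper key again
}
```

Usage would be older_than_day stuckcdr.txt, or older_than_day stuckcdr.txt 20220412 to pin the cutoff. Putting the year first is what makes a plain string sort chronological here.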

CodePudding user response:

Would you please try the combination of GNU sed, sort and other commands:

#!/bin/bash

today=$(date +%F)
sed -E '/^CDR/N;s/\n//' input_file.txt | sed -E 's/(^[^][]+\[][^][]+\[([0-9]{2}):([0-9]{2}):([0-9]{4}).*)/\4-\3-\2\t\1/' | sort -k1,1 | sed "/^$today/{d;q}" | cut -f2- | sed 's/> :/\n/'

Result:

CDR instance 227 [] 
                start                    [03:04:2022][17:53:16:771]
CDR instance 237 [] 
                start                    [04:04:2022][08:55:03:575]
CDR instance 223 [] 
                start                    [07:04:2022][12:28:03:858]
CDR instance 238 [] 
                start                    [07:04:2022][12:28:15:537]
CDR instance 243 [] 
                start                    [07:04:2022][12:23:38:303]
CDR instance 254 [] 
                start                    [07:04:2022][12:27:55:402]
  • First, today's date is assigned to the variable today in the format YYYY-MM-DD.
  • sed -E '/^CDR/N;s/\n//' merges each pair of lines into one to make the sort easy.
  • sed -E 's/(^[^][]+\[][^][]+\[([0-9]{2}):([0-9]{2}):([0-9]{4}).*)/\4-\3-\2\t\1/' extracts the date field, then prepends it to the line, rearranged into the format YYYY-MM-DD and followed by a tab.
  • sort -k1,1 sorts the lines by date, older first, newer last.
  • sed "/^$today/{d;q}" drops the lines dated today, leaving only the older ones (d starts the next cycle immediately, so the q is never reached).
  • cut -f2- removes the date field preceding the line.
  • sed 's/> :/\n/' splits each line back into two, replacing the "> :" separator with a newline.
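To see what the key-building step does, it can be run on a single merged record. This is only an illustration: add_date_key is a made-up wrapper name, the sample line is taken from the question's data, and \t in the replacement relies on GNU sed. With GNU sed the record should come back prefixed with 2022-04-07 and a tab.

```shell
#!/bin/sh
# add_date_key: hypothetical wrapper around the sed expression described above.
# Reads merged one-line records on stdin and prepends "YYYY-MM-DD<TAB>".
add_date_key() {
    sed -E 's/(^[^][]+\[][^][]+\[([0-9]{2}):([0-9]{2}):([0-9]{4}).*)/\4-\3-\2\t\1/'
}

# One merged record (header and start line already joined)
line='CDR instance 223 [] > :                start                    [07:04:2022][12:28:03:858]'
printf '%s\n' "$line" | add_date_key
```

Because the key is in year-month-day order, the following sort -k1,1 orders the records chronologically even across months.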