I have a use case: find the records in an application log file that contain specific keywords.
I tried grep, but it uses \n as the line separator, so log records whose messages contain \n are only partially fetched.
A sample application log file (each line shown below is a separate line, i.e. it ends with \n):
2017-11-22 01:43:36 LogManager : Currently processing data {Name: Hello}
Fetching last name
{LastName : World}
2017-11-22 03:12:23 LogManager : Currently processing data {Name: Dummy}
Fetching last name
{LastName : Value}
SomeRandomMessage
2017-11-22 03:12:23 LogManager : SomeRandomMessage
Currently processing data {Name: Dummy2}
Fetching last name
SomeRandomMessage
{LastName : Value3}
.
.
.
.
I want to use YYYY-MM-DD HH:MM:SS as a record separator and then, within each record, check whether it contains both Hello and World (for example).
Expected output:
2017-11-22 01:43:36 LogManager : Currently processing data {Name: Hello}
Fetching last name
{LastName : World}
What I've tried:
grep 'Hello' fileName
which returns only the first line of the record:
2017-11-22 01:43:36 LogManager : Currently processing data {Name: Hello}
CodePudding user response:
Using any POSIX awk:
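$ cat tst.awk
# a line starting with "YYYY-MM-DD HH:MM:SS " begins a new record
/^[0-9]{4}(-[0-9]{2}){2} [0-9]{2}(:[0-9]{2}){2} / {
    prt()               # test/print the record collected so far
    rec = $0
    next
}
{ rec = rec ORS $0 }    # continuation line: append it to the current record
END {
    prt()               # test/print the final record
}
function prt() {
    if ( (rec ~ /Hello/) && (rec ~ /World/) ) {
        print rec
    }
}
$ awk -f tst.awk file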
2017-11-22 01:43:36 LogManager : Currently processing data {Name: Hello}
Fetching last name
{LastName : World}
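If the keywords vary, or contain regex metacharacters such as { or ., the same approach can be wrapped in a shell function. This is only a sketch under a few assumptions: find_records, KW1 and KW2 are made-up names, the keywords are matched as literal substrings with index() rather than as regexes, and the timestamp pattern is spelled out without {n} intervals for awks that don't support them.

```shell
# Hypothetical helper: print every log record that contains both literal strings.
# Usage: find_records FILE KEYWORD1 KEYWORD2
find_records() {
    # Keywords go through the environment so awk does not interpret backslashes in them.
    KW1=$2 KW2=$3 awk '
        # A line that starts with "YYYY-MM-DD HH:MM:SS " opens a new record.
        /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9] / {
            prt()
            rec = $0
            next
        }
        { rec = rec ORS $0 }    # continuation line: append to the current record
        END { prt() }           # flush the final record
        function prt() {
            # index() does a plain substring search, so "{" or "." need no escaping
            if (index(rec, ENVIRON["KW1"]) && index(rec, ENVIRON["KW2"]))
                print rec
        }
    ' "$1"
}

# Small sample file taken from the question, for trying it out
log=$(mktemp)
cat > "$log" <<'EOF'
2017-11-22 01:43:36 LogManager : Currently processing data {Name: Hello}
Fetching last name
{LastName : World}
2017-11-22 03:12:23 LogManager : Currently processing data {Name: Dummy}
Fetching last name
{LastName : Value}
EOF

find_records "$log" '{Name: Hello}' '{LastName : World}'
```

With the sample above this should print only the three lines of the 01:43:36 record; Hello together with Value prints nothing, since they sit in different records.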
CodePudding user response:
"I want to use YYYY-MM-DD HH:MM:SS as a record separator"
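You may use this gnu-awk command:
awk -v RS='[0-9]{4}(-[0-9]{2}){2} ([0-9]{2}:){2}[0-9]{2}' '
/Hello/ && /World/ {printf "%s", prev $0}
{prev = RT}' file
2017-11-22 01:43:36 LogManager : Currently processing data {Name: Hello}
Fetching last name
{LastName : World}
Here -v RS='[0-9]{4}(-[0-9]{2}){2} ([0-9]{2}:){2}[0-9]{2}' sets the record separator to the date-time string. In gawk, RT holds the text that matched RS at the end of the current record, i.e. the timestamp of the next log entry, so each record's own timestamp is the RT that was saved in prev while reading the previous record. When a record matches both Hello and World, it is printed with that saved timestamp in front.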
CodePudding user response:
UPDATE - original answer (see edit/revision) made a few assumptions based on OP's sample input; OP has since stated (per comments) that the sample input is not representative of an actual log file ...
Per comments from OP:
- search pattern(s) won't necessarily reside in the 1st (and/or last) line of a log entry
- a log entry may have a variable number of lines
- cannot rely on the string LastName being on the last line of the log entry (and at this point I'm going to assume a log entry may not even contain the string LastName)
Assumptions:
- a log entry will always start with a date of the format YYYY-MM-DD as the first field of the first line of said log entry
- a search pattern will not span multiple lines
- need to support up to 2 search patterns (more could be added with a redesign)
Adding some additional sample data:
$ cat fileName
2017-11-22 01:43:36 LogManager : Currently processing data {Name: Hello}
Fetching last name
{LastName : World}
last line for this entry
2017-11-22 03:12:23 LogManager : Currently processing data {Name: Dummy}
Fetching last name
{LastName : Value}
last line?
nope, this is the last line
2017-11-22 05:17:33 LogManager : Currently processing data {Name: Dummy2}
Fetching last name
{LastName : Value3}
2017-11-22 12:13:02 LogManager : Currently processing data {Name: WhoYaCalinDummy}
Fetching last name
{LastName : WhoMe}
One awk idea:
findme1='Hello'
findme2='World'
awk -v ptn1="${findme1}" -v ptn2="${findme2}" '
function test_and_print() {
    if (log_entry ~ ptn1 && log_entry ~ ptn2)       # if ptn1/ptn2 show up anywhere in our log entry then ...
        print log_entry                             # print it to stdout
    log_entry=""                                    # reset our variable
}
BEGIN { date_regex="[0-9]{4}-[0-9]{2}-[0-9]{2}" }
$1 ~ date_regex { test_and_print() }                # new log entry so test the previous entry
{ log_entry= log_entry (log_entry ? RS : "") $0 }   # append current line to current log entry
END { test_and_print() }                            # test the last entry
' fileName
For findme1='Hello'; findme2='World' this generates:
2017-11-22 01:43:36 LogManager : Currently processing data {Name: Hello}
Fetching last name
{LastName : World}
last line for this entry
For findme1='Hello'; findme2='Bob' this generates:
# no output
For findme1='WhoMe'; findme2='' this generates:
2017-11-22 12:13:02 LogManager : Currently processing data {Name: WhoYaCalinDummy}
Fetching last name
{LastName : WhoMe}
For findme1='XXXX'; findme2='' this generates:
# no output
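As for the "more could be added with a redesign" remark, one possible redesign is to hand the patterns to awk as extra arguments and loop over them. This is a rough sketch, not a drop-in replacement: find_all is a made-up name, the patterns are still treated as regexes (as above), and the date regex is spelled out without {n} intervals and anchored to the whole first field so it also runs on awks lacking interval support.

```shell
# Hypothetical helper: print every log entry that matches ALL given patterns.
# Usage: find_all FILE PATTERN...
find_all() {
    file=$1
    shift
    awk '
        BEGIN {
            # Every argument except the last one (the log file) is a search pattern.
            for (i = 1; i < ARGC - 1; i++) {
                ptn[++n] = ARGV[i]
                ARGV[i] = ""            # blank it so awk does not open it as a file
            }
        }
        function test_and_print(    k) {
            if (log_entry != "") {
                for (k = 1; k <= n; k++)
                    if (log_entry !~ ptn[k])
                        break
                if (k > n)              # loop ran to completion: every pattern matched
                    print log_entry
            }
            log_entry = ""              # reset for the next entry
        }
        # a first field shaped like YYYY-MM-DD starts a new log entry
        $1 ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ { test_and_print() }
        { log_entry = log_entry (log_entry != "" ? ORS : "") $0 }
        END { test_and_print() }
    ' "$@" "$file"
}

# Same sample data as fileName above, written to a temp file
log=$(mktemp)
cat > "$log" <<'EOF'
2017-11-22 01:43:36 LogManager : Currently processing data {Name: Hello}
Fetching last name
{LastName : World}
last line for this entry
2017-11-22 03:12:23 LogManager : Currently processing data {Name: Dummy}
Fetching last name
{LastName : Value}
last line?
nope, this is the last line
2017-11-22 05:17:33 LogManager : Currently processing data {Name: Dummy2}
Fetching last name
{LastName : Value3}
2017-11-22 12:13:02 LogManager : Currently processing data {Name: WhoYaCalinDummy}
Fetching last name
{LastName : WhoMe}
EOF

find_all "$log" Hello World 'last line'
```

With three patterns this should print just the 01:43:36 entry. Called with no patterns at all, the sketch prints every entry, which may or may not be what you want.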
CodePudding user response:
Since these are logs, and you mentioned they all share the same format, try the following GNU awk code. The regex it uses is:
(^|\n)[0-9]{4}(-[0-9]{2}){2} ([0-9]{2}:){2}[0-9]{2} LogManager : [^{]*{Name: Hello}\nFetching last name\n{LastName : World
awk -v RS="" '
{while(match($0,/(^|\n)[0-9]{4}(-[0-9]{2}){2} \
([0-9]{2}:){2}[0-9]{2} LogManager : \
[^{]*{Name: Hello}\nFetching last name\n\
{LastName : World/)){
print substr($0,RSTART,RLENGTH)
$0=substr($0,RSTART+RLENGTH)
}
}
' Input_file