I am trying to grep a value by key from a log4j log file. The script works fine, but is there any way to include the timestamp in the output as well?
My script is:
awk '/duration=/ {print $4}' mylog.log | awk '{print substr($1, 10, length($1)-10)}' | less
The log file is:
WARN [20.07.22 00:02:43.647] TaskManager .process() Processing task took longer than set threshold
|-| |-| delayEvent=DelayEvent{eventName='UpdateUser', duration=65, endTime=1658275363647, averageValue=0, maxValue=637} | delayThreshold=50
|-| nanos=45293222359715481 | threadId=523 | threadName=ThreadManager | timestamp=1658275363647 | mid=nag1uu-1#42616
|-| layer=BAA |-| [ SISS]
WARN [20.07.22 00:02:44.689] TaskManager .process() Processing task took longer than set threshold
|-| |-| delayEvent=DelayEvent{eventName='UpdateUser', duration=88, endTime=1658275364689, averageValue=0, maxValue=637} | delayThreshold=50
|-| nanos=45293223401808770 | threadId=523 | threadName=ThreadManager | timestamp=1658275364689 | mid=nag1uu-1#42616
|-| layer=BAA |-| [ SISS]
The output is:
65
88
My expected result is:
20.07.22 00:02:43.647 65
20.07.22 00:02:44.689 88
Is there any way to achieve this? Thanks a lot in advance.
CodePudding user response:
With your shown samples, please try the following awk code. Written and tested in GNU awk.
awk -v RS='\\[([0-9]{2}\\.){2}[0-9]{2} ([0-9]{2}:){2}[0-9]{2}\\.[0-9]{3}[^\n]*\n[^\n]*duration=[0-9]+' '
RT{
num=split(RT,arr,"[][,]")
sub(/duration=/,"",arr[num])
print arr[2],arr[num]
}
' Input_file
Or, using only split (no sub to pull the actual values out of RT), try the following, which is a minor tweak of the awk code above.
awk -v RS='\\[([0-9]{2}\\.){2}[0-9]{2} ([0-9]{2}:){2}[0-9]{2}\\.[0-9]{3}[^\n]*\n[^\n]*duration=[0-9]+' '
RT{
num=split(RT,arr,"[][,]|duration=")
print arr[2],arr[num]
}
' Input_file
Explanation of regex:
\\[([0-9]{2}\\.){2}[0-9]{2} ##Match a literal [ followed by (2 digits followed by a dot), that combination 2 times, followed by 2 digits.
 ([0-9]{2}:){2}[0-9]{2}\\.[0-9]{3} ##Match a space followed by (2 digits followed by a colon), that combination 2 times, followed by 2 digits, a dot and 3 digits.
[^\n]*\n[^\n]*duration=[0-9]+ ##Match everything up to the newline, then the newline, then everything on the next line up to duration= followed by 1 or more digits, as per the requirement.
CodePudding user response:
The first Awk script is throwing away that information; but presumably what you want can be obtained by refactoring everything into a single Awk script, like it should have been done in the first place.
awk '/^[^ \t]/ { sub(/^\[/, "", $2); sub(/\]$/, "", $3); when=$2 " " $3}
$4 ~ /^duration=/ {print when "\t" substr($4, 10, length($4)-10)}' mylog.log
CodePudding user response:
I would exploit GNU AWK's paragraph mode for this task in the following way. Let the content of file.txt be
WARN [20.07.22 00:02:43.647] TaskManager .process() Processing task took longer than set threshold
|-| |-| delayEvent=DelayEvent{eventName='UpdateUser', duration=65, endTime=1658275363647, averageValue=0, maxValue=637} | delayThreshold=50
|-| nanos=45293222359715481 | threadId=523 | threadName=ThreadManager | timestamp=1658275363647 | mid=nag1uu-1#42616
|-| layer=BAA |-| [ SISS]
WARN [20.07.22 00:02:44.689] TaskManager .process() Processing task took longer than set threshold
|-| |-| delayEvent=DelayEvent{eventName='UpdateUser', duration=88, endTime=1658275364689, averageValue=0, maxValue=637} | delayThreshold=50
|-| nanos=45293223401808770 | threadId=523 | threadName=ThreadManager | timestamp=1658275364689 | mid=nag1uu-1#42616
|-| layer=BAA |-| [ SISS]
then
awk 'BEGIN{RS="";FS="[\]\[]"}match($0,/duration=[[:digit:]]+/){print $2, substr($0, RSTART+9, RLENGTH-9)}' file.txt
gives output
20.07.22 00:02:43.647 65
20.07.22 00:02:44.689 88
Explanation: I set RS to the empty string to activate paragraph mode, so everything between blank lines is treated as a single record, and I set the field separator to a literal [ or a literal ]. For every record containing duration= followed by 1 or more digits, I print the 2nd field (the timestamp) followed by a substring whose start and length I calculate from where the match is: as duration= has 9 characters, I offset both the start and the length by that value.
(tested in gawk 4.2.1)
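As a hedged illustration of the match()/RSTART/RLENGTH arithmetic described above, here is a minimal sketch that avoids paragraph mode, so it does not depend on blank lines between records and should run in any POSIX awk. The function name extract_durations and the variable ts are made up for this example. It assumes each record starts with a header line of the form LEVEL [timestamp] and that a later line of the same record contains duration= followed by digits.

```shell
# Hypothetical helper: print "<timestamp> <duration>" for each log record.
extract_durations() {
    awk '
        # Header line such as: WARN [20.07.22 00:02:43.647] ...
        # Remember the text between the first "[" and the following "]".
        /^[ \t]*[A-Z]+ +\[/ && match($0, /\[[^]]+\]/) {
            ts = substr($0, RSTART + 1, RLENGTH - 2)
        }
        # Detail line: "duration=" is 9 characters long, so the digits
        # start at RSTART + 9 and span RLENGTH - 9 characters.
        match($0, /duration=[0-9]+/) {
            print ts, substr($0, RSTART + 9, RLENGTH - 9)
        }
    ' "$@"
}
```

It can be called as extract_durations mylog.log, or fed from a pipe with no file argument.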
CodePudding user response:
GNU Awk
Using = and , as delimiters puts the duration in $5 and the endTime in $7:
awk -F '[=,]' '
/duration=/{
printf "%s.%s\t%s\n",
# print date.milliseconds<tab>duration
strftime("%d.%m.%y %H:%M:%S", substr($7,1,length($7)-3),1),
# or strftime("%d.%m.%y %H:%M:%S", $7/1000, 1),
# convert Timestamp to date
substr($7,length($7)-2),$5
# add milliseconds & duration
}' mylog.log
20.07.22 00:02:43.647 65
20.07.22 00:02:44.689 88
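The split of the endTime value into whole seconds and milliseconds can also be sketched with plain arithmetic instead of substr. This is only an illustration of that one step: the function name split_epoch_ms is hypothetical, and turning the seconds into a calendar date would still need gawk's strftime as in the script above.

```shell
# Hypothetical helper: split an epoch-millisecond value into seconds and ms.
split_epoch_ms() {
    awk -v t="$1" 'BEGIN {
        secs = int(t / 1000)       # whole seconds since the epoch
        ms   = t - secs * 1000     # remaining milliseconds (0-999)
        # %03d keeps leading zeros, so 7 ms prints as 007
        printf "%d.%03d\n", secs, ms
    }'
}
```

For example, split_epoch_ms 1658275363647 separates the value into 1658275363 seconds and 647 milliseconds. The zero-padded format matters because a remainder below 100 would otherwise lose its leading zeros, which the substr approach keeps automatically.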
Another option:
awk -F '[\\[\\]=,]' '/TaskManager/{tsp=$2}/duration/{print tsp"\t"$5}' mylog.log
20.07.22 00:02:43.647 65
20.07.22 00:02:44.689 88
CodePudding user response:
Using any awk:
$ awk -v RS= -F'[][,= ]+' '{print $3, $4, $20}' mylog.log
20.07.22 00:02:43.647 65
20.07.22 00:02:44.689 88
If that's not all you need then edit your question to provide more realistic sample input.