I have a file called a.txt that contains:
time="2022-08-02T15:07:53+05:30" level=info msg="\x1b[32m\x1b[1mPUBLIC\x1b[39m\x1b[0m http://some.s3-ap-southeast-2.amazonaws.com/ (\x1b[33mhttp://some.com\x1b[39m)"
time="2022-08-02T15:07:53+05:30" level=info msg="\x1b[31m\x1b[1mFORBIDDEN\x1b[39m\x1b[0m http://some.s3.amazonaws.com (\x1b[33mhttp://some.com\x1b[39m)"
time="2022-08-02T15:07:54+05:30" level=info msg="\x1b[31m\x1b[1mFORBIDDEN\x1b[39m\x1b[0m http://some.s3.amazonaws.com (\x1b[33mhttp://some.com\x1b[39m)"
time="2022-08-02T15:07:58+05:30" level=info msg="\x1b[31m\x1b[1mFORBIDDEN\x1b[39m\x1b[0m http://some-assets.s3.amazonaws.com (\x1b[33mhttp://some.com\x1b[39m)"
time="2022-08-02T15:08:01+05:30" level=info msg="\x1b[31m\x1b[1mFORBIDDEN\x1b[39m\x1b[0m http://some.s3.amazonaws.com (\x1b[33mhttp://some.com\x1b[39m)"
I want this output:
PUBLIC http://some.s3-ap-southeast-2.amazonaws.com
FORBIDDEN http://some.s3.amazonaws.com
FORBIDDEN http://some.s3.amazonaws.com
FORBIDDEN http://some-assets.s3.amazonaws.com
FORBIDDEN http://some.s3.amazonaws.com
I tried this:
cat a.txt | cut -d "=" -f4- | cut -d "[" -f3- | cut -d "m" -f2- | awk -F '\\.amazonaws.com' '{print $1".amazonaws.com"}'
This works, but I'm not able to remove \x1b[39m\x1b[0m.
CodePudding user response:
Using sed:
$ sed -E 's~([^[]*\[){2}[^A-Z]*([^\]*)[^ ]* ([^ ]*\.[a-z]+).*~\2 \3~' input_file | column -t
PUBLIC http://some.s3-ap-southeast-2.amazonaws.com
FORBIDDEN http://some.s3.amazonaws.com
FORBIDDEN http://some.s3.amazonaws.com
FORBIDDEN http://some-assets.s3.amazonaws.com
FORBIDDEN http://some.s3.amazonaws.com
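For comparison, a two-pass variant may be easier to read: first delete every colour code, then keep only the status word and the host part of the URL. This is just a sketch. The function name strip_and_extract is made up for illustration, and it assumes the colour codes are stored in the file as the literal text \x1b[...m (as in the sample), not as real ESC bytes.

```shell
# Hypothetical helper: reads the named files, or stdin if none are given.
strip_and_extract() {
  # Pass 1: remove the literal text "\x1b[<digits>m" wherever it occurs.
  # Pass 2: keep the upper-case status word after msg=" and the URL up to
  #         the first "/" or space that follows the host name.
  sed -E -e 's/\\x1b\[[0-9;]*m//g' \
         -e 's~.*msg="([A-Z]+) (https?://[^ /]+).*~\1 \2~' "$@"
}
```

It could be called as strip_and_extract a.txt | column -t. Lines that do not match the second pattern are printed with only the colour codes removed.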
CodePudding user response:
You may use this awk solution:
awk -F= '{gsub(/^.*1m|\/? \(.*$|\\x[^[:blank:]]*/, "", $4); print $4}' file | column -t
PUBLIC http://some.s3-ap-southeast-2.amazonaws.com
FORBIDDEN http://some.s3.amazonaws.com
FORBIDDEN http://some.s3.amazonaws.com
FORBIDDEN http://some-assets.s3.amazonaws.com
FORBIDDEN http://some.s3.amazonaws.com
column -t is used only to align the output into columns.
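To see what each part of that gsub pattern removes, the same idea can be written as three separate substitutions on a copy of field 4. This is a sketch, not a replacement for the one-liner: the function name extract_with_awk is invented here, and [^ ] is used in place of [^[:blank:]] on the assumption that only spaces separate the parts of the message.

```shell
# Hypothetical helper: reads the named files, or stdin if none are given.
extract_with_awk() {
  awk -F= '{
    s = $4                      # with "=" as separator, field 4 is the msg value
    sub(/^.*1m/, "", s)         # drop everything up to and including the last "1m"
    gsub(/\\x[^ ]*/, "", s)     # drop each literal \x... run up to the next space
    sub(/\/? \(.*$/, "", s)     # drop an optional "/" and the " (" tail
    print s
  }' "$@"
}
```

This relies on the line containing exactly three "=" signs, so a URL with a query string would need a different field split.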
CodePudding user response:
With your shown samples, please try the following awk code, written and tested in GNU awk. It combines GNU awk with the column command: the three-argument form of match() (a GNU awk extension) stores the captured substrings in the array arr, and the two captures are printed tab-separated so that column can align them.
awk '
BEGIN{ OFS="\t" }
match($0,/^time=".*level=\S+\smsg="[^[]*\[[^[]*\[1m([^\\]*)\\x1b\S+\s(https?:\/\/\S+)/,arr){
print arr[1],arr[2]
}
' Input_file | column -t -s $'\t'
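If GNU awk is not available, a similar result can probably be had with the two-argument match(), which sets RSTART and RLENGTH instead of filling an array. The sketch below is an assumption-laden variant rather than the code above: extract_portable is an invented name, it removes the literal \x1b[...m text first, and it stops the URL at the first "/" after the host so that a trailing slash is dropped.

```shell
# Hypothetical helper: reads the named files, or stdin if none are given.
extract_portable() {
  awk '{
    gsub(/\\x1b\[[0-9;]*m/, "")     # remove the literal \x1b[<digits>m text from $0
    # match() sets RSTART (start position) and RLENGTH (length) of the match
    if (match($0, /msg="[A-Z]+ https?:\/\/[^ \/]+/)) {
      # skip the 5 characters of msg=" and print the rest of the match
      print substr($0, RSTART + 5, RLENGTH - 5)
    }
  }' "$@"
}
```

Lines without a msg="STATUS URL" part are skipped silently, which may or may not be what you want for other log levels.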