How to extract part of a log file in bash

I have a log file:

10.1.1.10 arcesium.com [17/Dec/2018:08:05:32 +0000] "GET /api/v1/services HTTP/1.1" 200 4081 "http://www.example.com/" "Mozilla/5.0 (X11; Linux x86_64; rv:25.0) Gecko/20100101 Firefox/25.0"
10.1.1.11 arcesium.com [17/Dec/2018:08:05:32 +0000] "GET /api/v1/services HTTP/1.1" 200 4084 "http://www.example.com/" "Mozilla/5.0 (X11; Linux x86_64; rv:25.0) Gecko/20100101 Firefox/25.0"
10.1.1.13 arcesium.com [17/Dec/2018:08:05:32 +0000] "GET /api/v1/services HTTP/1.1" 200 4082 "http://www.example.com/" "Mozilla/5.0 (X11; Linux x86_64; rv:25.0) Gecko/20100101 Firefox/25.0"

I want to get the 9th field, like this:

awk '{print $9}' file.txt
4081
4084
4082

But the problem is that if the 3rd column contains one more space, e.g. "[17/Dec/2018:08:05:32 0000]", then my value shifts to the 10th column.

How can I treat such a value as a single field, regardless of the spaces inside it?

I want to achieve this using awk.

CodePudding user response：

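Using awk with the double quote as the field separator. The third quote-delimited field is then " 200 4081 ", and substr($3,6,4) takes the four characters starting at position 6. Spaces inside the brackets no longer matter because they all fall into the first field: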

$ awk -F"\"" '{$3=substr($3,6,4);print $3}' input_file
4081
4084
4082
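
The substr($3,6,4) call above assumes a three-digit status code and a four-digit size. A minimal sketch that drops this assumption by splitting the third quote-delimited piece on whitespace instead. The name extract_size is hypothetical, and the second sample line (with "+ 0000" and a five-digit size) is a made-up example of the extra space described in the question:

```shell
#!/bin/sh
# extract_size: hypothetical helper name, not part of the original answer.
extract_size() {
    # Split each line on double quotes; field 3 is then ` 200 4081 `.
    # split() with " " uses default whitespace splitting,
    # so a[1] is the status code and a[2] is the size.
    awk -F'"' '{ split($3, a, " "); print a[2] }' "$@"
}

# Demo: the second line has an extra space inside the brackets (made-up data).
printf '%s\n' \
  '10.1.1.10 arcesium.com [17/Dec/2018:08:05:32 +0000] "GET /api/v1/services HTTP/1.1" 200 4081 "http://www.example.com/" "Mozilla/5.0"' \
  '10.1.1.11 arcesium.com [17/Dec/2018:08:05:32 + 0000] "GET /api/v1/services HTTP/1.1" 200 40845 "http://www.example.com/" "Mozilla/5.0"' |
  extract_size
```

This still relies on the request being the first quoted string on the line, which holds for the sample log but is an assumption about the format.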

CodePudding user response：

In GNU awk you can use FPAT, which defines fields by their content rather than by the separator between them:

awk 'BEGIN{FPAT = "([^ ]+)|(\"[^\"]+\")|(\\[[^\\]]+\\])" } {print $6}' file.txt

You get:

4081
4084
4082
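
To make the pattern easier to read, here is a sketch that wraps it in a small shell function with each alternative commented. The name fpat_field is hypothetical, and the bracket expression is written as [^]] rather than [^\\]], which should match the same text. It requires gawk, since FPAT is a GNU extension:

```shell
#!/bin/sh
# fpat_field N [file...]: hypothetical helper that prints field N of each line.
# Requires GNU awk (gawk); FPAT is not available in mawk or BWK awk.
fpat_field() {
    n=$1
    shift
    gawk -v n="$n" '
    BEGIN {
        # A field is the longest match of one of these alternatives:
        #   "[^"]+"     a double-quoted string (request, referrer, user agent)
        #   \[[^]]+\]   a bracketed group (the timestamp), spaces included
        #   [^ ]+       any other run of non-space characters
        FPAT = "(\"[^\"]+\")|(\\[[^]]+\\])|([^ ]+)"
    }
    { print $n }' "$@"
}

# Usage: fpat_field 6 file.txt
```

With the sample log, fpat_field 6 file.txt should print the size column and fpat_field 3 file.txt the whole bracketed timestamp.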

For column 1:

awk 'BEGIN{FPAT = "([^ ]+)|(\"[^\"]+\")|(\\[[^\\]]+\\])" } {print $1}' file.txt

You get:

10.1.1.10
10.1.1.11
10.1.1.13

For column 3, for example:

awk 'BEGIN{FPAT = "([^ ]+)|(\"[^\"]+\")|(\\[[^\\]]+\\])" } {print $3}' file.txt

You get:

[17/Dec/2018:08:05:32 +0000]
[17/Dec/2018:08:05:32 +0000]
[17/Dec/2018:08:05:32 +0000]
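
The bracketed timestamp is the only field in this log whose inner spacing varies. Where GNU awk is not available, a rough alternative is to remove the blanks inside the brackets and then rely on ordinary whitespace splitting. This is only a sketch: size_without_fpat is a hypothetical name, the sample lines are made up, and the field number 8 assumes the request always consists of three words (method, path, protocol):

```shell
#!/bin/sh
# size_without_fpat: hypothetical helper using only POSIX awk features.
size_without_fpat() {
    awk '{
        # Locate the first [...] group (the timestamp in this log format).
        if (match($0, /\[[^]]*\]/)) {
            ts = substr($0, RSTART, RLENGTH)
            gsub(/[ \t]+/, "", ts)      # drop blanks inside the brackets
            # Rebuild the record; assigning to $0 re-splits the fields.
            $0 = substr($0, 1, RSTART - 1) ts substr($0, RSTART + RLENGTH)
        }
        print $8                        # size, assuming a three-word request
    }' "$@"
}

# Demo: the second line has an extra space inside the brackets (made-up data).
printf '%s\n' \
  '10.1.1.10 arcesium.com [17/Dec/2018:08:05:32 +0000] "GET /api/v1/services HTTP/1.1" 200 4081 "-" "Mozilla/5.0"' \
  '10.1.1.11 arcesium.com [17/Dec/2018:08:05:32 + 0000] "GET /api/v1/services HTTP/1.1" 200 4084 "-" "Mozilla/5.0"' |
  size_without_fpat
```

Unlike FPAT, this does not keep the quoted strings together, so it only helps for fields that come before the referrer and user agent.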

For column 4, for example:

awk 'BEGIN{FPAT = "([^ ]+)|(\"[^\"]+\")|(\\[[^\\]]+\\])" } {print $4}' file.txt

You get:

"GET /api/v1/services HTTP/1.1"
"GET /api/v1/services HTTP/1.1"
"GET /api/v1/services HTTP/1.1"