I have this fluentd configuration:
<source>
@type tail
<parse>
@type regexp
expression /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] \"(?<method>\w+) (?<path>[^ ]*) (?<http>[^ ]*)" (?<status_code>[^ ]*) (?<size>[^ ]*)(?:\s"(?<referer>[^\"]*)") "(?<agent>[^\"]*)" (?<urt>[^\"]*).*/
time_format %d/%b/%Y:%H:%M:%S %z
keep_time_key true
types size:integer,reqtime:float,uct:float,uht:float,urt:float
</parse>
path /var/log/nginx/access.log
pos_file /tmp/fluent_nginx.pos
tag nginx
</source>
My log format:
193.137.78.17 - - [07/Jan/2023:09:21:59 +0000] "GET /net/api/employee HTTP/1.1" 200 2323 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36" 0.014
193.137.78.17 - - [07/Jan/2023:09:22:00 +0000] "GET /net/api/employee HTTP/1.1" 200 2323 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36" 0.005
I've tested my regex on regex101 and it works without problems. Still, I get a "no patterns matched" warning from fluentd, and I don't understand why the log isn't parsed correctly.
Jan 07 09:26:26 srv-api fluentd[14878]: 2023-01-07 09:26:26 +0000 [warn]: #0 no patterns matched tag="nginx"
Can anyone help me, please? Thanks!
CodePudding user response:
I think your problem is leading spaces in the log.
Your pattern insists that <remote> has no spaces before it, but you do have 4 spaces in your log before the remote IP.
The simplest way, to my mind, is to insert an optional, variable number of spaces at the beginning:
^( )*(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] \"(?<method>\w+) (?<path>[^ ]*) (?<http>[^ ]*)" (?<status_code>[^ ]*) (?<size>[^ ]*)(?:\s"(?<referer>[^\"]*)") "(?<agent>[^\"]*)" (?<urt>[^\"]*).*
How it works
The ( and ) are just to make life easier for people reading the code: they will see that between them is a space character, which they might not otherwise notice.
The * means 0 or more of these.
This allows 0 or more spaces at the beginning of the line to be matched and discarded.
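The effect of the optional leading spaces can be checked outside fluentd. The following is a minimal Ruby sketch (fluentd's regexp parser uses Ruby regular expressions), not a fluentd test. It assumes the method group is `\w+` and the timestamp offset is written as `+0000`; the constant names and the 4-space indentation are illustrative only.

```ruby
# Pattern from the question (assumption: the method group is \w+).
ORIGINAL = /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] \"(?<method>\w+) (?<path>[^ ]*) (?<http>[^ ]*)" (?<status_code>[^ ]*) (?<size>[^ ]*)(?:\s"(?<referer>[^\"]*)") "(?<agent>[^\"]*)" (?<urt>[^\"]*).*/

# Same pattern, prefixed with ( )* so any number of leading spaces is skipped.
TOLERANT = /^( )*(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] \"(?<method>\w+) (?<path>[^ ]*) (?<http>[^ ]*)" (?<status_code>[^ ]*) (?<size>[^ ]*)(?:\s"(?<referer>[^\"]*)") "(?<agent>[^\"]*)" (?<urt>[^\"]*).*/

# Sample line from the question, without and with 4 leading spaces.
CLEAN = '193.137.78.17 - - [07/Jan/2023:09:21:59 +0000] "GET /net/api/employee HTTP/1.1" 200 2323 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36" 0.014'
INDENTED = (" " * 4) + CLEAN

puts ORIGINAL.match?(CLEAN)     # true
puts ORIGINAL.match?(INDENTED)  # false: the 4th leading space cannot match \[
puts TOLERANT.match?(INDENTED)  # true

# Named groups are available on the MatchData object.
fields = TOLERANT.match(INDENTED)
puts fields[:remote]
puts fields[:urt]
```

If the unindented line matches here but fluentd still reports "no patterns matched", comparing the raw bytes of the log file against the sample (for example with `cat -A`) is a reasonable next step.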
Incidentally
I noticed you are sometimes escaping " with \ and sometimes not. Is there a reason for this?
CodePudding user response:
You should directly use the nginx parser plugin instead.
Here is a complete working example with the sample input plugin and the nginx parser plugin:
fluent-nginx-test.conf
<source>
@type sample
sample [
{ "message": "193.137.78.17 - - [07/Jan/2023:09:22:00 +0000] \"GET /net/api/employee HTTP/1.1\" 200 2323 \"-\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36\" 0.005" },
{ "message": "193.137.78.18 - - [07/Jan/2023:09:22:00 +0000] \"GET /net/api/employee HTTP/1.1\" 200 2323 \"-\" \"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36\" 0.005" }
]
rate 1
size 2
tag nginx
</source>
<filter nginx>
@type parser
key_name message
<parse>
@type nginx
</parse>
</filter>
<match nginx>
@type stdout
</match>
Run
fluentd -c ./fluent-nginx-test.conf
Output
2023-01-07 14:22:00.000000000 +0500 nginx: {"remote":"193.137.78.17","host":"-","user":"-","method":"GET","path":"/net/api/employee","code":"200","size":"2323","referer":"-","agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36","http_x_forwarded_for":"0.005"}
2023-01-07 14:22:00.000000000 +0500 nginx: {"remote":"193.137.78.18","host":"-","user":"-","method":"GET","path":"/net/api/employee","code":"200","size":"2323","referer":"-","agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36","http_x_forwarded_for":"0.005"}
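The event times in this output differ from the timestamps in the log lines only by time zone, presumably because the machine that produced it runs at UTC+5 (an assumption based on the printed offset). A small Ruby sketch, using the time_format from the question and a hypothetical fixed +05:00 offset, shows that both refer to the same instant:

```ruby
require "time"

# time_format taken from the question's <parse> section.
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"

# Timestamp as it appears between [ and ] in the access log
# (assumption: the offset is written as +0000).
LOGGED = "07/Jan/2023:09:22:00 +0000"

# Parse the log timestamp into a Time object.
PARSED = Time.strptime(LOGGED, TIME_FORMAT)

# Render the same instant in an assumed UTC+5 local zone.
LOCAL = PARSED.getlocal("+05:00")

puts PARSED.getutc.strftime("%Y-%m-%d %H:%M:%S %z")
puts LOCAL.strftime("%Y-%m-%d %H:%M:%S %z")
```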
Environment
- fluentd
$ fluentd --version
fluentd 1.12.3
- OS
$ lsb_release -d
Description: Ubuntu 18.04.6 LTS