Fluentd on Kubernetes - Parse Nginx Access Log in Json

Time:06-22

I currently have this nginx log output.

      log_format json_logs escape=json
                            '{'
                            '"time_local":"$time_local",'
                            '"remote_addr":"$remote_addr",'
                            '"remote_user":"$remote_user",'
                            '"request":"$request",'
                            '"status": "$status",'
                            '"body_bytes_sent":"$body_bytes_sent",'
                            '"request_time":"$request_time",'
                            '"http_referrer":"$http_referer",'
                            '"http_user_agent":"$http_user_agent"'
                            '}';
      access_log /var/log/nginx/access.log json_logs;

However, when the container output is collected by Fluentd, each line is prefixed with a timestamp, the stream name (stdout) and a flag (F).

For example:

2022-06-18T19:05:15.014296769Z stdout F {\"time_local\":\"18/Jun/2022:19:05:15 +0000\",\"remote_addr\":\"10.106.0.5\",\"remote_user\":\"\",\"request\":\"GET / HTTP/1.1\",\"status\": \"304\",\"body_bytes_sent\":\"0\",\"request_time\":\"0.000\",\"http_referrer\":\"\",\"http_user_agent\":\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.99 Safari/537.36\"}

I can't get it to parse correctly, so I have temporarily set the source to

<source>
  @type tail
  path /var/log/containers/*nginx*.log
  pos_file /var/log/nginx.log.pos
  tag nginx.access
  <parse>
    @type nginx
    expression ^(?<somenginxstuff>.*)$
    time_key logtime
    time_format %d/%b/%Y:%H:%M:%S.%z
  </parse>
</source>

which dumps the whole line into Elasticsearch/Kibana so I can check the output.

My question is: what is the best/easiest way to do this? I assume it is a very common use case.

Also, I've seen plugins mentioned, and I'm using the base fluent/fluentd-kubernetes-daemonset:v1.14.6-debian-elasticsearch7-1.0 image. How do I add plugins to it (if they would help)?

Many thanks in advance

CodePudding user response:

I ended up doing this in two steps: first a regexp parser splits off the prefix and captures the rest of the line in a field, then a parser filter parses that field as JSON, as follows.

For the following log output:

2022-06-18T19:05:15.014296769Z stdout F {\"time_local\":\"18/Jun/2022:19:05:15 +0000\",\"remote_addr\":\"10.106.0.5\",\"remote_user\":\"\",\"request\":\"GET / HTTP/1.1\",\"status\": \"304\",\"body_bytes_sent\":\"0\",\"request_time\":\"0.000\",\"http_referrer\":\"\",\"http_user_agent\":\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/102.0.5005.99 Safari/537.36\"}

And this configuration

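<source>
  @type tail
  path /var/log/containers/*nginx*.log
  pos_file /var/log/nginx.log.pos
  tag nginx.access
  <parse>
    @type regexp
    # Capture the leading timestamp, skip the stream name (stdout) and
    # the one-character flag (F), and keep the rest of the line as "data".
    expression ^(?<timestamp>[^ ]*) [^ ]*[ ][^ ] (?<data>.*).*$
  </parse>
</source>

<filter nginx.access>
  @type parser
  # Parse the JSON string held in the "data" field into record fields.
  key_name data
  <parse>
    @type json
  </parse>
</filter>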
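
If you want to check the regexp against a log line before deploying, the two steps above can be reproduced outside Fluentd. The following is only a rough sketch in Python, not something Fluentd runs. It assumes the quotes in the file on disk are not backslash-escaped (the `\"` in the sample may just be how it was displayed). The `parse_line` helper and the shortened sample line are made up for illustration. Note that Python spells named groups `(?P<name>...)` where Ruby uses `(?<name>...)`.

```python
import json
import re

# Python spelling of the regexp from the <parse> section above.
# Ruby writes named groups as (?<name>...), Python as (?P<name>...).
LINE_RE = re.compile(r"^(?P<timestamp>[^ ]*) [^ ]*[ ][^ ] (?P<data>.*).*$")


def parse_line(line):
    """Hypothetical helper: strip the prefix, then decode the JSON payload."""
    match = LINE_RE.match(line)
    if match is None:
        return None  # line does not have the "<time> <stream> <flag> <data>" shape
    record = json.loads(match.group("data"))
    # Kept here for illustration only; see the note below about Fluentd.
    record["timestamp"] = match.group("timestamp")
    return record


# Shortened, made-up sample line with unescaped quotes (assumption).
sample = (
    '2022-06-18T19:05:15.014296769Z stdout F '
    '{"time_local":"18/Jun/2022:19:05:15 +0000","remote_addr":"10.106.0.5",'
    '"request":"GET / HTTP/1.1","status": "304"}'
)

record = parse_line(sample)
print(record["status"], record["request"])
```

One difference from the Fluentd config: as far as I know, the parser filter drops the other fields of the record (including `timestamp`) unless `reserve_data true` is set, whereas this sketch keeps the timestamp alongside the parsed fields.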