containerd multiline logs parsing with fluentbit


After shifting from Docker to containerd as the container runtime used by our Kubernetes cluster, we are no longer able to show multiline logs properly in our visualization app (Grafana), because containerd itself prepends some details to the container/pod logs (a timestamp, the stream and a log tag; specifically, it prepends something like the following, as shown in the sample below: 2022-07-25T06:43:17.20958947Z stdout F ), which causes some confusion for the developers and the application owners.

Here is a dummy sample of the logs generated by the application and how they end up on the Kubernetes nodes after containerd prepends the mentioned details.

The following logs are generated by the application (output of kubectl logs):

2022-07-25T06:43:17,309ESC[0;39m dummy-[txtThreadPool-2] ESC[39mDEBUGESC[0;39m
  ESC[36mcom.pkg.sample.ComponentESC[0;39m - Process message meta {
  timestamp: 1658731397308720468
  version {
      major: 1
      minor: 0
      patch: 0
  }
}

When I check the logs on the node filesystem (/var/log/containers/ABCXYZ.log):

2022-07-25T06:43:17.20958947Z stdout F 2022-07-25T06:43:17,309ESC[0;39m dummy-[txtThreadPool-2]
ESC[39mDEBUGESC[0;39m
ESC[36mcom.pkg.sample.ComponentESC[0;39m - Process message meta {
2022-07-25T06:43:17.20958947Z stdout F timestamp: 1658731449723010774
2022-07-25T06:43:17.209593379Z stdout F version {
2022-07-25T06:43:17.209595933Z stdout F major: 14
2022-07-25T06:43:17.209598466Z stdout F minor: 0
2022-07-25T06:43:17.209600712Z stdout F patch: 0
2022-07-25T06:43:17.209602926Z stdout F }
2022-07-25T06:43:17.209605099Z stdout F }

I am able to parse the multiline logs with Fluent Bit, but the problem is that I am not able to remove the details injected by containerd (i.e. 2022-07-25T06:43:17.209605099Z stdout F .......). So is there any way to configure containerd not to prepend these details and to print the logs exactly as they are generated by the application/container?

On the other hand, is there any plugin to remove such details on the Fluent Bit side? As far as I can see, none of the existing plugins can manipulate or change the logs (which is logical, as the log agent should not change the logs).

Thanks in advance.

CodePudding user response:

Judging by the configuration options for containerd, it appears there is no way to customize this logging behavior. You can see the config doc here.

Also, I looked at the logging code inside containerd, and it appears these details are prepended to the logs as they are redirected from the container's stdout. You can see that the test case here checks for the appropriate fields by "splitting" the received log line: it checks for a tag and a stream entry prepended to the content of the log. I suppose that's the way logs are processed in containerd.
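
For illustration only (containerd itself is written in Go, and the helper name below is made up for this sketch), here is roughly how one of those lines breaks down into its parts: a timestamp, the stream, a tag (F for a full line, P for a partial one) and the original application message. It is sketched in Lua, the same language used in the workaround below.

-- Rough sketch of how a CRI-formatted log line splits into its fields;
-- this is not containerd's code, just an illustration of the format.
local function split_cri_line(line)
    -- <timestamp> <stream> <tag> <application log content>
    return line:match("^(%S+) (%a+) (%a) (.*)$")
end

local ts, stream, tag, msg =
    split_cri_line("2022-07-25T06:43:17.209593379Z stdout F version {")
print(ts)     --> 2022-07-25T06:43:17.209593379Z
print(stream) --> stdout
print(tag)    --> F (P would indicate a partial line)
print(msg)    --> version {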

The best thing to do would be to open an issue in the project with your design requirement and perhaps the team can develop configurable stdout redirection for you.

CodePudding user response:

This is the workaround I followed to show the multiline log lines in Grafana, by applying extra Fluent Bit filters and a multiline parser.

1- First, the tail input receives the stream and parses it with a multiline parser (multilineKubeParser).

2- Then another filter intercepts the stream to do further processing with a regex parser (kubeParser).

3- After that, a lua filter removes the details added by containerd (the remove_dummy function in filters.lua below).

  fluent-bit.conf: |-
    [SERVICE]
        HTTP_Server    On
        HTTP_Listen    0.0.0.0
        HTTP_PORT      2020
        Flush          1
        Daemon         Off
        Log_Level      warn
        Parsers_File   parsers.conf
    [INPUT]
        Name           tail
        Tag            kube.*
        Path           /var/log/containers/*.log
        multiline.Parser         multilineKubeParser
        Exclude_Path   /var/log/containers/*_ABC-logging_*.log
        DB             /run/fluent-bit/flb_kube.db
        Mem_Buf_Limit  5MB
    [FILTER]
        Name           kubernetes
        Match          kube.*
        Kube_URL       https://kubernetes.default.svc:443
        Merge_Log      On
        Merge_Parser   kubeParser
        K8S-Logging.Parser Off
        K8S-Logging.Exclude On
    [FILTER]
        Name           lua
        Match          kube.*
        Call           remove_dummy
        Script         filters.lua
    [OUTPUT]
        Name grafana-loki
        Match kube.*
        Url http://loki:3100/api/prom/push
        TenantID ""
        BatchWait 1
        BatchSize 1048576
        Labels {job="fluent-bit"}
        RemoveKeys kubernetes
        AutoKubernetesLabels false
        LabelMapPath /fluent-bit/etc/labelmap.json
        LineFormat json
        LogLevel warn
  labelmap.json: |-
    {
      "kubernetes": {
        "container_name": "container",
        "host": "node",
        "labels": {
          "app": "app",
          "release": "release"
        },
        "namespace_name": "namespace",
        "pod_name": "instance"
      },
      "stream": "stream"
    }
  parsers.conf: |-
    [PARSER]
        Name        kubeParser
        Format      regex
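        # extracts timeStamp, requestId, severity and message from the application log line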
        Regex       /^([^ ]*).* (?<timeStamp>[^a].*) ([^ ].*)\[(?<requestId>[^\]]*)\] (?<severity>[^ ]*) (?<message>[^ ].*)$/
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S.%L%z
        Time_Keep   On
        Time_Offset +0200
    [MULTILINE_PARSER]
        name          multilineKubeParser
        type          regex
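        # start_state matches lines whose payload (after the CRI prefix) carries the application timestamp;
        # every other line is treated as a continuation of the previous record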
        flush_timeout 1000
        rule      "start_state"   "/[^ ]* stdout .\s+\W*\w+\d\d\d\d-\d\d-\d\d \d\d\:\d\d\:\d\d,\d\d\d.*$/"  "cont"
        rule      "cont"          "/[^ ]* stdout .\s+(?!\W+\w+\d\d\d\d-\d\d-\d\d \d\d\:\d\d\:\d\d,\d\d\d).*$/"   "cont"

  filters.lua: |-

    function remove_dummy(tag, timestamp, record)
        -- strip the CRI prefix ("<timestamp> stdout F ") that containerd prepends to each line
        local new_log = string.gsub(record["log"], "%d+-%d+-%d+T%d+:%d+:%d+.%d+Z%sstdout%sF%s", "")
        local new_record = record
        new_record["log"] = new_log
        -- return code 2: the record is modified but the original timestamp is kept
        return 2, timestamp, new_record
    end
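
For reference, the substitution performed by remove_dummy can be checked outside Fluent Bit with any standalone Lua interpreter (the sample line is taken from the log file shown in the question):

-- Standalone check of the pattern used in remove_dummy above.
local sample  = "2022-07-25T06:43:17.209593379Z stdout F version {"
local cleaned = string.gsub(sample, "%d+-%d+-%d+T%d+:%d+:%d+.%d+Z%sstdout%sF%s", "")
print(cleaned) --> version {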

As I mentioned, this is a workaround until I can find another/better solution.
