We followed these instructions to set up DataDog in our Kubernetes 1.22 cluster, using their operator. This was installed via helm with no customisations.
The operator, cluster-agent, and per-node agent pods are all running as expected. We know that the agents are able to communicate successfully with the DataDog endpoint because our new cluster shows up in the Infrastructure List view of DataDog.
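For reference, the operator install was just the stock chart from the public Datadog Helm repo, roughly along these lines (chart coordinates assumed; the release and namespace names are ours):

# Add the public Datadog Helm repo and install the operator chart
helm repo add datadog https://helm.datadoghq.com
helm repo update
helm install datadog-operator datadog/datadog-operator -n datadog --create-namespace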
However, logs from our application's pods aren't appearing in DataDog and we're struggling to figure out why.
Some obvious things we made sure to confirm:
- agent.log.enabled is true in our agent spec (full YAML included below).
- our application pods' logs are present in /var/log/pods/ and contain the log lines we were expecting.
- the DataDog agent is able to see these log files (checked roughly as shown below).
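The check for that last point was along these lines (agent pod name is a placeholder, and we're assuming the main container in the operator-managed pod is called "agent"):

# List the per-pod log directories from inside the node agent container
kubectl exec -n datadog datadog-agent-xxxxx -c agent -- ls -l /var/log/pods/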
So it seems that something is going wrong somewhere between the agent picking the logs up and them appearing in the DataDog UI. Does anyone have any ideas for how to debug this?
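In case it helps anyone answering: the node agent has a built-in status command whose Logs Agent section shows which files are being tailed and how many logs were processed/sent, so that output is presumably the first place to look (pod name is a placeholder):

# Dump the agent's status, including the Logs Agent section
kubectl exec -it -n datadog datadog-agent-xxxxx -c agent -- agent status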
Configuration of our agents:
apiVersion: datadoghq.com/v1alpha1
kind: DatadogAgent
metadata:
  name: datadog
  namespace: datadog
spec:
  agent:
    apm:
      enabled: false
    config:
      tolerations:
        - operator: Exists
    image:
      name: "gcr.io/datadoghq/agent:latest"
    log:
      enabled: true
    process:
      enabled: false
      processCollectionEnabled: false
  clusterAgent:
    config:
      admissionController:
        enabled: true
        mutateUnlabelled: true
      clusterChecksEnabled: true
      externalMetrics:
        enabled: true
    image:
      name: "gcr.io/datadoghq/cluster-agent:latest"
    replicas: 1
  clusterChecksRunner: {}
  credentials:
    apiSecret:
      keyName: api-key
      secretName: datadog-secret
    appSecret:
      keyName: app-key
      secretName: datadog-secret
  features:
    kubeStateMetricsCore:
      enabled: false
    logCollection:
      enabled: true
    orchestratorExplorer:
      enabled: false
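We applied this manifest in the usual way and the operator picked it up (the filename is ours):

# Apply the DatadogAgent resource and confirm the operator has reconciled it
kubectl apply -f datadog-agent.yaml
kubectl get datadogagent -n datadog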
Here are the environment variables for one of the DataDog agents:
DD_API_KEY : secretKeyRef(datadog-secret.api-key)
DD_CLUSTER_AGENT_AUTH_TOKEN : secretKeyRef(datadog.token)
DD_CLUSTER_AGENT_ENABLED : true
DD_CLUSTER_AGENT_KUBERNETES_SERVICE_NAME : datadog-cluster-agent
DD_COLLECT_KUBERNETES_EVENTS : false
DD_DOGSTATSD_ORIGIN_DETECTION : false
DD_DOGSTATSD_SOCKET : /var/run/datadog/statsd/statsd.sock
DD_EXTRA_CONFIG_PROVIDERS : clusterchecks endpointschecks
DD_HEALTH_PORT : 5555
DD_KUBERNETES_KUBELET_HOST : fieldRef(v1:status.hostIP)
DD_LEADER_ELECTION : false
DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL : false
DD_LOGS_CONFIG_K8S_CONTAINER_USE_FILE : true
DD_LOGS_ENABLED : true
DD_LOG_LEVEL : INFO
KUBERNETES : yes
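(These were read off the running pod, e.g. via kubectl describe; the pod name is a placeholder.)

# Show the environment block of the agent container
kubectl describe pod -n datadog datadog-agent-xxxxx | grep -A 20 'Environment:'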
CodePudding user response:
If you are able to see metrics, then for missing logs I can see two possible reasons:
- Log collection was not enabled during the Helm installation. Enable it with:
helm upgrade -i datadog datadog/datadog --set datadog.apiKey=my-key --set datadog.logs.enabled=true
- Wrong region/site configuration: by default the agent sends data to the US site (datadoghq.com). If your account is on a different site, set it explicitly, e.g.:
helm upgrade -i datadog datadog/datadog --set datadog.apiKey=my-key --set datadog.site=us5.datadoghq.com
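To see which site a running agent is actually using, you can dump its resolved configuration from inside the agent container (pod name is a placeholder; the exact key names may vary by agent version):

# Print the runtime config and pull out the intake site settings
kubectl exec -n datadog datadog-agent-xxxxx -c agent -- agent config | grep -E '^(site|dd_url)'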
If these two are correct, make sure the pods write their logs to stdout/stderr, since the default log path mount looks right (a quick check follows the snippet):
- name: logpodpath
  mountPath: /var/log/pods
  mountPropagation: None
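A quick way to confirm that is to read the pod's logs through the kubelet itself; if kubectl shows your log lines, they are reaching stdout/stderr and being written under /var/log/pods (pod and namespace names are placeholders):

# If this prints your application's log lines, they are on stdout/stderr
kubectl logs -n my-namespace my-app-pod --tail=20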
Apart from that, you also need to allowlist the containers to collect logs from, or you can set the environment variable below to true so the agent collects logs from all containers (in your agent's environment listing above it is currently false):
DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true
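Since you are running the agents through the operator rather than the chart, the equivalent toggle should live in the DatadogAgent spec; a sketch for the v1alpha1 schema is below (field name assumed from the v1alpha1 CRD, worth double-checking against your operator version). With the Helm chart the same thing is --set datadog.logs.containerCollectAll=true.

# DatadogAgent (v1alpha1) excerpt; should render as DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true on the node agents
spec:
  agent:
    log:
      enabled: true
      logsConfigContainerCollectAll: true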