I have a pod running RabbitMQ. Below is the deployment manifest:
apiVersion: v1
kind: Service
metadata:
  name: service-rabbitmq
spec:
  selector:
    app: service-rabbitmq
  ports:
  - port: 5672
    targetPort: 5672
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-rabbitmq
spec:
  selector:
    matchLabels:
      app: deployment-rabbitmq
  template:
    metadata:
      labels:
        app: deployment-rabbitmq
    spec:
      containers:
      - name: rabbitmq
        image: rabbitmq:latest
        volumeMounts:
        - name: rabbitmq-data-volume
          mountPath: /var/lib/rabbitmq
        resources:
          requests:
            cpu: 250m
            memory: 128Mi
          limits:
            cpu: 750m
            memory: 256Mi
      volumes:
      - name: rabbitmq-data-volume
        persistentVolumeClaim:
          claimName: rabbitmq-pvc
When I deploy it to my local cluster, the pod runs for a while and then crashes, so it ends up in a crash loop. Here are the logs I got from the pod:
$ kubectl logs deployment-rabbitmq-649b8479dc-kt9s4
2021-10-14 06:46:36.182390+00:00 [info] <0.222.0> Feature flags: list of feature flags found:
2021-10-14 06:46:36.221717+00:00 [info] <0.222.0> Feature flags: [ ] implicit_default_bindings
2021-10-14 06:46:36.221768+00:00 [info] <0.222.0> Feature flags: [ ] maintenance_mode_status
2021-10-14 06:46:36.221792+00:00 [info] <0.222.0> Feature flags: [ ] quorum_queue
2021-10-14 06:46:36.221813+00:00 [info] <0.222.0> Feature flags: [ ] stream_queue
2021-10-14 06:46:36.221916+00:00 [info] <0.222.0> Feature flags: [ ] user_limits
2021-10-14 06:46:36.221933+00:00 [info] <0.222.0> Feature flags: [ ] virtual_host_metadata
2021-10-14 06:46:36.221953+00:00 [info] <0.222.0> Feature flags: feature flag states written to disk: yes
2021-10-14 06:46:37.018537+00:00 [noti] <0.44.0> Application syslog exited with reason: stopped
2021-10-14 06:46:37.018646+00:00 [noti] <0.222.0> Logging: switching to configured handler(s); following messages may not be visible in this log output
2021-10-14 06:46:37.045601+00:00 [noti] <0.222.0> Logging: configured log handlers are now ACTIVE
2021-10-14 06:46:37.635024+00:00 [info] <0.222.0> ra: starting system quorum_queues
2021-10-14 06:46:37.635139+00:00 [info] <0.222.0> starting Ra system: quorum_queues in directory: /var/lib/rabbitmq/mnesia/rabbit@deployment-rabbitmq-649b8479dc-kt9s4/quorum/rabbit@deployment-rabbitmq-649b8479dc-kt9s4
2021-10-14 06:46:37.849041+00:00 [info] <0.259.0> ra: meta data store initialised for system quorum_queues. 0 record(s) recovered
2021-10-14 06:46:37.877504+00:00 [noti] <0.264.0> WAL: ra_log_wal init, open tbls: ra_log_open_mem_tables, closed tbls: ra_log_closed_mem_tables
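Since the pod keeps restarting, the logs of the previous (crashed) container instance can also be pulled with kubectl's --previous flag, in case anything was printed right before the crash:

$ kubectl logs deployment-rabbitmq-649b8479dc-kt9s4 --previous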
These logs aren't very helpful; I can't find any error message in them. The only potentially useful line is "Application syslog exited with reason: stopped", but as far as I understand that isn't the problem. The event log isn't helpful either:
$ kubectl describe pods deployment-rabbitmq-649b8479dc-kt9s4
Name:         deployment-rabbitmq-649b8479dc-kt9s4
Namespace:    default
Priority:     0
Node:         docker-desktop/192.168.65.4
Start Time:   Thu, 14 Oct 2021 12:45:03 +0600
Labels:       app=deployment-rabbitmq
              pod-template-hash=649b8479dc
              skaffold.dev/run-id=7af5e1bb-e0c8-4021-a8a0-0c8bf43630b6
Annotations:  <none>
Status:       Running
IP:           10.1.5.138
IPs:
  IP:  10.1.5.138
Controlled By:  ReplicaSet/deployment-rabbitmq-649b8479dc
Containers:
  rabbitmq:
    Container ID:   docker://de309f94163c071afb38fb8743d106923b6bda27325287e82bc274e362f1f3be
    Image:          rabbitmq:latest
    Image ID:       docker-pullable://rabbitmq@sha256:d8efe7b818e66a13fdc6fdb84cf527984fb7d73f52466833a20e9ec298ed4df4
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    0
      Started:      Thu, 14 Oct 2021 13:56:29 +0600
      Finished:     Thu, 14 Oct 2021 13:56:39 +0600
    Ready:          False
    Restart Count:  18
    Limits:
      cpu:     750m
      memory:  256Mi
    Requests:
      cpu:     250m
      memory:  128Mi
    Environment:  <none>
    Mounts:
      /var/lib/rabbitmq from rabbitmq-data-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9shdv (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  rabbitmq-data-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  rabbitmq-pvc
    ReadOnly:   false
  kube-api-access-9shdv:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason   Age                    From     Message
  ----     ------   ----                   ----     -------
  Normal   Pulled   23m (x6 over 50m)      kubelet  (combined from similar events): Successfully pulled image "rabbitmq:latest" in 4.267310231s
  Normal   Pulling  18m (x16 over 73m)     kubelet  Pulling image "rabbitmq:latest"
  Warning  BackOff  3m45s (x307 over 73m)  kubelet  Back-off restarting failed container
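(The previous container's termination reason can also be extracted directly from the pod's status, which is easier to spot than in the full describe dump:)

$ kubectl get pod deployment-rabbitmq-649b8479dc-kt9s4 -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'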
What could be the reason for this crash loop?
NOTE: rabbitmq-pvc is successfully bound; no issue there.
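(For reference, the binding can be confirmed with kubectl; the STATUS column should read Bound:)

$ kubectl get pvc rabbitmq-pvc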
Update:
This answer indicates that RabbitMQ should be deployed as a StatefulSet, so I adjusted the manifest like so:
apiVersion: v1
kind: Service
metadata:
  name: service-rabbitmq
spec:
  selector:
    app: service-rabbitmq
  ports:
  - name: rabbitmq-amqp
    port: 5672
  - name: rabbitmq-http
    port: 15672
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: statefulset-rabbitmq
spec:
  selector:
    matchLabels:
      app: statefulset-rabbitmq
  serviceName: service-rabbitmq
  template:
    metadata:
      labels:
        app: statefulset-rabbitmq
    spec:
      containers:
      - name: rabbitmq
        image: rabbitmq:latest
        volumeMounts:
        - name: rabbitmq-data-volume
          mountPath: /var/lib/rabbitmq/mnesia
        resources:
          requests:
            cpu: 250m
            memory: 128Mi
          limits:
            cpu: 750m
            memory: 256Mi
      volumes:
      - name: rabbitmq-data-volume
        persistentVolumeClaim:
          claimName: rabbitmq-pvc
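(Aside: a StatefulSet would more idiomatically request storage through volumeClaimTemplates, which gives each replica its own claim, instead of referencing a pre-created PVC. A minimal sketch, with an assumed 1Gi size, replacing the volumes section above:)

  volumeClaimTemplates:
  - metadata:
      name: rabbitmq-data-volume
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi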
The pod still ends up in a crash loop, but the logs are slightly different.
$ kubectl logs statefulset-rabbitmq-0
2021-10-14 09:38:26.138224+00:00 [info] <0.222.0> Feature flags: list of feature flags found:
2021-10-14 09:38:26.158953+00:00 [info] <0.222.0> Feature flags: [x] implicit_default_bindings
2021-10-14 09:38:26.159015+00:00 [info] <0.222.0> Feature flags: [x] maintenance_mode_status
2021-10-14 09:38:26.159037+00:00 [info] <0.222.0> Feature flags: [x] quorum_queue
2021-10-14 09:38:26.159078+00:00 [info] <0.222.0> Feature flags: [x] stream_queue
2021-10-14 09:38:26.159183+00:00 [info] <0.222.0> Feature flags: [x] user_limits
2021-10-14 09:38:26.159236+00:00 [info] <0.222.0> Feature flags: [x] virtual_host_metadata
2021-10-14 09:38:26.159270+00:00 [info] <0.222.0> Feature flags: feature flag states written to disk: yes
2021-10-14 09:38:26.830814+00:00 [noti] <0.44.0> Application syslog exited with reason: stopped
2021-10-14 09:38:26.830925+00:00 [noti] <0.222.0> Logging: switching to configured handler(s); following messages may not be visible in this log output
2021-10-14 09:38:26.852048+00:00 [noti] <0.222.0> Logging: configured log handlers are now ACTIVE
2021-10-14 09:38:33.754355+00:00 [info] <0.222.0> ra: starting system quorum_queues
2021-10-14 09:38:33.754526+00:00 [info] <0.222.0> starting Ra system: quorum_queues in directory: /var/lib/rabbitmq/mnesia/rabbit@statefulset-rabbitmq-0/quorum/rabbit@statefulset-rabbitmq-0
2021-10-14 09:38:33.760365+00:00 [info] <0.290.0> ra: meta data store initialised for system quorum_queues. 0 record(s) recovered
2021-10-14 09:38:33.761023+00:00 [noti] <0.302.0> WAL: ra_log_wal init, open tbls: ra_log_open_mem_tables, closed tbls: ra_log_closed_mem_tables
The feature flags are now marked as enabled, as you can see. There are no other notable changes, so I still need help.
! New Issue !
Head over here.
CodePudding user response:
The pod is getting OOM-killed: in the describe output, Last State shows Terminated with reason OOMKilled, which means the container exceeded its 256Mi memory limit and the kernel killed it. You need to assign more resources (memory) to the pod.
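A minimal sketch of that change, raising the memory values in the container's resources (the 512Mi/1Gi figures below are a starting point, not a tuned recommendation; size them for your workload):

        resources:
          requests:
            cpu: 250m
            memory: 512Mi
          limits:
            cpu: 750m
            memory: 1Gi

It can also help to tell RabbitMQ explicitly when to start flow control, so the broker throttles publishers before the kernel kills it. rabbitmq.conf supports an absolute memory high watermark; the 512MB value below is an assumption sized against the 1Gi limit above, and the file would need to be mounted into the container (for example via a ConfigMap):

        # rabbitmq.conf -- trigger flow control well below the 1Gi container limit
        vm_memory_high_watermark.absolute = 512MB

After redeploying, kubectl describe pod should no longer report OOMKilled under Last State, and the restart count should stop climbing.

One unrelated issue worth fixing while you're in the manifest: the Service selector (app: service-rabbitmq) doesn't match the pod labels (app: deployment-rabbitmq / app: statefulset-rabbitmq), so the Service will never route traffic to these pods.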