I am new to frontend, backend, and DevOps work, but I urgently need to use Kubernetes to deploy an app on Google Cloud Platform (GCP) to provide a service. I started learning by following this series of tutorials: https://mickeyabhi1999.medium.com/build-and-deploy-a-web-app-with-react-flask-nginx-postgresql-docker-and-google-kubernetes-e586de159a4d https://medium.com/swlh/build-and-deploy-a-web-app-with-react-flask-nginx-postgresql-docker-and-google-kubernetes-341f3b4de322 The code for this tutorial series is here: https://github.com/abhiChakra/Addition-App
Everything was fine until the last step: using "gcloud builds submit ..." to build the following on a GCP cluster:
1. nginx react service
2. flask wsgi service
3. nginx react deployment
4. flask wsgi deployment
Items 1 to 3 went well and their status is "OK". But the status of the flask wsgi deployment was "Does not have minimum availability", even after many restarts.
I ran "kubectl get pods" and saw that the status of the flask pod was "CrashLoopBackOff". Then I followed the debugging process suggested here: https://containersolutions.github.io/runbooks/posts/kubernetes/crashloopbackoff/
I used "kubectl describe pod flask" to look into the problem of the flask pod. Then I found the "Exit Code" was 139 and there were messages "Liveness probe failed: Get "http://10.24.0.25:8000/health": read tcp 10.24.0.1:55470->10.24.0.25:8000: read: connection reset by peer" and "Readiness probe failed: Get "http://10.24.0.25:8000/ready": read tcp 10.24.0.1:55848->10.24.0.25:8000: read: connection reset by peer".
The complete log:
Name: flask-676d5dd999-cf6kt
Namespace: default
Priority: 0
Node: gke-addition-app-default-pool-89aab4fe-3l1q/10.140.0.3
Start Time: Thu, 11 Nov 2021 19:06:24 +0800
Labels: app.kubernetes.io/managed-by=gcp-cloud-build-deploy
component=flask
pod-template-hash=676d5dd999
Annotations: <none>
Status: Running
IP: 10.24.0.25
IPs:
IP: 10.24.0.25
Controlled By: ReplicaSet/flask-676d5dd999
Containers:
flask:
Container ID: containerd://5459b747e1d44046d283a46ec1eebb625be4df712340ff9cf492d5583a4d41d2
Image: gcr.io/peerless-garage-330917/addition-app-flask:latest
Image ID: gcr.io/peerless-garage-330917/addition-app-flask@sha256:b45d25ffa8a0939825e31dec1a6dfe84f05aaf4a2e9e43d35084783edc76f0de
Port: 8000/TCP
Host Port: 0/TCP
State: Running
Started: Fri, 12 Nov 2021 17:24:14 +0800
Last State: Terminated
Reason: Error
Exit Code: 139
Started: Fri, 12 Nov 2021 17:17:06 +0800
Finished: Fri, 12 Nov 2021 17:19:06 +0800
Ready: False
Restart Count: 222
Limits:
cpu: 1
Requests:
cpu: 400m
Liveness: http-get http://:8000/health delay=120s timeout=1s period=5s #success=1 #failure=3
Readiness: http-get http://:8000/ready delay=120s timeout=1s period=5s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-s97x5 (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-s97x5:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-s97x5
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 9m7s (x217 over 21h) kubelet (combined from similar events): Liveness probe failed: Get "http://10.24.0.25:8000/health": read tcp 10.24.0.1:48636->10.24.0.25:8000: read: connection reset by peer
Warning BackOff 4m38s (x4404 over 22h) kubelet Back-off restarting failed container
Following the suggestion here: https://containersolutions.github.io/runbooks/posts/kubernetes/crashloopbackoff/#step-4 I increased "initialDelaySeconds" to 120, but it still failed.
Since everything worked fine on my local laptop, I suspect there could be a connection or authentication issue.
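Two numbers in the describe output above can be checked mechanically. The following is a minimal Python sketch, not part of the tutorial code. It assumes the common convention that an exit code above 128 means "terminated by signal (code - 128)", and it writes the timezone offset as "+0800" so that the timestamps parse. The helper names `signal_from_exit_code` and `lifetime_seconds` are made up for illustration.

```python
import signal
from datetime import datetime

# Timestamp layout used in the "Started:" / "Finished:" lines (assumed RFC 1123 with numeric zone).
TIME_FORMAT = "%a, %d %b %Y %H:%M:%S %z"


def signal_from_exit_code(code):
    """Return the signal name for a 128+N exit code, or None if it does not apply."""
    if code <= 128:
        return None
    try:
        return signal.Signals(code - 128).name
    except ValueError:
        # The number does not correspond to a signal known on this platform.
        return None


def lifetime_seconds(started, finished):
    """Return how many seconds passed between two describe-style timestamps."""
    start = datetime.strptime(started, TIME_FORMAT)
    end = datetime.strptime(finished, TIME_FORMAT)
    return int((end - start).total_seconds())


if __name__ == "__main__":
    # Values copied from the "Last State" section of the describe output.
    print(signal_from_exit_code(139))
    print(lifetime_seconds("Fri, 12 Nov 2021 17:17:06 +0800",
                           "Fri, 12 Nov 2021 17:19:06 +0800"))
```

On Linux this should report SIGSEGV for exit code 139 and a lifetime of 120 seconds, the same as the probe delay shown in the output. That would be consistent with the process crashing around the time the first probe request arrives rather than being restarted after three failed probes, although it does not prove it.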
To be more detailed, the deployment.yaml looks like:
apiVersion: v1
kind: Service
metadata:
  name: ui
spec:
  type: LoadBalancer
  selector:
    app: react
    tier: ui
  ports:
    - port: 8080
      targetPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: flask
spec:
  type: ClusterIP
  selector:
    component: flask
  ports:
    - port: 8000
      targetPort: 8000
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flask
spec:
  replicas: 1
  selector:
    matchLabels:
      component: flask
  template:
    metadata:
      labels:
        component: flask
    spec:
      containers:
        - name: flask
          image: gcr.io/peerless-garage-330917/addition-app-flask:latest
          imagePullPolicy: "Always"
          resources:
            limits:
              cpu: "1000m"
            requests:
              cpu: "400m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 30
            periodSeconds: 5
          readinessProbe:
            httpGet:
              path: /ready
              port: 8000
            initialDelaySeconds: 30
            periodSeconds: 5
          ports:
            - containerPort: 8000
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ui
spec:
  replicas: 1
  selector:
    matchLabels:
      app: react
      tier: ui
  template:
    metadata:
      labels:
        app: react
        tier: ui
    spec:
      containers:
        - name: ui
          image: gcr.io/peerless-garage-330917/addition-app-nginx:latest
          imagePullPolicy: "Always"
          resources:
            limits:
              cpu: "1000m"
            requests:
              cpu: "400m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 5
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 5
          ports:
            - containerPort: 8080
docker-compose.yaml:
# we will be creating these services
services:
  flask:
    # Note that we are building from our current terminal directory where our Dockerfile is located, so we use .
    build: .
    # naming our resulting container
    container_name: flask
    # publishing a port so that external services requesting port 8000 on your local machine
    # are mapped to port 8000 on our container
    ports:
      - "8000:8000"
  nginx:
    # Since our Dockerfile for web-server is located in react-app folder, our build context is ./react-app
    build: ./react-app
    container_name: nginx
    ports:
      - "8080:8080"
Nginx Dockerfile:
# first building react project, using node base image
FROM node:10 as build-stage
# setting working dir inside container
WORKDIR /react-app
# required to install packages
COPY package*.json ./
# installing npm packages
RUN npm install
# copying over react source material
COPY src ./src
# copying over further react material
COPY public ./public
# copying over our nginx config file
COPY addition_container_server.conf ./
# creating production build to serve through nginx
RUN npm run build
# starting second, nginx build-stage
FROM nginx:1.15
# removing default nginx config file
RUN rm /etc/nginx/conf.d/default.conf
# copying our nginx config
COPY --from=build-stage /react-app/addition_container_server.conf /etc/nginx/conf.d/
# copying production build from last stage to serve through nginx
COPY --from=build-stage /react-app/build/ /usr/share/nginx/html
# exposing port 8080 on container
EXPOSE 8080
CMD ["nginx", "-g", "daemon off;"]
Nginx server config:
server {
    listen 8080;
    # location of react build files
    root /usr/share/nginx/html/;
    # index html from react build to serve
    index index.html;
    # ONLY KUBERNETES RELEVANT: endpoint for health checkup
    location /health {
        return 200 "health ok";
    }
    # ONLY KUBERNETES RELEVANT: endpoint for readiness checkup
    location /ready {
        return 200 "ready";
    }
    # html file to serve with / endpoint
    location / {
        try_files $uri /index.html;
    }
    # proxying under /api endpoint
    location /api {
        client_max_body_size 10m;
        add_header 'Access-Control-Allow-Origin' http://<NGINX_SERVICE_ENDPOINT>:8080;
        proxy_pass http://flask:8000/;
    }
}
There are two important functions in App.js:
...
insertCalculation(event, calculation){
    /*
      Making a POST request via a fetch call to Flask API with numbers of a
      calculation we want to insert into DB. Making fetch call to web server
      IP with /api/insert_nums which will be reverse proxied via Nginx to the
      Application (Flask) server.
    */
    event.preventDefault();
    fetch('http://<NGINX_SERVICE_ENDPOINT>:8080/api/insert_nums', {method: 'POST',
        mode: 'cors',
        headers: {
            'Content-Type' : 'application/json'
        },
        body: JSON.stringify(calculation)}
    ).then((response) => {
...
getHistory(event){
    /*
      Making a GET request via a fetch call to Flask API to retrieve calculations history.
    */
    event.preventDefault()
    fetch('http://<NGINX_SERVICE_ENDPOINT>:8080/api/data', {method: 'GET',
        mode: 'cors'
    }
    ).then(response => {
...
Flask Dockerfile:
# using base image
FROM python:3.8
# setting working dir inside container
WORKDIR /addition_app_flask
# adding run.py to workdir
ADD run.py .
# adding config.ini to workdir
ADD config.ini .
# adding requirements.txt to workdir
ADD requirements.txt .
# installing flask requirements
RUN pip install -r requirements.txt
# adding in all contents from flask_app folder into a new flask_app folder
ADD ./flask_app ./flask_app
# exposing port 8000 on container
EXPOSE 8000
# serving flask backend through the gevent WSGI server started in run.py
CMD [ "python", "run.py" ]
run.py:
from gevent.pywsgi import WSGIServer
from flask_app.app import app

# As Flask's built-in server is not suitable for production, we will use
# a WSGIServer instance to serve our flask application.
if __name__ == '__main__':
    WSGIServer(('0.0.0.0', 8000), app).serve_forever()
app.py:
from flask import Flask, request, jsonify
from flask_app.storage import insert_calculation, get_calculations

app = Flask(__name__)


@app.route('/')
def index():
    return "My Addition App", 200


@app.route('/health')
def health():
    return '', 200


@app.route('/ready')
def ready():
    return '', 200


@app.route('/data', methods=['GET'])
def data():
    '''
    Function used to get calculations history
    from Postgres database and return to fetch call in frontend.
    :return: Json format of either collected calculations or error message
    '''
    calculations_history = []
    try:
        calculations = get_calculations()
        for key, value in calculations.items():
            calculations_history.append(value)
        return jsonify({'calculations': calculations_history}), 200
    except Exception:
        return jsonify({'error': 'error fetching calculations history'}), 500


@app.route('/insert_nums', methods=['POST'])
def insert_nums():
    '''
    Function used to insert a calculation into our postgres
    DB. Operands of operation received from frontend.
    :return: Json format of either success or failure response.
    '''
    insert_nums = request.get_json()
    firstNum, secondNum, answer = insert_nums['firstNum'], insert_nums['secondNum'], insert_nums['answer']
    try:
        insert_calculation(firstNum, secondNum, answer)
        return jsonify({'Response': 'Successfully inserted into DB'}), 200
    except Exception:
        return jsonify({'Response': 'Unable to insert into DB'}), 500
I can't tell what is going wrong. I also wonder what a better way to debug such a cloud deployment would be. In normal programs we can set breakpoints and print or log something to find the code that causes the problem; in a cloud deployment, however, I don't know where to start debugging.
Thanks to everyone who has read this! Any comment is appreciated!
CodePudding user response:
...Exit Code was 139...
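Exit code 139 usually means the process was terminated by a segmentation fault (128 + signal 11, SIGSEGV), so this could mean there's a bug in your Flask app or in one of its native dependencies. You can start with a minimal spec instead of trying to do everything in one go: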
apiVersion: v1
kind: Pod
metadata:
  name: flask
  labels:
    component: flask
spec:
  containers:
    - name: flask
      image: gcr.io/peerless-garage-330917/addition-app-flask:latest
      ports:
        - containerPort: 8000
See if your pod starts as expected. If it does, try connecting to it with kubectl port-forward <flask pod name> 8000:8000, followed by curl localhost:8000/health. You should watch your application's logs the whole time with kubectl logs -f <flask pod name>.
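To repeat the check while the port-forward is running, the curl call can be sketched in Python. This is a rough stand-in for what an httpGet probe does (one GET with a short timeout, where any status from 200 to 399 counts as success), not the kubelet's actual implementation. The function name `probe` and the localhost URL are assumptions.

```python
import http.client
import urllib.error
import urllib.request


def probe(url, timeout=1.0):
    """Send one GET request and return (ok, detail), roughly like an httpGet probe."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400, f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404 or 500.
        return False, f"HTTP {exc.code}"
    except (urllib.error.URLError, http.client.HTTPException, OSError) as exc:
        # Refused or reset connections and timeouts end up here.
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    # Assumes `kubectl port-forward <flask pod name> 8000:8000` is running in another terminal.
    for path in ("/health", "/ready"):
        ok, detail = probe(f"http://localhost:8000{path}")
        print(path, "ok" if ok else "failed", detail)
```

A reset connection shows up as a failed result with the exception name, which makes it easier to tell apart from a route that answers with 404 or 500.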
CodePudding user response:
Thanks to @gohm'c for the response! It is a good suggestion to isolate the different parts and start from a smaller component. As suggested, I tried deploying a single flask pod first. Then I used
kubectl port-forward flask 8000:8000
to map the port to my local machine. After using
curl localhost:8000/health
to access the port, the port-forward terminal showed
Forwarding from 127.0.0.1:8000 -> 8000
Forwarding from [::1]:8000 -> 8000
Handling connection for 8000
E1112 18:52:15.874759 300145 portforward.go:400] an error occurred forwarding 8000 -> 8000: error forwarding port 8000 to pod 4870b939f3224f968fd5afa4660a5af7d10e144ee85149d69acff46a772e94b1, uid : failed to execute portforward in network namespace "/var/run/netns/cni-32f718f0-1248-6da4-c726-b2a5bf1918db": read tcp4 127.0.0.1:38662->127.0.0.1:8000: read: connection reset by peer
At this point, running
kubectl logs -f flask
returned nothing. So there does seem to be an issue in the flask app.
The health probe is a really simple function in app.py:
@app.route('/health')
def health():
    return '', 200
How can I tell whether the route setting is wrong or not? Is it because of the WSGIServer in run.py?
from gevent.pywsgi import WSGIServer
from flask_app.app import app

# As Flask's built-in server is not suitable for production, we will use
# a WSGIServer instance to serve our flask application.
if __name__ == '__main__':
    WSGIServer(('0.0.0.0', 8000), app).serve_forever()
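One way to separate "is the route wrong?" from "is the server wrong?" is to call the WSGI application directly, without opening any socket. The sketch below uses only the standard library. `call_wsgi` and `stand_in_app` are hypothetical names, and `stand_in_app` only imitates the two probe routes so that the example runs without Flask installed. Since a Flask `app` object is itself a WSGI callable, the same helper should also accept the `app` imported from `flask_app.app`.

```python
from wsgiref.util import setup_testing_defaults


def call_wsgi(app, path, method="GET"):
    """Invoke a WSGI callable in-process and return (status line, body bytes)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys a WSGI app expects
    environ["PATH_INFO"] = path
    environ["REQUEST_METHOD"] = method
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        return lambda data: None  # legacy write() callable, unused here

    result = app(environ, start_response)
    try:
        body = b"".join(result)
    finally:
        close = getattr(result, "close", None)
        if close is not None:
            close()
    return captured["status"], body


def stand_in_app(environ, start_response):
    """Hypothetical stand-in that mimics the /health and /ready routes of app.py."""
    if environ["PATH_INFO"] in ("/health", "/ready"):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b""]
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


if __name__ == "__main__":
    # With the real project this would be: from flask_app.app import app
    print(call_wsgi(stand_in_app, "/health"))
    print(call_wsgi(stand_in_app, "/missing"))
```

If `call_wsgi(app, '/health')` returns a 200 status with the real Flask app, the route itself is fine and the problem is more likely in how the app is served inside the container.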
If we look at the Dockerfile, it seems to expose the correct port, 8000. If I directly run
python run.py
on my laptop, I can successfully access localhost:8000. How can I debug this kind of problem?
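Because a segmentation fault kills the interpreter before any Python exception handler runs, the logs can stay empty. A possible variation of run.py, sketched here as an assumption rather than a known fix, enables the standard-library `faulthandler` module so that a crash dumps the Python stack to stderr, and logs a line at startup so there is always something to see. The function names and the log format are made up for illustration.

```python
import faulthandler
import logging
import sys


def configure_diagnostics():
    """Send crash tracebacks and log lines to stderr, where `kubectl logs` can read them."""
    # On a fatal signal such as SIGSEGV, dump the Python stack of every thread.
    faulthandler.enable(file=sys.__stderr__, all_threads=True)
    logging.basicConfig(
        stream=sys.stderr,
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    logging.getLogger("startup").info(
        "diagnostics enabled, python %s", sys.version.split()[0]
    )


def main():
    """Variation of run.py: same server as before, with diagnostics switched on first."""
    configure_diagnostics()
    # Imported after configure_diagnostics() so a crash during import is reported too.
    from gevent.pywsgi import WSGIServer
    from flask_app.app import app

    logging.getLogger("startup").info("listening on 0.0.0.0:8000")
    WSGIServer(("0.0.0.0", 8000), app).serve_forever()
```

Calling `main()` from the `if __name__ == '__main__':` block of run.py and then reading the previous container's output with `kubectl logs --previous <flask pod name>` may show which call was executing when the process died. Setting the environment variable `PYTHONFAULTHANDLER=1` in the container is an alternative that needs no code change.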