I am building postgresql on kubernetes, but I cannot persist postgres data. The host is a GCP instance with Ubuntu-20.4. The volumes for disk are using GCP volumes, which are mounted after attaching to the instance.
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/root 9982728 4252592 5713752 43% /
devtmpfs 4038244 0 4038244 0% /dev
tmpfs 4059524 0 4059524 0% /dev/shm
tmpfs 811908 1568 810340 1% /run
tmpfs 5120 0 5120 0% /run/lock
tmpfs 4059524 0 4059524 0% /sys/fs/cgroup
/dev/loop0 50304 50304 0 100% /snap/core18/2671
/dev/loop2 217856 217856 0 100% /snap/google-cloud-cli/98
/dev/loop3 94208 94208 0 100% /snap/lxd/24065
/dev/loop1 60544 60544 0 100% /snap/core20/1782
/dev/loop4 44032 44032 0 100% /snap/snapd/17885
/dev/nvme0n1p15 99801 6004 93797 7% /boot/efi
tmpfs 811904 0 811904 0% /run/user/1001
shm 65536 0 65536 0% /run/containerd/io.containerd.grpc.v1.cri/sandboxes/49ebcf7b449f4b13d52aab6f52c28f139c551f83070f6f21207dbf52315dc264/shm
shm 65536 0 65536 0% /run/containerd/io.containerd.grpc.v1.cri/sandboxes/e46f0c6b19e5ccff9bb51fa3f7669a9a6a2e7cfccf54681e316a9cd58183dce4/shm
shm 65536 0 65536 0% /run/containerd/io.containerd.grpc.v1.cri/sandboxes/495c80e87521bfdda55827df64cdb84cddad149fb502ac7ee12f3607badd4649/shm
shm 65536 0 65536 0% /run/containerd/io.containerd.grpc.v1.cri/sandboxes/443e7b254d02c88873c59edc6d5b0a71e80da382ea81105e8b312ad4122d694a/shm
/dev/nvme0n3 10218772 12 9678088 1% /var/lib/postgresql ※ disk for postgres
/dev/nvme0n2 3021608 24 2847916 1% /var/lib/pgadmin ※ disk for pgadmin
shm 65536 1052 64484 2% /run/containerd/io.containerd.grpc.v1.cri/sandboxes/bd83982e91b6a3bce7853416d72878d5473174e884c15578c47a8d8952f4e718/shm
Also, the pod volume is allocated using persistent volume and persistent volume claim.
kind: PersistentVolume
apiVersion: v1
metadata:
name: postgres-pv-volume # Sets PV's name
labels:
app: postgres
spec:
# storageClassName: local-storage
capacity:
storage: 10Gi # Sets PV Volume
accessModes:
- ReadWriteMany
hostPath:
path: "/var/lib/postgresql"
---
kind: PersistentVolume
apiVersion: v1
metadata:
name: pgadmin-pv-volume # Sets PV's name
labels:
app: pgadmin
spec:
# storageClassName: local-storage
capacity:
storage: 3Gi # Sets PV Volume
accessModes:
- ReadWriteMany
hostPath:
path: "/var/lib/pgadmin"
apiVersion: apps/v1
kind: Deployment
metadata:
name: postgresql
labels:
app: postgresql
spec:
replicas: 1
selector:
matchLabels:
app: postgresql
template:
metadata:
labels:
app: postgresql
spec:
containers:
- image: docker.io/yamamuratkr/postgres
name: postgresql
env:
- name: POSTGRES_USER
valueFrom:
secretKeyRef:
name: postgres
key: database_user
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: postgres
key: database_password
- name: PGDATA
value: "/var/lib/postgresql/data/pgdata"
ports:
- containerPort: 5432
name: postgresql
volumeMounts:
- name: postgredb
mountPath: /var/lib/postgresql
volumes:
- name: postgredb
persistentVolumeClaim:
claimName: postgres-pv-claim
However, when the pod is deleted, the postgres data also disappears with it, and I can confirm that it is not persistent.
If you know the cause of this problem, please let me know. Thank you in advance.
None of the following worked
- Using hostpath for pod volumes
- Use default PGDATA
CodePudding user response:
The source of your problem is here:
volumeMounts:
- name: postgredb
mountPath: /var/lib/postgresql
The postgres
image itself mounts a volume on /var/lib/postgresql/data
. We can see that if we inspect the image:
$ docker image inspect docker.io/postgres:14 | jq '.[0].ContainerConfig.Volumes'
{
"/var/lib/postgresql/data": {}
}
Your mount on /var/lib/postgresql
is effectively a no-op. An ephemeral volume is created on /var/lib/postgresql/data
each time the container starts, and since that's the default PGDATA
location, your data is effectively discarded when the container exits.
I've put together an example in my local environment to demonstrate this behavior. I've made a few minor changes from your example that shouldn't operationally impact anything.
I've created the following Secret with the postgres credentials; by naming the keys like this we can use a single envFrom
block instead of multiple env
entries (see the Deployment for details):
apiVersion: v1
kind: Secret
metadata:
name: postgres-env
type: Opaque
stringData:
POSTGRES_USER: example
POSTGRES_PASSWORD: secret
And in line with my comment I'm injecting a file into /docker-entrypoint-initdb.d
from this ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
name: postgres-config
data:
initdb.sql: |
CREATE TABLE people (
id SERIAL,
name VARCHAR(40),
favorite_color VARCHAR(10)
);
INSERT INTO people (name, favorite_color) VALUES ('bob', 'red');
INSERT INTO people (name, favorite_color) VALUES ('alice', 'blue');
INSERT INTO people (name, favorite_color) VALUES ('mallory', 'purple');
I'm using this PersistentVolumeClaim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: postgres-pv-claim
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
And finally I'm tying it all together in this Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: postgres
name: postgresql
spec:
replicas: 1
strategy:
type: Recreate
selector:
matchLabels:
app: postgres
template:
metadata:
labels:
app: postgres
spec:
containers:
- envFrom:
- secretRef:
name: postgres-env
image: docker.io/postgres:14
name: postgresql
ports:
- containerPort: 5432
name: postgresql
volumeMounts:
- mountPath: /docker-entrypoint-initdb.d
name: postgres-config
- mountPath: /var/lib/postgresql
name: postgres-data
volumes:
- name: postgres-data
persistentVolumeClaim:
claimName: postgres-pv-claim
- configMap:
name: postgres-config
name: postgres-config
The significant changes here are:
I'm using a single
envFrom
block to set environment variables from the keys in thepostgres-env
secret.I'm using the upstream
docker.io/postgres:14
image rather than building my own custom image.I'm injecting the contents of
/docker-entrypoint-initdb.d
from the postgres-config ConfigMap.
Note that this deployment is using the same mountPath
as in your example`.
If I bring up this environment, I can see that the database initialization script was executed correctly. The people
table exists and has the expected data:
$ kubtectl exec -it deploy/postgresql -- psql -U example -c 'select * from people'
id | name | favorite_color
---- --------- ----------------
1 | bob | red
2 | alice | blue
3 | mallory | purple
(3 rows)
Let's make a change to the database and see what happens when we restart the pod. First, we add a new row to the table and view the updated table:
$ kubectl exec -it deploy/postgresql -- psql -U example -c "insert into people (name, favorite_color) values ('carol', 'green')"
INSERT 0 1
$ kubectl exec -it deploy/postgresql -- psql -U example -c 'select * from people'
id | name | favorite_color
---- --------- ----------------
1 | bob | red
2 | alice | blue
3 | mallory | purple
4 | carol | green
(4 rows)
Now we restart the database pod:
$ kubectl rollout restart deployment/postgresql
And finally check if our changes survived the restart:
$ kubectl exec -it deploy/postgresql -- psql -U example -c 'select * from people' id | name | favorite_color
---- --------- ----------------
1 | bob | red
2 | alice | blue
3 | mallory | purple
(3 rows)
As expected, they did not! Let's change the mountPath
in the Deployment so that it looks like this instead:
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: postgres
name: postgresql
spec:
replicas: 1
strategy:
type: Recreate
selector:
matchLabels:
app: postgres
template:
metadata:
labels:
app: postgres
spec:
containers:
- envFrom:
- secretRef:
name: postgres-env
image: docker.io/postgres:14
name: postgresql
ports:
- containerPort: 5432
name: postgresql
volumeMounts:
- mountPath: /docker-entrypoint-initdb.d
name: postgres-config
- mountPath: /var/lib/postgresql/data
name: postgres-data
volumes:
- name: postgres-data
persistentVolumeClaim:
claimName: postgres-pv-claim
- configMap:
name: postgres-config
name: postgres-config
Using this Deployment, with no other changes, we can re-run the previous test and see that our data persists as desired:
$ kubectl exec -it deploy/postgresql -- psql -U example -c "insert into people (name, favorite_color) values ('carol', 'green')"
INSERT 0 1
$ k rollout restart deployment/postgresql
deployment.apps/postgresql restarted
$ k exec -it deploy/postgresql -- psql -U example -c 'select * from people' id | name | favorite_color
---- --------- ----------------
1 | bob | red
2 | alice | blue
3 | mallory | purple
4 | carol | green
(4 rows)
An alternative solution would be to mount your volume in a completely different location and then set PGDATA
appropriately. E.g.,
...
env:
- name: PGDATA
value: /data
...
volumeMounts:
- name: postgres-data
mountPath: /data
...