After the upgrade of velero from 1.3.2 to 1.7.7 backups started failing. When describing the failing backup:
velero backup describe mypod-20220204170020 --details
Name: mypod-20220204170020
Namespace: velero
Labels: velero.io/schedule-name=mypod
velero.io/storage-location=default
Annotations: velero.io/source-cluster-k8s-gitversion=v1.21.2
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=21
Phase: Failed (run `velero backup logs mypod-20220204170020` for more information)
Errors: 0
Warnings: 0
Namespaces:
Included: mypod
Excluded: <none>
Resources:
Included: *
Excluded: <none>
Cluster-scoped: auto
Label selector: <none>
Storage Location: default
Velero-Native Snapshot PVs: auto
TTL: 168h0m0s
Hooks: <none>
Backup Format Version: 1.1.0
Started: 2022-02-04 18:00:20 0100 CET
Completed: 2022-02-04 18:00:41 0100 CET
Expiration: 2022-02-11 18:00:20 0100 CET
Total items to be backed up: 64
Items backed up: 64
Resource List: <backup resource list not found>
Velero-Native Snapshots: <error getting snapshot info: file not found>
At first sight it seems like the backups were made correctly:
Errors: 0
Warnings: 0
Items backup: 64 of 64
However, right at the bottom (and only when adding the --details flag with the describe command), you see these two mentions:
Resource List: <backup resource list not found>
Velero-Native Snapshots: <error getting snapshot info: file not found>
Is there some clever way to troubleshoot this further? Or do someone have thoughts on what might be the issue here?
- Running on AKS (1.21.2)
- Using velero-plugin-for-microsoft-azure:v1.3.1 for snapshots to azure
Much appreciated!
CodePudding user response:
Not sure if this solves the issue for everyone, but it worked for me. After searching inside the velero
The hints at (2) in the source code led me to search on upload issues. As I am using the velero-plugin-for-microsoft-azure to handle uploads to Azure the following GitHub issue finally gave me the nudge in the right direction.
Seems like the velero-plugin-for-microsoft-azure required more memory (512Mi) since v1.5.3 (if I understood it correctly).
The limits on my one were still at 256Mi and failing, I increased the limits to 512Mi and, presto! It started working again.
A big shout out to David L. Smith-Uchida whose breadcrumbs led me to the solution!