Backups no longer work after upgrading to 1.17.0 #2066
Comments
Hi @wonko, we tried to reproduce this quickly but couldn't. Could you please share your
I assume you'd only need the backup part (I've left out the pxc and haproxy sections). We use the Helm chart for DB deployment (from https://github.com/percona/percona-helm-charts/tree/main/charts/pxc-db). This is the Terraform template snippet; the variables are obviously filled in... The template wasn't touched going from 1.16.1 to 1.17.0.
@wonko as you can see, you have
Starting with 1.17.0, we use the AWS CLI instead of the MinIO CLI in our backup images. I tried googling this error
This results in a different error, indicating that I should be using the correct endpoint. I double checked, the bucket is in the
After some debugging, I must conclude that I've hit two bugs:
So, I guess this will be fixed when aws.sh is changed:
The line setting the AWS_ENDPOINT_URL should be conditional, and only be executed when it is provided. If it's not provided, don't set it, don't export any value there, let the aws-cli figure out what the endpoint is. Proof:
And I guess you tested in the
(fyi, aws/aws-cli#9479 has the report for aws-cli.)
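The conditional export suggested above could look roughly like the sketch below. This is not the actual aws.sh: the function name configure_endpoint and its argument are hypothetical and only illustrate the idea. AWS_ENDPOINT_URL is exported only when an endpoint was supplied; otherwise it is left unset so aws-cli resolves the endpoint on its own.

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT the real aws.sh from the backup image.
# Assumption: the user-supplied endpoint (possibly empty) is passed as $1.
configure_endpoint() {
    local endpoint="${1:-}"
    if [ -n "$endpoint" ]; then
        # An endpoint was provided (e.g. for MinIO or another
        # S3-compatible store): pass it on to aws-cli.
        export AWS_ENDPOINT_URL="$endpoint"
    else
        # No endpoint provided: make sure nothing is exported, so
        # aws-cli derives the endpoint from its own configuration.
        unset AWS_ENDPOINT_URL
    fi
}
```

With something like this in place, a plain AWS S3 setup without an endpoint in the backup storage config would never see an AWS_ENDPOINT_URL value coming from the script.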
@wonko thanks for helping with the debugging. The code of aws.sh is located in a different repo, https://github.com/percona/percona-docker/blob/main/percona-xtradb-cluster-8.4-backup/lib/pxc/aws.sh#L6-L7, but we will move it under the 'percona-xtradb-cluster-operator' repo in the next release. As far as I can see, the default ENDPOINT was the same for the MinIO CLI, but the behavior was different. We can fix the problem in two different ways:
I think it's better to use the second way to follow AWS CLI logic. As you can see, AWS CLI uses AWS_REGION to set the default AWS_ENDPOINT_URL.
I'd suggest leaving endpoint construction to the AWS CLI code, since its maintainers own that logic. Replicating that logic might lead to other things not working ... No endpoint set by the user, no endpoint set in the script feels most logical to me. But that's only my opinion, feel free to ignore it ;-)
We are also encountering this issue. Is there a workaround for the time being, or is it safe to downgrade the operator? 🤔
I currently solved it by setting the
I'll remove that line again when this is fixed in the code.
Report
After an operator and cluster upgrade to 1.17.0 (CRDs + helm upgrade), backup jobs no longer complete (they worked fine in 1.16.1). The log for a backup is below, with target bucket and db names redacted.
It seems to try to delete a non-existing key, and then fails ... Any pointers on what might be wrong?
The image used by the pod is percona/percona-xtradb-cluster-operator:1.17.0-pxc8.0-backup-pxb8.0.35. I can't seem to find the source of /usr/bin/backup.sh in that image to allow me to debug this further.
More about the problem
Steps to reproduce
Hard to tell how to reproduce, but we had a 1.16.1 setup with backups towards S3 configured and working fine, for many weeks. After an upgrade to 1.17.0, backups started to fail.
Versions
Anything else?
No response