Describe the bug
During backup with
{{ victoriametrics_root }}/vmbackup-prod
-dst gcs://{{ vmbackup_bucket_path }}
-snapshotName {{ snapshot_name }}
-storageDataPath {{ victoriametrics_data_dir }}
-origin gcs://{{ vmbackup_bucket_path }}
-maxBytesPerSecond {{ vmbackup_max_upload_speed }}
-loggerOutput {{ vmbackup_logging }}
-concurrency {{ vmbackup_concurency }}
-credsFilePath {{ vmbackup_creds_file }}
the daily backup failed on all three storage nodes with googleapi: Error 503: Backend Error
To Reproduce
No idea; it happened only once. Probably vmbackup should retry here.
Expected behavior
Backup done correctly
Version
vmbackup-20200303-194040-tags-v1.34.2-0-g0b1e877a
$ ./victoria-metrics-prod --version
victoria-metrics-20190730-121249-heads-single-node-0-g671d9e55
This is an expected condition. It can occur because of intermittent errors on the GCS side. Just re-run vmbackup with the same set of command-line flags, as described at https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmbackup/README.md#troubleshooting . vmbackup should resume the backup from the point where it was interrupted.
Probably, vmbackup could handle this error by itself by retrying the failed operation instead of returning the error to the caller. Let's keep this issue open in order to think about this case.
ok. for now I've added
- name: Copy backup to GCS storage. With file auth. Should take a while
  command: >
    {{ victoriametrics_root }}/vmbackup-prod
    -dst gcs://{{ vmbackup_bucket_path }}
    -snapshotName {{ snapshot_name }}
    -storageDataPath {{ victoriametrics_data_dir }}
    -origin gcs://{{ vmbackup_bucket_path }}
    -maxBytesPerSecond {{ vmbackup_max_upload_speed }}
    -loggerOutput {{ vmbackup_logging }}
    -concurrency {{ vmbackup_concurency }}
    -credsFilePath {{ vmbackup_creds_file }}
  async: "{{ vmbackup_backup_duration }}"
  poll: 30
  retries: 3
  delay: 60
  register: result
  until: result.rc == 0
and will see whether it retries.
FYI, the next release will contain an updated package for cloud.google.com/go/storage, which includes automatic retry on most errors. See the release notes for cloud.google.com/go/storage v1.9.0 for details.
@freeseacher , could you check whether vmbackup from new releases still emits googleapi: Error 503: Backend Error?
Probably not.
My recent backups ran perfectly.
I will definitely reopen this issue if I ever get a 503 from Google again.
Thanks for the update! Closing this issue in the hope that it has been fixed on the Google Cloud SDK side.