VictoriaMetrics: vm-backup googleapi: Error 503: Backend Error

Created on 6 Apr 2020 · 7 Comments · Source: VictoriaMetrics/VictoriaMetrics

Describe the bug
During backup with

        {{ victoriametrics_root }}/vmbackup-prod
        -dst gcs://{{ vmbackup_bucket_path }}
        -snapshotName {{ snapshot_name }}
        -storageDataPath {{ victoriametrics_data_dir }}
        -origin gcs://{{ vmbackup_bucket_path }}
        -maxBytesPerSecond {{ vmbackup_max_upload_speed }}
        -loggerOutput {{ vmbackup_logging }}
        -concurrency {{ vmbackup_concurency }}
        -credsFilePath {{ vmbackup_creds_file }}

the daily backup failed on all three storage nodes with googleapi: Error 503: Backend Error

To Reproduce
No idea; it happened only once. vmbackup should probably retry here.

Expected behavior
The backup completes successfully.

Version
vmbackup-20200303-194040-tags-v1.34.2-0-g0b1e877a

$ ./victoria-metrics-prod --version
victoria-metrics-20190730-121249-heads-single-node-0-g671d9e55

Labels: enhancement, question

All 7 comments

This is an expected condition. It can occur because of intermittent errors on the GCS side. Just re-run vmbackup with the same set of command-line flags, as described at https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmbackup/README.md#troubleshooting . vmbackup should resume the backup from the point where it was interrupted.

Probably, vmbackup could handle this error by itself by retrying the failed operation instead of returning the error to the caller. Let's keep this issue open in order to think about this case.

OK. For now I've added retries to the Ansible task:

    - name: Copy backup to GCS storage. With file auth. Should take a while
      command: >
        {{ victoriametrics_root }}/vmbackup-prod
        -dst gcs://{{ vmbackup_bucket_path }}
        -snapshotName {{ snapshot_name }}
        -storageDataPath {{ victoriametrics_data_dir }}
        -origin gcs://{{ vmbackup_bucket_path }}
        -maxBytesPerSecond {{ vmbackup_max_upload_speed }}
        -loggerOutput {{ vmbackup_logging }}
        -concurrency {{ vmbackup_concurency }}
        -credsFilePath {{ vmbackup_creds_file }}
      async: "{{ vmbackup_backup_duration }}"
      poll: 30
      retries: 3
      delay: 60
      register: result
      until: result.rc == 0

and will see whether it retries.

FYI, the next release will contain an updated cloud.google.com/go/storage package, which includes automatic retries on most errors; see the release notes for cloud.google.com/go/storage v1.9.0 for details.

@freeseacher, could you check whether vmbackup from the new releases still emits googleapi: Error 503: Backend Error?

Probably not.
My recent backups ran perfectly.
I will definitely reopen this issue if I ever get a 503 from Google again.

Thanks for the update! Closing this issue in the hope that it has been fixed on the Google Cloud SDK side.
