Google-cloud-go: Convenience method for polling for operation completion

Created on 2 Oct 2015  ·  13 Comments  ·  Source: googleapis/google-cloud-go

It's very common to start a GCE operation and poll for completion of the operation. It would be nice if the gcloud library could include a standard "correct" implementation of this polling, including adding jitter to the polling sleep duration and using ETags now that google-api-golang-client supports ETags.

func WaitForOp(op *compute.Operation, sleep, timeout time.Duration) error

This would poll the status of the operation, sleeping for the sleep duration between polls and spending at most timeout in total waiting. If the operation is DONE, return nil; otherwise return an error indicating either that something went wrong (wrapping the last view of the Operation resource) or that the wait timed out (ErrOperationTimeout).

Callers would look like:

op, _ := svc.Instances.Insert(...)
if err := compute.WaitForOp(op, time.Second, time.Minute); err != nil {
  log.Printf("insert operation %q failed: %v", op.Name, err)
}
feature request

All 13 comments

I think both parameters could probably go away. Too easy to transpose them if they have the same type anyway.

sleep can be automatic.

timeout can be done via a context or selecting on a returned channel.

Or we can do a variadic CloudOption sort of thing for extra options.

But whatever we do, I don't think we need two durations.

/cc @okdave

That's a good point. WaitForOp(op) is even better.

Any progress?

Sort of. Modern Google APIs support a common long-running operations framework, and we provide the cloud.google.com/go/longrunning package for interacting with that, which includes Wait and Poll methods.

For now, our support for the compute API is limited to the cloud.google.com/go/compute/metadata package in this repo, and the generated client.

Realistically, this ain't gonna happen. We have no plans for additional support for compute at this time.

@jba Would you care to comment on what the "idiomatic way" to poll for the completion of the operation is? Maybe point to some example code?

Found the Python and Java examples here: https://cloud.google.com/compute/docs/api/how-tos/api-requests-responses

But I'm still missing a Go example.

It is frustrating that this is closed. Hundreds of projects out there will be implementing this themselves instead of just including it in this library.

Here's an example of polling in Go: https://github.com/GoogleCloudPlatform/golang-samples/blob/master/healthcare/dataset_deidentify.go#L75-L91

(Note: Need to be somewhat careful to check error conditions and break/continue in the right order.)

Edit: This is for an older client, not a new one using the longrunning package.

@zachberger which library are you referring to?

There's a lot of reference to the google.golang.org/api/compute/v1 package in this thread, which is actually not part of the google-cloud-go repo (this issue tracker).

For those REST generated clients under google.golang.org/api, unfortunately there's no common "operation" type, and even more unfortunate, the APIs tend to differ on how they expose operations, so I don't think it's feasible to have a generic "Wait for op" helper.

Yes, we could probably have helpers specific to each API, but that isn't a scalable approach to solving this problem.

As jba says, for our "modern" APIs (i.e., gRPC), there are helpers for long running operations. This is possible because there's a common operation type.

I'm sorry I didn't get back to this earlier. I ended up implementing something similar to the Python code I referred to above. Just for reference:

func waitForOperation(project, zone string, op *compute.Operation) error {
    for {
        result, err := service.ZoneOperations.Get(project, zone, op.Name).Do()
        if err != nil {
            return fmt.Errorf("Failed retrieving operation status: %s", err)
        }

        if result.Status == "DONE" {
            if result.Error != nil {
                var errors []string
                for _, e := range result.Error.Errors {
                    errors = append(errors, e.Message)
                }
                return fmt.Errorf("Operation failed with error(s): %s", strings.Join(errors, ", "))
            }
            break
        }
        time.Sleep(time.Second)
    }
    return nil
}

Where service is a *compute.Service.

@tbpg do you see any drawbacks with the above?

LGTM. You might want to add a context.Context argument so it can be cancelled by the caller (didn't do this in the sample I linked above to keep it simple, maybe I should).

Something like this (untested; I also updated a couple of the error messages):

func waitForOperation(ctx context.Context, project, zone string, op *compute.Operation) error {
    ticker := time.NewTicker(1 * time.Second)
    defer ticker.Stop()

    for {
        select {
        case <-ctx.Done():
            // ctx.Err() distinguishes caller cancellation from an expired deadline.
            return fmt.Errorf("waiting for operation %q: %v", op.Name, ctx.Err())
        case <-ticker.C:
            result, err := service.ZoneOperations.Get(project, zone, op.Name).Context(ctx).Do()
            if err != nil {
                return fmt.Errorf("ZoneOperations.Get: %s", err)
            }

            if result.Status == "DONE" {
                if result.Error != nil {
                    var errors []string
                    for _, e := range result.Error.Errors {
                        errors = append(errors, e.Message)
                    }
                    return fmt.Errorf("operation %q failed with error(s): %s", op.Name, strings.Join(errors, ", "))
                }
                return nil
            }
        }
    }
}