The third line of this block (from https://github.com/GoogleCloudPlatform/google-cloud-go/blob/master/internal/retry.go#L49) is problematic:
if cerr := sleep(ctx, p); cerr != nil {
if lastErr != nil {
return fmt.Errorf("%v; last function err: %v", cerr, lastErr)
}
return cerr
}
Oftentimes we see errors like this (e.g. from the datastore client):
context deadline exceeded; last function err: rpc error: code = DeadlineExceeded desc = context deadline exceeded
The problem is that a grpc error (DeadlineExceeded) was returned by the actual datastore call, but then this was wrapped in a string by the call to fmt.Errorf. So when the error above is finally returned to our application code, there is nothing we can do (short of gross string matching) to inspect the error and understand what happened. In particular all of our usual tools for detecting whether the error is some kind of timeout (e.g. https://godoc.org/google.golang.org/grpc/status#FromError) are useless.
I don't have a strong feeling about how to solve this; anything that would give me access to the original error (lastErr) is fine. Personally I would probably just simplify the code to:
if lastErr != nil {
return lastErr
}
return cerr
I have an idea about how to fix that. I'll see if I have time today.
https://code-review.googlesource.com/c/gocloud/+/19151. See if that works for you.
馃挴 thanks for the super fast turnaround @jba !
Most helpful comment
I have an idea about how to fix that. I'll see if I have time today.