Client: Pubsub
Environment: Docker container running on a GCE instance.
Expected behavior: no intermittent auth errors.
Actual behavior: we occasionally see
rpc error: code = Unauthenticated desc = Request had invalid authentication
credentials. Expected OAuth 2 access token, login cookie or other valid authentication
credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.
From sub.Receive. If we wrap the Receive call in a retry, this issue clears up within about a second.
Running on 0.25.0 of this library.
Also have seen rpc error: code = Unavailable desc = Authentication backend unavailable.
And rpc error: code = Unavailable desc = there is no connection available
As well as rpc error: code = Unavailable desc = the connection is draining.
Would this last one occur because we cancelled the context?
Are you using a service account or the default credentials? You should always use a service account in production. See #1047.
We are definitely using a service account with the instance.
Do we need to do anything special when creating the pubsub client and receiving to use the service account credentials?
@jba what's the file name?
If we're running the instance under the service account, do we need to do that?
@broady @jadekler @zombiezen any thoughts here?
Hi @nhooyr. When you create a service account, the next step is to create a key for that service account. The key is downloaded in the form of a JSON file.
Now that you have a json file - let's say it lives at /private/creds.json - you can do:
client, err := pubsub.NewClient(ctx, projID, option.WithCredentialsFile("/private/creds.json"))
Without this option, it's likely using default credentials which I believe are more transient - and therefore liable to failure (spitballing, but it's a reasonable guess as to why you're seeing errors I think).
If we're running the instance under the service account, do we need to do that?
I believe so, yes.
@jadekler
That seems wrong to me. Your linked documentation says
To use a service account outside of the Google Cloud Platform (on other platforms or on premise), you must establish the identity of the service account. Public/private key pairs will let you do that.
Those keys are only for when using a service account outside of GCP.
We SSHed into an instance where these intermittent errors were occurring and ran:
$ toolbox
$ gcloud auth list
And there was only one account listed, the service account. So I don't think it's because we're using the wrong credentials.
@nhooyr it's quite common to use service accounts like this even within GCP. You /must/ use a service account key outside GCP. When running inside GCP, you can use the metadata server, but you can /also/ use a service account key. Additionally, using the metadata server /does/ mean you end up using a service account (the default service account).
How often is this error happening? What percentage of requests? Or does it happen periodically?
If you could run:
creds, err := google.FindDefaultCredentials(ctx)
log.Print(creds.ProjectID == "", err)
How often is this error happening? What percentage of requests? Or does it happen periodically?
Have not measured how often but it occurs periodically for a second or so and then just stops.
Will run the code soon and get back to you.
@broady that returns
2018/08/09 17:38:10 false <nil>
(I'm @nhooyr's coworker)
Here's our data around when this occurs: https://docs.google.com/spreadsheets/d/1eh_rFRMLl3nMAxpuWaxUYIOZ32l2Qabke2yQk_ThWXg/edit#gid=1967454854
https://bigquery.cloud.google.com/table/coder-production:public_debug.failed_to_receive?tab=schema
In case anyone cannot access @ammario's BigQuery link, you'll need to enable the BigQuery API and set up billing on gcloud.
I _think_ that shows that you're using default credentials, though I'm less familiar with auth. Friendly ping to @broady.
Others have also reported similar issues in Java and Python. I think it's possible this might be caused by a server-side error, but I don't have a clue how to start investigating this :(
That looks a lot more frequent than I'd expect. Your auth looks good (looks like a service account).
I agree, @pongad, this does look like a backend problem. I wonder if there are any logs on the backend that would be useful.
@ammario @nhooyr if you have a support account with GCP, could you please file a support ticket and link to this thread?
I believe I figured this out: it turns out we had a context with a timeout being passed in to the client constructor.
Nvm, still seeing this rpc error: code = Unauthenticated desc = Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.
Well, this is embarrassing, we were closing the pubsub client early.
Whoops! Generally, you'd just keep the pub/sub client open for the duration of your program's life.
That wasn't it; we still saw this. The actual issue was that we were calling Receive multiple times for the same subscription ID, and that was causing problems. Because of the way we structured our app, it was easier to do multiple Receive calls than to do a single one and thread the messages throughout our app.
Fixed now by using only a single Receive.