Google-cloud-ruby: Google Cloud Natural Language - Deadline Exceeded and Client not respecting timeout

Created on 20 Jun 2019 · 11 Comments · Source: googleapis/google-cloud-ruby

Hello,
Some of this may be anecdotal, but over the last several weeks we have seen an increased number of 4:Deadline Exceeded errors in our production environment, as well as more time spent calling the analyze_entities method through the Google::Cloud::Language client.

Environment details

  • OS: Mac OS 10.13.6
  • Ruby version: 2.6.3
  • Gem name and version: google-cloud-language (0.32.2)

Steps to reproduce

  1. Take example text scraped from http://marvin.cs.uidaho.edu/About/quotes.html
    (cut down to 300,000 characters, this leaves https://gist.github.com/bendillinger/5b695c00382dcbe204ca6f7833f52e41).
  2. Call analyze_entities on the text, with or without a timeout set on the client. The call fails with a 4:Deadline Exceeded error.

Code example

pry(main)> client = Google::Cloud::Language.new(credentials: Rails.root.join('config', 'google-cloud-credentials.json').to_s, timeout: 20)
pry(main)> text = File.read("sample.txt")
pry(main)> document = { content: text[0..300000], type: :PLAIN_TEXT }
pry(main)> response = client.analyze_entities(document, encoding_type: :UTF8)
Google::Gax::RetryError: GaxError Retry total timeout exceeded with exception, caused by 4:Deadline Exceeded
Caused by GRPC::DeadlineExceeded: 4:Deadline Exceeded
pry(main)> Benchmark.realtime do
pry(main)*   begin
pry(main)*     client.analyze_entities(document, encoding_type: :UTF8)
pry(main)*   rescue  
pry(main)*   end  
pry(main)* end 
600.0265560000007
pry(main)> document = { content: text[0..150000], type: :PLAIN_TEXT }
pry(main)> Benchmark.realtime{ response = client.analyze_entities(document, encoding_type: :UTF8) }
39.880080999999336

I'm not sure about the inner workings of the Natural Language API, so I cannot tell whether this is intended behavior, but according to the quotas page the API is supposed to handle 1,000,000 bytes/characters. Processing time seems to increase far faster than linearly (perhaps exponentially) with the length of the text analyzed, and I don't recall this being the case weeks or months ago.
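
One detail worth checking when comparing sizes against the quota: `text[0..300000]` in the example above is an inclusive range, so it returns 300,001 characters, and character count and byte count differ once the text contains non-ASCII characters. A minimal sketch in plain Ruby (the sample string is made up for illustration):

```ruby
# Hypothetical sample text; any UTF-8 string long enough would do.
text = "naïve café " * 50_000

inclusive = text[0..300_000] # inclusive range: 300,001 characters
exact     = text[0, 300_000] # start + length: exactly 300,000 characters

puts inclusive.length # number of characters
puts exact.length
puts exact.bytesize   # number of bytes; larger than length for multibyte text
```

Which of the two counts the quota applies to is not settled in this thread, so it may be worth measuring both.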

Also, as can be seen in the code example, the timeout is not respected, whatever its value: the call keeps running until the 10-minute hard cutoff and then returns 4:Deadline Exceeded.
For reference: the client initialization returns an instance variable of @timeout=1969-12-31 15:59:59 -0800 no matter what the input is.

Thanks for reading and let me know if anyone has had similar issues or if this is what we should be expecting.

language question

All 11 comments

I think that the client configuration is overriding the timeout setting. The timeout value passed to Language.new will be sent to GRPC::ClientStub.new, so the value is being set. But the client configuration will override that by calculating a deadline value and passing it to GRPC::ClientStub#request_response calls.

I can confirm that this is still an issue.

Pasting the contents of this gist: https://gist.github.com/hangsu/f9ebe17d62200bc58168182414dffda3

into the demo here: https://cloud.google.com/natural-language/#natural-language-api-demo

doesn't complete within 5 minutes.

If I paste half of the contents of the gist into the demo, it completes within 20 seconds.

I'm wondering why this is labeled as a question. Requests are taking 10 minutes and setting a timeout simply doesn't work. Is this not a bug?

@hangsu Do you mean a bug in this client library or a bug in the Natural Language API service?

This isn't a bug, it's just terribly confusing. The timeout value is being properly set on the grpc client object, but the API call has JSON configuration that is overriding the value set on the client. If there was no JSON configuration the client's value would be used.

Another thing to note is that this API is considered idempotent, which means a call will be retried when it does not succeed. So the timeout may be applied correctly to each RPC, but multiple RPCs are being made.

I would try passing Google::Gax::CallOptions to the API methods and see if that overrides the API configuration. You can set a lower timeout and not allow the RPC to be retried.

client = Google::Cloud::Language.new(credentials: Rails.root.join('config', 'google-cloud-credentials.json').to_s)
text = File.read("sample.txt")
document = { content: text[0..300000], type: :PLAIN_TEXT }
custom_options = Google::Gax::CallOptions.new(timeout: 20, retry_options: nil) # retry_options: nil means the RPC is not retried
response = client.analyze_entities(document, encoding_type: :UTF8, options: custom_options)

Or, you can try modifying the API method's JSON values.

config_json = File.read(Gem.find_files("google/cloud/language/v1/language_service_client_config.json").first)
custom_config = JSON.parse(config_json)
custom_config["interfaces"]["google.cloud.language.v1.LanguageService"]["retry_params"]["total_timeout_millis"] = 20000 # was 600000

client = Google::Cloud::Language.new(credentials: Rails.root.join('config', 'google-cloud-credentials.json').to_s, client_config: custom_config["interfaces"]["google.cloud.language.v1.LanguageService"])
text = File.read("sample.txt")
document = { content: text[0..300000], type: :PLAIN_TEXT }
response = client.analyze_entities(document, encoding_type: :UTF8)

@quartzmo Requests taking 10 minutes seems to be a bug in the Natural Language API service. The timeout not being respected appeared to be a bug in the client library, though @blowmage has cleared that up for me (thank you).

I'm happy to take the slow request issue somewhere more appropriate. I've used the "Send Feedback" form in Google Cloud Console, but have never gotten a response that way even after months. We spend $thousands/mo on the Cloud Language API service.

Do you know of a better avenue for reporting API service issues?

"Requests taking 10 minutes seems to be a bug with the Natural Language API service."

It looks to me like both the API service and the Ruby client are behaving correctly, given the Ruby client's default configuration. Consider which errors the API calls are retried on, what the RPC timeout is configured to, and what the total timeout is configured to.

@hangsu The only other way I know of is to open a question on Stack Overflow with the google-cloud-nl tag.

@hangsu The preferred channel for reporting service issues is: https://cloud.google.com/support/docs/issue-trackers, which has this link for Natural Language API. But before opening an issue, please consider @blowmage's comments on the timeout and retry configuration.

Thanks all, we've resolved our issue.

It appears that very large text can take longer than 20 seconds to process. The default retry configuration doesn't allow it to complete because it retries every 20 seconds until it finally fails after 10 minutes. That's why some requests take 19 seconds, and another request with 20% more text can take 10 minutes.
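
To make the arithmetic concrete, here is a rough simulation of that behavior in plain Ruby. It does not call the API or google-gax. The 20-second per-attempt deadline and the 600-second total budget are the figures from this thread; the function name and the simplification of ignoring backoff delays between attempts are assumptions of this sketch.

```ruby
# Simulates a retry loop with a per-attempt deadline and a total time budget.
# processing_time: seconds the server needs to finish one request.
# Returns [outcome, elapsed_seconds, attempts].
def simulate_call(processing_time, attempt_timeout: 20, total_timeout: 600)
  elapsed = 0
  attempts = 0
  while elapsed < total_timeout
    attempts += 1
    # A single attempt never runs past the remaining total budget.
    deadline = [attempt_timeout, total_timeout - elapsed].min
    return [:ok, elapsed + processing_time, attempts] if processing_time <= deadline
    elapsed += deadline # the attempt hit its deadline and is retried from scratch
  end
  [:deadline_exceeded, elapsed, attempts]
end

p simulate_call(19) # => [:ok, 19, 1]
p simulate_call(23) # => [:deadline_exceeded, 600, 30]
```

A request that needs 23 seconds can never finish inside a 20-second attempt, so every retry is wasted and the call only returns once the total budget is spent.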

@blowmage I first tried loading and modifying the JSON config values, but it seems like that gets overridden by the DEFAULT_TIMEOUT.

Constructing an instance of Google::Gax::RetryOptions turned out to be pretty complicated, so we ended up doing something like this and implementing our own retries:

Google::Gax::CallOptions.new(
  timeout: 30,
  retry_options: nil # don't retry
)
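
A hand-rolled retry around a non-retrying call could look something like the sketch below. The helper name `with_retries`, the attempt count, and the backoff values are hypothetical, and the block stands in for the real analyze_entities call. In real use you would probably rescue only the error classes you care about (for example GRPC::DeadlineExceeded) rather than StandardError.

```ruby
# Hypothetical helper: runs the block, retrying on the given error classes
# with exponential backoff. Re-raises the last error once attempts run out.
def with_retries(max_attempts: 3, base_delay: 1.0, on: [StandardError])
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue *on
    raise if attempt >= max_attempts
    sleep(base_delay * (2**(attempt - 1))) # base_delay, then 2x, 4x, ...
    retry
  end
end

# Usage sketch with a stand-in for the API call: fails twice, then succeeds.
result = with_retries(max_attempts: 3, base_delay: 0.01) do |attempt|
  raise "simulated timeout" if attempt < 3
  "ok after #{attempt} attempts"
end
puts result
```

Keeping the retry loop in application code makes it easier to choose a per-attempt timeout long enough for large documents.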

@quartzmo thanks, I'll try that next time.

Greetings, we're closing this. Looks like the issue got resolved. Please let us know if the issue needs to be reopened.
