Rasa: unable to fetch model from GCS

Created on 15 Jul 2019 · 6 Comments · Source: RasaHQ/rasa

Rasa version: 1.1.5-full (docker image)

Rasa X version (if used & relevant):

Python version: ?

Operating system (windows, osx, ...): linux

Issue: Rasa is not able to fetch the model from GCS when running on Kubernetes. I've deployed a chatbot with Kubernetes on GCP, and Rasa is trying to fetch models.tar.gz (which, obviously, is not found). GOOGLE_APPLICATION_CREDENTIALS and BUCKET_NAME (let's call it toto) are set correctly. When the bot starts, I get the following traceback.

Error (including full traceback):

2019-07-15 09:35:24 DEBUG    google.auth.transport.requests  - Making request: POST https://oauth2.googleapis.com/token
2019-07-15 09:35:25 ERROR    rasa.core.agent  - Could not load model due to 404 GET https://www.googleapis.com/download/storage/v1/b/toto/o/models.tar.gz?alt=media: ('Request failed with status code', 404, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>).
[2019-07-15 09:35:25 +0000] [1] [ERROR] Experienced exception while trying to serve
Traceback (most recent call last):
  File "/usr/local/bin/rasa", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/site-packages/rasa/__main__.py", line 76, in main
    cmdline_arguments.func(cmdline_arguments)
  File "/usr/local/lib/python3.6/site-packages/rasa/cli/run.py", line 102, in run
    rasa.run(**vars(args))
  File "/usr/local/lib/python3.6/site-packages/rasa/run.py", line 54, in run
    **kwargs
  File "/usr/local/lib/python3.6/site-packages/rasa/core/run.py", line 172, in serve_application
    app.run(host="0.0.0.0", port=port)
  File "/usr/local/lib/python3.6/site-packages/sanic/app.py", line 1096, in run
    serve(**server_settings)
  File "/usr/local/lib/python3.6/site-packages/sanic/server.py", line 742, in serve
    trigger_events(before_start, loop)
  File "/usr/local/lib/python3.6/site-packages/sanic/server.py", line 604, in trigger_events
    loop.run_until_complete(result)
  File "uvloop/loop.pyx", line 1451, in uvloop.loop.Loop.run_until_complete
  File "/usr/local/lib/python3.6/site-packages/rasa/core/run.py", line 211, in load_agent_on_start
    action_endpoint=endpoints.action,
  File "/usr/local/lib/python3.6/site-packages/rasa/core/agent.py", line 251, in load_agent
    model_server=model_server,
  File "/usr/local/lib/python3.6/site-packages/rasa/core/agent.py", line 911, in load_from_remote_storage
    persistor.retrieve(model_name, target_path)
  File "/usr/local/lib/python3.6/site-packages/rasa/nlu/persistor.py", line 51, in retrieve
    self._retrieve_tar(tar_name)
  File "/usr/local/lib/python3.6/site-packages/rasa/nlu/persistor.py", line 206, in _retrieve_tar
    blob.download_to_filename(target_filename)
  File "/usr/local/lib/python3.6/site-packages/google/cloud/storage/blob.py", line 664, in download_to_filename
    self.download_to_file(file_obj, client=client, start=start, end=end)
  File "/usr/local/lib/python3.6/site-packages/google/cloud/storage/blob.py", line 638, in download_to_file
    _raise_from_invalid_response(exc)
  File "/usr/local/lib/python3.6/site-packages/google/cloud/storage/blob.py", line 2034, in _raise_from_invalid_response
    raise exceptions.from_http_status(response.status_code, message, response=response)
google.api_core.exceptions.NotFound: 404 GET https://www.googleapis.com/download/storage/v1/b/toto/o/models.tar.gz?alt=media: ('Request failed with status code', 404, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)

Command or request that led to error:

        command:
          - rasa
          - run
          - --enable-api 
          - --model 
          - 20190715-112628.tar.gz
          - --log-file 
          - out.log 
          - --remote-storage
          - gcs
          - --credentials 
          - credentials.yaml
          - --debug

Content of configuration file (config.yml) (if relevant):

Content of domain file (domain.yml) (if relevant):


tormath1

Most helpful comment

@tormath1 Thanks for raising this issue. It is indeed a bug. Are you up for fixing this yourself and opening a PR for it?

When --enable-api is set, we validate that the model path exists (https://github.com/RasaHQ/rasa/blob/master/rasa/cli/run.py#L86). If it does not, we override the model argument with the default location, models. However, a model that is loaded from remote storage does not need to exist locally, so this check needs to be changed.
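
A minimal sketch of the behaviour change described above. The helper name `resolve_model_path` and the constant `DEFAULT_MODELS_PATH` are hypothetical and are not taken from rasa/cli/run.py; the idea is only that the local existence check is skipped whenever a remote storage is configured, so the requested model name is passed on unchanged.

```python
import os
from typing import Optional

# Default location mentioned in the comment above.
DEFAULT_MODELS_PATH = "models"


def resolve_model_path(model: str, remote_storage: Optional[str]) -> str:
    """Hypothetical helper, not Rasa's actual implementation."""
    if remote_storage:
        # The model lives in the bucket, so it does not need to exist locally.
        return model
    if os.path.exists(model):
        return model
    # Local-only runs fall back to the default location.
    return DEFAULT_MODELS_PATH
```

With this logic, `resolve_model_path("20190715-112628.tar.gz", "gcs")` keeps the requested name instead of replacing it with `models`.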

Additionally, we should check if this line still makes sense: https://github.com/RasaHQ/rasa/blob/master/rasa/nlu/persistor.py#L49
We add the ending tar.gz to the provided model name. However, it seems that the ending is also added when the name already has it.
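
One possible way to avoid the doubled ending, sketched with a hypothetical helper `tar_name` (an assumption for illustration, not the function in rasa/nlu/persistor.py): the extension is appended only when the name does not already end with it.

```python
def tar_name(model_name: str, extension: str = ".tar.gz") -> str:
    """Hypothetical helper: append the archive extension only if it is missing."""
    if model_name.endswith(extension):
        # Name already carries the extension, so leave it untouched.
        return model_name
    return model_name + extension
```

Both `tar_name("20190715-112628")` and `tar_name("20190715-112628.tar.gz")` then resolve to the same object name.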

tabergma on 15 Jul 2019

🚀2

All 6 comments

FYI, the same command on my linux machine through virtualenv:

$ rasa --version
Rasa 1.1.4
$ python -V
Python 3.7.3
$ rasa run -m 20190715-112628.tar.gz --enable-api --log-file out.log --remote-storage gcs --debug
2019-07-15 11:42:53 DEBUG    rasa.cli.run  - '20190715-112628.tar.gz' not found. Using default location 'models' instead.
2019-07-15 11:42:53 DEBUG    rasa.cli.utils  - Parameter 'endpoints' not set. Using default location 'endpoints.yml' instead.
2019-07-15 11:42:53 DEBUG    rasa.cli.utils  - Parameter 'credentials' not set. Using default location 'credentials.yml' instead.
2019-07-15 11:42:54 DEBUG    rasa.model  - Extracted model to '/tmp/tmpk70j36h8'.
2019-07-15 11:42:54 DEBUG    rasa.core.utils  - Available web server routes: 
/conversations/<conversation_id>/messages          POST                           add_message
/conversations/<conversation_id>/tracker/events    POST                           append_events
/webhooks/rasa                                     GET                            custom_webhook_RasaChatInput.health
/webhooks/rasa/webhook                             POST                           custom_webhook_RasaChatInput.receive
/webhooks/rest                                     GET                            custom_webhook_RestInput.health
/webhooks/rest/webhook                             POST                           custom_webhook_RestInput.receive
/model/test/intents                                POST                           evaluate_intents
/model/test/stories                                POST                           evaluate_stories
/conversations/<conversation_id>/execute           POST                           execute_action
/domain                                            GET                            get_domain
/                                                  GET                            hello
/model                                             PUT                            load_model
/model/parse                                       POST                           parse
/conversations/<conversation_id>/predict           POST                           predict
/conversations/<conversation_id>/tracker/events    PUT                            replace_events
/conversations/<conversation_id>/story             GET                            retrieve_story
/conversations/<conversation_id>/tracker           GET                            retrieve_tracker
/status                                            GET                            status
/model/predict                                     POST                           tracker_predict
/model/train                                       POST                           train
/model                                             DELETE                         unload_model
/version                                           GET                            version
2019-07-15 11:42:54 INFO     root  - Starting Rasa Core server on http://localhost:5005
2019-07-15 11:42:54 INFO     root  - Enabling coroutine debugging. Loop id 94659794907560.
2019-07-15 11:42:54 DEBUG    rasa.model  - Extracted model to '/tmp/tmpoiu9hitk'.
2019-07-15 11:42:54 DEBUG    rasa.model  - Extracted model to '/tmp/tmpxgv3wgg6'.
2019-07-15 11:42:54 DEBUG    pykwalify.compat  - Using yaml library: /home/mathieu/github.com/optelgroup/optelcloud/user-xp/bot/venv/lib/python3.7/site-packages/ruamel/yaml/__init__.py

It seems that something broke between 1.1.4 and 1.1.5?

tormath1 on 15 Jul 2019

Hey @tormath1, we did indeed see some breakage there between 1.1.4 and 1.1.5; if all went well, it was fixed in 1.1.6. Can you try updating and see if the issue persists?

erohmensing on 15 Jul 2019

🚀1

Hey @erohmensing, no more luck with 1.1.6 :confused:

$ rasa --version
Rasa 1.1.6
$ rasa run -m 20190715-112628.tar.gz --enable-api --log-file out.log --remote-storage gcs --debug
2019-07-15 14:21:59 DEBUG    rasa.cli.utils  - Parameter 'endpoints' not set. Using default location 'endpoints.yml' instead.
2019-07-15 14:21:59 DEBUG    rasa.cli.utils  - Parameter 'credentials' not set. Using default location 'credentials.yml' instead.
2019-07-15 14:21:59 DEBUG    rasa.cli.run  - '20190715-112628.tar.gz' not found. Using default location 'models' instead.
2019-07-15 14:22:00 DEBUG    rasa.core.utils  - Available web server routes: 
/conversations/<conversation_id>/messages          POST                           add_message
/conversations/<conversation_id>/tracker/events    POST                           append_events
/webhooks/rasa                                     GET                            custom_webhook_RasaChatInput.health
/webhooks/rasa/webhook                             POST                           custom_webhook_RasaChatInput.receive
/webhooks/rest                                     GET                            custom_webhook_RestInput.health
/webhooks/rest/webhook                             POST                           custom_webhook_RestInput.receive
/model/test/intents                                POST                           evaluate_intents
/model/test/stories                                POST                           evaluate_stories
/conversations/<conversation_id>/execute           POST                           execute_action
/domain                                            GET                            get_domain
/                                                  GET                            hello
/model                                             PUT                            load_model
/model/parse                                       POST                           parse
/conversations/<conversation_id>/predict           POST                           predict
/conversations/<conversation_id>/tracker/events    PUT                            replace_events
/conversations/<conversation_id>/story             GET                            retrieve_story
/conversations/<conversation_id>/tracker           GET                            retrieve_tracker
/status                                            GET                            status
/model/predict                                     POST                           tracker_predict
/model/train                                       POST                           train
/model                                             DELETE                         unload_model
/version                                           GET                            version
2019-07-15 14:22:00 INFO     root  - Starting Rasa Core server on http://localhost:5005
2019-07-15 14:22:00 INFO     root  - Enabling coroutine debugging. Loop id 94536706993880.
2019-07-15 14:22:00 DEBUG    root  - Could not load interpreter from 'models'.
2019-07-15 14:22:00 DEBUG    rasa.core.tracker_store  - Connected to InMemoryTrackerStore.
2019-07-15 14:22:00 DEBUG    google.auth.transport.requests  - Making request: POST https://oauth2.googleapis.com/token
2019-07-15 14:22:01 ERROR    rasa.core.agent  - Could not load model due to 404 GET https://www.googleapis.com/download/storage/v1/b/toto/o/models.tar.gz?alt=media: ('Request failed with status code', 404, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>).

Actually, the docs around Google Cloud Storage are pretty confusing. Maybe I'm missing something, but:

GOOGLE_APPLICATION_CREDENTIALS points to a valid JSON file with a sufficient role
BUCKET_NAME is recognized (we can see toto in the URL)
20190715-112628.tar.gz has been pushed manually and is at the root of the bucket

tormath1 on 15 Jul 2019

Hm okay, in 1.1.4 it looks like it may not have actually pulled the model from gcloud but rather from a local models directory, since it didn't show the google.auth.transport.requests "Making request" log. This is why we changed it: it was defaulting to pulling models from local before checking the other options.

Have you been able to load it successfully from gcloud at all?

What I think might be happening is that the name of your model is being overwritten with models. If you check the GET request, it's looking for models.tar.gz instead of your actual model name. @tabergma could you take a look?
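
To confirm which object a failing request asked for, the bucket and object name can be read straight from the URL in the error message. A small sketch, assuming the `/b/<bucket>/o/<object>` path layout visible in the log above; `requested_object` is a hypothetical helper name.

```python
from typing import Tuple
from urllib.parse import unquote, urlparse


def requested_object(download_url: str) -> Tuple[str, str]:
    """Hypothetical helper: return (bucket, object_name) from a GCS download URL."""
    path = urlparse(download_url).path
    # Split ".../b/<bucket>" from "<object>" at the "/o/" marker.
    bucket_part, _, object_part = path.partition("/o/")
    bucket = bucket_part.rsplit("/b/", 1)[1]
    # Object names are percent-encoded in the URL path.
    return bucket, unquote(object_part)


url = "https://www.googleapis.com/download/storage/v1/b/toto/o/models.tar.gz?alt=media"
print(requested_object(url))  # ('toto', 'models.tar.gz')
```

Here the object name is models.tar.gz rather than 20190715-112628.tar.gz, which matches the suspicion that the model name was replaced before the download.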

erohmensing on 15 Jul 2019

@tormath1 Thanks for raising this issue. It is indeed a bug. Are you up for fixing this yourself and opening a PR for it?