Rasa: ConveRT model fails to load due to updated TensorFlow Hub module URL

Created on 22 Sep 2020 · 10 Comments · Source: RasaHQ/rasa

Rasa version:

Python version:

3.7.7

Operating system (windows, osx, ...):

Tested on OS X and Linux; probably occurs on Windows too

Issue:

The hosting location of the ConveRT TensorFlow Hub module has changed. It is now available from the following URL:

https://github.com/PolyAI-LDN/polyai-models/releases/download/v1.0/model.tar.gz

The old hosting location has been taken down. This means that any Rasa model currently using ConveRT without the TFHub module cached cannot be loaded.
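
To check whether a machine still has the module cached, one can compute where tensorflow_hub would be expected to store it. The sketch below rests on assumptions not stated in this report: that the cache directory is `TFHUB_CACHE_DIR` or, failing that, `tfhub_modules` under the system temp directory, and that each module lives in a subdirectory named after the SHA-1 hex digest of its URL. The helper name `expected_tfhub_cache_path` is hypothetical.

```python
import hashlib
import os
import tempfile


def expected_tfhub_cache_path(handle, cache_dir=None):
    """Return the directory where tensorflow_hub is assumed to cache `handle`."""
    if cache_dir is None:
        # Assumption: TFHUB_CACHE_DIR wins, else <system temp dir>/tfhub_modules.
        cache_dir = os.environ.get(
            "TFHUB_CACHE_DIR",
            os.path.join(tempfile.gettempdir(), "tfhub_modules"),
        )
    # Assumption: the cache subdirectory is named after the SHA-1 of the handle.
    digest = hashlib.sha1(handle.encode("utf8")).hexdigest()
    return os.path.join(cache_dir, digest)


# The URL hard-coded in the affected ConveRTTokenizer (see the traceback below).
old_url = "http://models.poly-ai.com/convert/v1/model.tar.gz"
path = expected_tfhub_cache_path(old_url)
print(path, "(cached)" if os.path.isdir(path) else "(not cached)")
```

If the directory exists, loading may still succeed from the cache; if it does not, the load has to hit the network and will fail with the 404 shown below.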

Error (including full traceback):

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/site-packages/rasa/utils/train_utils.py in load_tf_hub_model(model_url)
    168     try:
--> 169         return tfhub.load(model_url)
    170     except OSError:

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow_hub/module_v2.py in load(handle, tags)
     96     raise ValueError("Expected a string, got %s" % handle)
---> 97   module_path = resolve(handle)
     98   is_hub_module_v1 = tf.io.gfile.exists(

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow_hub/module_v2.py in resolve(handle)
     52   """
---> 53   return registry.resolver(handle)
     54 

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow_hub/registry.py in __call__(self, *args, **kwargs)
     41       if impl.is_supported(*args, **kwargs):
---> 42         return impl(*args, **kwargs)
     43     raise RuntimeError(

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow_hub/compressed_module_resolver.py in __call__(self, handle)
     87     return resolver.atomic_download(handle, download, module_dir,
---> 88                                     self._lock_file_timeout_sec())
     89 

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow_hub/resolver.py in atomic_download(handle, download_fn, module_dir, lock_file_timeout_sec)
    414     tf_v1.gfile.MakeDirs(tmp_dir)
--> 415     download_fn(handle, tmp_dir)
    416     # Write module descriptor to capture information about which module was

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow_hub/compressed_module_resolver.py in download(handle, tmp_dir)
     82       request = url.Request(_append_compressed_format_query(handle))
---> 83       response = self._call_urlopen(request)
     84       return resolver.DownloadManager(handle).download_and_uncompress(

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow_hub/compressed_module_resolver.py in _call_urlopen(self, request)
     95     # Overriding this method allows setting SSL context in Python 3.
---> 96     return url.urlopen(request)
     97 

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    221         opener = _opener
--> 222     return opener.open(url, data, timeout)
    223 

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py in open(self, fullurl, data, timeout)
    530             meth = getattr(processor, meth_name)
--> 531             response = meth(req, response)
    532 

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py in http_response(self, request, response)
    640             response = self.parent.error(
--> 641                 'http', request, response, code, msg, hdrs)
    642 

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py in error(self, proto, *args)
    568             args = (dict, 'default', 'http_error_default') + orig_args
--> 569             return self._call_chain(*args)
    570 

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
    502             func = getattr(handler, meth_name)
--> 503             result = func(*args)
    504             if result is not None:

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
    648     def http_error_default(self, req, fp, code, msg, hdrs):
--> 649         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    650 

HTTPError: HTTP Error 404: Not Found

During handling of the above exception, another exception occurred:

HTTPError                                 Traceback (most recent call last)
<ipython-input-6-e85fa5c3071d> in <module>
----> 1 tokenizer = ConveRTTokenizer()

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/site-packages/rasa/nlu/tokenizers/convert_tokenizer.py in __init__(self, component_config)
     31 
     32         model_url = "http://models.poly-ai.com/convert/v1/model.tar.gz"
---> 33         self.module = train_utils.load_tf_hub_model(model_url)
     34 
     35         self.tokenize_signature = self.module.signatures["tokenize"]

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/site-packages/rasa/utils/train_utils.py in load_tf_hub_model(model_url)
    171         directory = io_utils.create_temporary_directory()
    172         os.environ["TFHUB_CACHE_DIR"] = directory
--> 173         return tfhub.load(model_url)
    174 
    175 

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow_hub/module_v2.py in load(handle, tags)
     95   if not isinstance(handle, six.string_types):
     96     raise ValueError("Expected a string, got %s" % handle)
---> 97   module_path = resolve(handle)
     98   is_hub_module_v1 = tf.io.gfile.exists(
     99       native_module.get_module_proto_path(module_path))

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow_hub/module_v2.py in resolve(handle)
     51     A string representing the Module path.
     52   """
---> 53   return registry.resolver(handle)
     54 
     55 

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow_hub/registry.py in __call__(self, *args, **kwargs)
     40     for impl in reversed(self._impls):
     41       if impl.is_supported(*args, **kwargs):
---> 42         return impl(*args, **kwargs)
     43     raise RuntimeError(
     44         "Missing implementation that supports: %s(*%r, **%r)" % (

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow_hub/compressed_module_resolver.py in __call__(self, handle)
     86 
     87     return resolver.atomic_download(handle, download, module_dir,
---> 88                                     self._lock_file_timeout_sec())
     89 
     90   def _lock_file_timeout_sec(self):

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow_hub/resolver.py in atomic_download(handle, download_fn, module_dir, lock_file_timeout_sec)
    413     logging.info("Downloading TF-Hub Module '%s'.", handle)
    414     tf_v1.gfile.MakeDirs(tmp_dir)
--> 415     download_fn(handle, tmp_dir)
    416     # Write module descriptor to capture information about which module was
    417     # downloaded by whom and when. The file stored at the same level as a

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow_hub/compressed_module_resolver.py in download(handle, tmp_dir)
     81       """Fetch a module via HTTP(S), handling redirect and download headers."""
     82       request = url.Request(_append_compressed_format_query(handle))
---> 83       response = self._call_urlopen(request)
     84       return resolver.DownloadManager(handle).download_and_uncompress(
     85           response, tmp_dir)

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/site-packages/tensorflow_hub/compressed_module_resolver.py in _call_urlopen(self, request)
     94   def _call_urlopen(self, request):
     95     # Overriding this method allows setting SSL context in Python 3.
---> 96     return url.urlopen(request)
     97 
     98 

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    220     else:
    221         opener = _opener
--> 222     return opener.open(url, data, timeout)
    223 
    224 def install_opener(opener):

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py in open(self, fullurl, data, timeout)
    529         for processor in self.process_response.get(protocol, []):
    530             meth = getattr(processor, meth_name)
--> 531             response = meth(req, response)
    532 
    533         return response

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py in http_response(self, request, response)
    639         if not (200 <= code < 300):
    640             response = self.parent.error(
--> 641                 'http', request, response, code, msg, hdrs)
    642 
    643         return response

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py in error(self, proto, *args)
    567         if http_err:
    568             args = (dict, 'default', 'http_error_default') + orig_args
--> 569             return self._call_chain(*args)
    570 
    571 # XXX probably also want an abstract factory that knows when it makes

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
    501         for handler in handlers:
    502             func = getattr(handler, meth_name)
--> 503             result = func(*args)
    504             if result is not None:
    505                 return result

~/.pyenv/versions/3.7.7/Python.framework/Versions/3.7/lib/python3.7/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
    647 class HTTPDefaultErrorHandler(BaseHandler):
    648     def http_error_default(self, req, fp, code, msg, hdrs):
--> 649         raise HTTPError(req.full_url, code, msg, hdrs, fp)
    650 
    651 class HTTPRedirectHandler(BaseHandler):

HTTPError: HTTP Error 404: Not Found

Command or request that led to error:

import rasa
from rasa.nlu.tokenizers.convert_tokenizer import ConveRTTokenizer
tokenizer = ConveRTTokenizer()

Or anything that requires loading the ConveRT model (including loading previously trained models that use the ConveRT TFHub model).

All 10 comments

I've created a PR updating the model URL: https://github.com/RasaHQ/rasa/pull/6742

I've also opened an issue on the PolyAI models repository asking them to 301 redirect the old model URLs to the new model URLs 🙂

@connorbrinton The error still persists.

Any updates on this?

I am having the same error.

Please update to 1.10.14. Bug fixed there.

@gbax Sorry but I'm confused, what should be updated to 1.10.14?

I think he's talking about Rasa 1.10.14:
pip install rasa==1.10.14

https://github.com/PolyAI-LDN/polyai-models/releases/download/v1.0/model.tar.gz
is also no longer available.
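
Before pointing Rasa at a model URL, it can help to confirm that the URL still resolves. This is a minimal sketch using only the standard library; the function name `url_status` and the injectable `opener` parameter are hypothetical, and some hosts may answer a HEAD request differently from the GET that tensorflow_hub performs.

```python
import urllib.error
import urllib.request


def url_status(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code for `url`, or None if no response arrived."""
    # HEAD avoids downloading the (large) model archive.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Raised for non-2xx answers, e.g. 404 once a file is taken down.
        return err.code
    except urllib.error.URLError:
        # No HTTP answer at all (DNS failure, refused connection, ...).
        return None


# Example call (performs a real network request, so it is left commented out):
# print(url_status("https://github.com/PolyAI-LDN/polyai-models/releases/download/v1.0/model.tar.gz"))
```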

PolyAI just decided to take the ConveRT models down from public access:
https://github.com/PolyAI-LDN/polyai-models

Do we have any workaround?

The previously publicly available ConveRT models appear to have been licensed under the Apache 2.0 license, making redistribution permissible. If anyone has the official files for the ConveRT models, it would be great to have them redistributed under the same license here.

I've repackaged the loaded model I have running in production, and released it here (under the Apache 2.0 license):
https://github.com/connorbrinton/polyai-models/releases/tag/v1.0

@connorbrinton I tried using your repackaged model:
2020-09-29 13:11:05 INFO root - Starting Rasa server on http://localhost:3003
2020-09-29 13:11:10 INFO absl - Using /var/tmp/tfhub_modules to cache modules.
2020-09-29 13:11:10 INFO absl - Downloading TF-Hub Module 'https://github.com/PolyAI-LDN/polyai-models/releases/download/v1.0/model.tar.gz'.
2020-09-29 13:11:12 ERROR rasa.core.agent - Could not load model due to HTTP Error 404: Not Found.
[2020-09-29 13:11:12 +0530] [6] [ERROR] Experienced exception while trying to serve
Traceback (most recent call last):

But facing the same 404 error!!

You need to change the URL https://github.com/PolyAI-LDN/polyai-models/releases/download/v1.0/model.tar.gz to https://github.com/connorbrinton/polyai-models/releases/download/v1.0/model.tar.gz in your pipeline.
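
The swap described above can be sketched as a plain text replacement over the pipeline configuration. The thread does not show what the pipeline entry looks like, so the `model_url` key and the sample fragment below are assumptions, and `swap_model_url` is a hypothetical helper; check the documentation for your Rasa version to see whether the URL is configurable at all.

```python
OLD_URL = "https://github.com/PolyAI-LDN/polyai-models/releases/download/v1.0/model.tar.gz"
NEW_URL = "https://github.com/connorbrinton/polyai-models/releases/download/v1.0/model.tar.gz"


def swap_model_url(config_text, old=OLD_URL, new=NEW_URL):
    """Return (updated_text, count) with every occurrence of `old` replaced."""
    count = config_text.count(old)
    return config_text.replace(old, new), count


# Hypothetical pipeline fragment; the `model_url` key is an assumption.
sample = (
    "pipeline:\n"
    "  - name: ConveRTTokenizer\n"
    f"    model_url: {OLD_URL}\n"
)
updated, replaced = swap_model_url(sample)
print(f"replaced {replaced} occurrence(s)")
print(updated)
```

A model trained before the change presumably still carries the old URL in its stored metadata, which would explain the 404 in the log above; retraining after editing the pipeline is the safer route.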
