Fairseq: Omegaconf & hydra-core missing dependencies

Created on 30 Sep 2020 · 11 Comments · Source: pytorch/fairseq

🐛 Bug

After installing torch, trying to load a model will fail due to two missing dependencies: omegaconf and hydra-core.

To Reproduce

Running the following code block fails:

import torch
roberta = torch.hub.load("pytorch/fairseq", "roberta.large")
roberta.eval()
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"using device: {device}")
roberta.to(device)

The traceback I get is:

Traceback (most recent call last):
  File "/src/cortex/lib/type/predictor.py", line 111, in initialize_impl
    return class_impl(**args)
  File "/mnt/project/predictor.py", line 10, in __init__
    roberta = torch.hub.load("pytorch/fairseq", "roberta.large")
  File "/opt/conda/envs/env/lib/python3.6/site-packages/torch/hub.py", line 349, in load
    hub_module = import_module(MODULE_HUBCONF, repo_dir + '/' + MODULE_HUBCONF)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/torch/hub.py", line 71, in import_module
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/root/.cache/torch/hub/pytorch_fairseq_master/hubconf.py", line 8, in <module>
    from fairseq.hub_utils import BPEHubInterface as bpe  # noqa
  File "/root/.cache/torch/hub/pytorch_fairseq_master/fairseq/__init__.py", line 17, in <module>
    import fairseq.criterions  # noqa
  File "/root/.cache/torch/hub/pytorch_fairseq_master/fairseq/criterions/__init__.py", line 26, in <module>
    importlib.import_module('fairseq.criterions.' + module)
  File "/opt/conda/envs/env/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/root/.cache/torch/hub/pytorch_fairseq_master/fairseq/criterions/adaptive_loss.py", line 12, in <module>
    from fairseq.dataclass.data_class import DDP_BACKEND_CHOICES
  File "/root/.cache/torch/hub/pytorch_fairseq_master/fairseq/dataclass/data_class.py", line 12, in <module>
    from fairseq.tasks import TASK_DATACLASS_REGISTRY
  File "/root/.cache/torch/hub/pytorch_fairseq_master/fairseq/tasks/__init__.py", line 73, in <module>
    importlib.import_module('fairseq.tasks.' + task_name)
  File "/opt/conda/envs/env/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/root/.cache/torch/hub/pytorch_fairseq_master/fairseq/tasks/language_modeling.py", line 36, in <module>
    from omegaconf import II
ModuleNotFoundError: No module named 'omegaconf'

Then, if I add omegaconf to the requirements.txt file, the next error I get is:

Traceback (most recent call last):
  File "/src/cortex/lib/type/predictor.py", line 111, in initialize_impl
    return class_impl(**args)
  File "/mnt/project/predictor.py", line 10, in __init__
    roberta = torch.hub.load("pytorch/fairseq", "roberta.large")
  File "/opt/conda/envs/env/lib/python3.6/site-packages/torch/hub.py", line 349, in load
    hub_module = import_module(MODULE_HUBCONF, repo_dir + '/' + MODULE_HUBCONF)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/torch/hub.py", line 71, in import_module
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/root/.cache/torch/hub/pytorch_fairseq_master/hubconf.py", line 8, in <module>
    from fairseq.hub_utils import BPEHubInterface as bpe  # noqa
  File "/root/.cache/torch/hub/pytorch_fairseq_master/fairseq/__init__.py", line 17, in <module>
    import fairseq.criterions  # noqa
  File "/root/.cache/torch/hub/pytorch_fairseq_master/fairseq/criterions/__init__.py", line 26, in <module>
    importlib.import_module('fairseq.criterions.' + module)
  File "/opt/conda/envs/env/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "/root/.cache/torch/hub/pytorch_fairseq_master/fairseq/criterions/adaptive_loss.py", line 12, in <module>
    from fairseq.dataclass.data_class import DDP_BACKEND_CHOICES
  File "/root/.cache/torch/hub/pytorch_fairseq_master/fairseq/dataclass/data_class.py", line 18, in <module>
    from hydra.core.config_store import ConfigStore
ModuleNotFoundError: No module named 'hydra'

Adding hydra-core to the requirements.txt fixes the issue. With both packages installed, torch works as expected. One thing I've noticed is that installing them goes through a complex-looking stage of compiling things with g++ (it takes some time).
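For reference, the workaround on my side was just the two extra entries in requirements.txt. A minimal sketch, assuming nothing else pins these packages:

```
omegaconf
hydra-core
```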

Expected behavior

Expected it to work without having to install omegaconf and hydra-core. The provided example worked without these a couple of months ago - maybe something changed in torch?

Environment

Installed torch the following way:

pip install --no-cache-dir --find-links https://download.pytorch.org/whl/torch_stable.html torch==1.6.0+cu101 torchvision==0.7.0+cu101
  • OS (e.g., Linux): Linux.
  • Python version: 3.6.9.
  • CUDA/cuDNN version: 10.1
  • GPU models and configuration: A single T4 GPU.

Additional context

As fixed in https://github.com/cortexlabs/cortex/pull/1402. Suggested by @omry to post here.

bug

All 11 comments

@RobertLucian, the stack trace suggests that you do have a version of fairseq installed that depends on Hydra.
Can you show a complete repro including how you create the environment?
(say, a clean conda environment).

Thanks for flagging, torch.hub uses its own mechanism for specifying dependencies, and we need to add hydra there: https://github.com/pytorch/fairseq/blob/master/hubconf.py#L13-L18
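For context, torch.hub reads a module-level `dependencies` list from the repo's hubconf.py and verifies that each name is importable before loading any entry points. A hedged sketch of what adding Hydra there would look like (the exact entries in fairseq's real hubconf.py may differ; note that the pip package hydra-core installs the module `hydra`, and `dependencies` must list module names):

```python
# Sketch of a hubconf.py dependency list. torch.hub checks that every
# entry here is an importable module name before it runs the repo's hub
# entry points, so hydra-core is listed under its module name "hydra".
dependencies = [
    "hydra",      # pip package: hydra-core
    "omegaconf",
    "torch",
]
```

If any entry is not importable, torch.hub.load raises a RuntimeError naming the missing modules instead of failing later with a ModuleNotFoundError mid-import, which is exactly the failure mode in the tracebacks above.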

Thanks @myleott, I spotted it at the repo root and I thought it might be the case.

@myleott thanks a lot for your quick resolution on this!
@omry in this case, is a complete repro including the process of creating the environment still necessary?

No need, as we have identified the root cause.

@RobertLucian, now that we are removing the unneeded Hydra dependency in Cortex - I recommend that you take a look at it anyway and consider using it.

https://hydra.cc

Thanks @omry, I'll have a look at it.

This should be fixed by f902a363abc578906f29239f995cacce5e93a807

@myleott thanks for the fix.

Any idea when this is gonna land (or is it just a patch and is already available)?
Also, is there an ETA of when this will make it in torch (as a dependency)?

Actually, that didn't totally fix it 😕 Just to clarify: it seems hydra-core will be a required dependency for torch.hub usage going forward.

Any idea when this is gonna land (or is it just a patch and is already available)?
Also, is there an ETA of when this will make it in torch (as a dependency)?

When it lands it will take effect immediately; you'll just need to do torch.hub.load(..., force_reload=True).

This should be fixed. Please use torch.hub.load(..., force_reload=True). Note that dataclasses and hydra-core are now required dependencies, so you'll need to pip install them if you haven't already.
