Mujoco-py: Build error when used with Singularity

Created on 9 Apr 2020 · 18 Comments · Source: openai/mujoco-py

Nov 30 2020: Now, there is a solution for this. Please refer to my last comment.

Short Description: When mujoco_py is imported within a Singularity container, an error occurs.

Details: When mujoco_py is imported within a Singularity container, fasteners.InterProcessLock(lockpath) at Line 89 of builder.py fails with the error message "OSError: [Errno 30] Read-only file system".

This error occurs because fasteners.InterProcessLock(lockpath) at Line 89 of builder.py tries to create a lock file under "/usr/local/lib/python3.6/dist-packages/mujoco_py-2.0.2.9-py3.6.egg/mujoco_py/generated/", which is read-only inside the container. The lines around it appear to handle the compilation and loading of the mujoco_py Cython extensions. However, the error occurs even if mujoco_py has already been built at container build time.
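To confirm this diagnosis from inside the container, one can test whether a file can be created in that directory. The following is a minimal sketch, not part of mujoco-py: the helper name is_dir_writable is hypothetical, and the path is the one from the error message, so it needs adjusting for other installs.

```python
import os
import tempfile


def is_dir_writable(directory):
    """Return True if a file can actually be created in `directory`."""
    try:
        # Creating a real temporary file exercises the same operation
        # that the lock file needs (opening a new file for writing).
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers read-only file systems, missing directories and
        # permission errors alike.
        return False


if __name__ == "__main__":
    # Path taken from the error message; adjust it for your install.
    generated = ("/usr/local/lib/python3.6/dist-packages/"
                 "mujoco_py-2.0.2.9-py3.6.egg/mujoco_py/generated")
    print(generated, "writable:", is_dir_writable(generated))
```

If this reports that the directory is not writable, any attempt to create the lock file there will fail, regardless of whether the extension is already built.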

Whole Error Message:
Traceback (most recent call last):
  File "", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/mujoco_py-2.0.2.9-py3.6.egg/mujoco_py/__init__.py", line 3, in <module>
    from mujoco_py.builder import cymj, ignore_mujoco_warnings, functions, MujocoException
  File "/usr/local/lib/python3.6/dist-packages/mujoco_py-2.0.2.9-py3.6.egg/mujoco_py/builder.py", line 510, in <module>
    cymj = load_cython_ext(mujoco_path)
  File "/usr/local/lib/python3.6/dist-packages/mujoco_py-2.0.2.9-py3.6.egg/mujoco_py/builder.py", line 89, in load_cython_ext
    with fasteners.InterProcessLock(lockpath):
  File "/usr/local/lib/python3.6/dist-packages/fasteners/process_lock.py", line 179, in __enter__
    gotten = self.acquire()
  File "/usr/local/lib/python3.6/dist-packages/fasteners/process_lock.py", line 156, in acquire
    self._do_open()
  File "/usr/local/lib/python3.6/dist-packages/fasteners/process_lock.py", line 128, in _do_open
    self.lockfile = open(self.path, 'a')
OSError: [Errno 30] Read-only file system: b'/usr/local/lib/python3.6/dist-packages/mujoco_py-2.0.2.9-py3.6.egg/mujoco_py/generated/mujocopy-buildlock'

Environment:

  • OS: Ubuntu 18.04
  • Python Version: 3.6.8
  • Mujoco Version: 2.00
  • mujoco-py version: 2.0.2.5

How to Reproduce: Build a Singularity image with the following def file, and then, run python -c "import mujoco_py" in the container.

----- mujoco.def -----

From: tensorflow/tensorflow:1.15.0-gpu-py3-jupyter

%files
./mjkey.txt /opt/mjkey.txt

%environment
export LANG=C.UTF-8
export LD_LIBRARY_PATH=/opt/mujoco200/bin:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH=/usr/local/nvidia/lib64:${LD_LIBRARY_PATH}
export MUJOCO_PY_MUJOCO_PATH=/opt/mujoco200
export MUJOCO_PY_MJKEY_PATH=/opt/mujoco200/mjkey.txt

%post
export LD_LIBRARY_PATH=/opt/mujoco200/bin:${LD_LIBRARY_PATH}

apt-get update -q \
  && DEBIAN_FRONTEND=noninteractive apt-get install -y \
  curl \
  git \
  libgl1-mesa-dev \
  libgl1-mesa-glx \
  libglew-dev \
  libosmesa6-dev \
  software-properties-common \
  net-tools \
  unzip \
  vim \
  wget \
  xpra \
  xserver-xorg-dev

  apt-get clean
  rm -rf /var/lib/apt/lists/*

# --- Installing PatchELF ---
curl -o /usr/local/bin/patchelf https://s3-us-west-2.amazonaws.com/openai-sci-artifacts/manual-builds/patchelf_0.9_amd64.elf
chmod +x /usr/local/bin/patchelf

# --- Installing MuJoCo ---

cd /opt
wget https://www.roboti.us/download/mujoco200_linux.zip -O mujoco.zip
unzip mujoco.zip -d /opt
mv /opt/mujoco200_linux /opt/mujoco200
mv /opt/mjkey.txt /opt/mujoco200
rm mujoco.zip

export MUJOCO_PY_MUJOCO_PATH=/opt/mujoco200
export MUJOCO_PY_MJKEY_PATH=/opt/mujoco200/mjkey.txt

# --- Installing mujoco-py ---

cd /opt
git clone https://github.com/openai/mujoco-py.git
cd mujoco-py
pip install --no-cache-dir -r requirements.txt
pip install --no-cache-dir -r requirements.dev.txt
python setup.py build install
cd ..

# Initial build of mujoco_py.
python -c 'import mujoco_py'

All 18 comments

The issue is that mujoco-py tries to create a lock file even when locking is unnecessary. A possible workaround is to rewrite Lines 87 to 106 of builder.py as follows, so that the lock is taken only when a rebuild is actually needed.

    lockpath = os.path.join(os.path.dirname(cext_so_path), 'mujocopy-buildlock')

    mod = None
    force_rebuild = os.environ.get('MUJOCO_PY_FORCE_REBUILD')
    if force_rebuild:
        with fasteners.InterProcessLock(lockpath):
            # Try to remove the old file, ignore errors if it doesn't exist
            print("Removing old mujoco_py cext", cext_so_path)
            try:
                os.remove(cext_so_path)
            except OSError:
                pass
    if exists(cext_so_path):
        try:
            mod = load_dynamic_ext('cymj', cext_so_path)
        except ImportError:
            print("Import error. Trying to rebuild mujoco_py.")
    if mod is None:
        with fasteners.InterProcessLock(lockpath):
            cext_so_path = builder.build()
            mod = load_dynamic_ext('cymj', cext_so_path)

I experienced the same issue when packaging mujoco for the Arch User Repository. This is the patch I applied (lines 27 - 33).

Thank you Eric for sharing the patch. It is definitely useful.

For people looking for a quick workaround, use the following:

git clone https://github.com/tadashiK/mujoco-py
cp ~/.mujoco/mjkey.txt ./
cd mujoco-py
sudo singularity build tf-2.1.sif tf-2.1.def
rm ../mjkey.txt

Updated on Nov 30 2020 based on btjanaka's idea. Now, please use the following:

git clone https://github.com/tadashiK/mujoco-py
cp ~/.mujoco/mjkey.txt mujoco-py/singularity
cd mujoco-py/singularity
sudo singularity build tf-2.1.sif tf-2.1.def
rm mjkey.txt

@edlanglois the patch is not found under your link.

@piojanu I've updated the link. In case it breaks in the future my change is the following:
Below cext_so_path = builder.get_so_file_path() in builder.py

   # Check if we have write access to the cext_so_path.
   # If not, it's probably because mujoco-py has been installed and everything is
   # read-only. Returning here is necessary because the lock creation will fail.
   # It might be better to try-catch the lock but this minimizes the diff complexity.
   if not os.access(os.path.dirname(cext_so_path), os.W_OK):
       return load_dynamic_ext('cymj', cext_so_path)

This assumes the mujoco_py extension object file has already been built, presumably during the install process.
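The comment in the patch mentions that catching the lock failure might be a better approach. A rough sketch of that alternative is shown here, under stated assumptions: load_or_build is a hypothetical helper, not mujoco-py code; make_lock is assumed to return an object with acquire() and release(), as fasteners.InterProcessLock does; load and build stand in for load_dynamic_ext and builder.build.

```python
import errno
import os


def load_or_build(so_path, make_lock, load, build):
    """Load a prebuilt extension if the build lock cannot be created.

    make_lock() returns a lock with acquire()/release(); load(path) and
    build() are stand-ins for the real loader and builder.
    """
    lock = make_lock()
    try:
        lock.acquire()
    except OSError as exc:
        # Fall back only for "cannot write here" errors, and only when
        # a prebuilt extension is actually available to load.
        cannot_write = exc.errno in (errno.EROFS, errno.EACCES)
        if cannot_write and os.path.exists(so_path):
            return load(so_path)
        raise
    try:
        # Lock held: build if needed, then load.
        if not os.path.exists(so_path):
            so_path = build()
        return load(so_path)
    finally:
        lock.release()
```

Unlike the os.access check, this version only skips the lock after the lock has really failed, at the cost of a slightly larger diff.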

Would it work to point the lockpath to a directory that is not read-only, such as /tmp?

@btjanaka I think it would work although I have not tried it.

@btjanaka I tested it and confirmed that it works. One thing to note is that you still need to build mujoco-py during the build of the Singularity image. To do so, the Singularity definition file must run python3 -c 'import mujoco_py' when mujoco-py is installed with pip install 'mujoco-py<2.1,>=2.0'. That step is unnecessary, though, when mujoco-py is installed with

git clone https://github.com/openai/mujoco-py.git
cd mujoco-py
pip install --no-cache-dir -r requirements.txt
pip install --no-cache-dir -r requirements.dev.txt
python setup.py build install

For people looking for a quick workaround

Please use the following:

git clone https://github.com/tadashiK/mujoco-py
cp ~/.mujoco/mjkey.txt mujoco-py/singularity
cd mujoco-py/singularity
sudo singularity build tf-2.1.sif tf-2.1.def
rm mjkey.txt

If necessary, modify tf-2.1.def in mujoco-py/singularity folder in my fork of mujoco-py.

For people who want to know how to avoid this issue

First, change the lockpath line of builder.py (line 87) to something like lockpath = os.path.join('/tmp', 'mujocopy-buildlock'). ('/tmp' can be any other writable directory.)

Second, write your own Singularity definition file. (An example can be found here.) Note that you need to build mujoco-py during the build of the Singularity image. To do so, the definition file must run python3 -c 'import mujoco_py' when mujoco-py is installed with pip install 'mujoco-py<2.1,>=2.0'. That step is unnecessary when mujoco-py is installed with

git clone https://github.com/openai/mujoco-py.git
cd mujoco-py
pip install --no-cache-dir -r requirements.txt
pip install --no-cache-dir -r requirements.dev.txt
python setup.py build install

Note that after the installation, you need to replace mujoco-py/mujoco_py/builder.py in a site-packages directory with the modified one explained above. Or make your own fork of mujoco-py, modify builder.py as above, and use it.
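For readers maintaining their own fork, the lock-path change described above could also be written so that the original location is kept whenever it is writable. This is only a sketch under assumptions: choose_lock_path is a hypothetical helper that does not exist in mujoco-py, and it uses the same os.access check as the patch posted earlier in this thread.

```python
import os
import tempfile


def choose_lock_path(preferred_dir, filename="mujocopy-buildlock"):
    """Return a lock path in preferred_dir if writable, else in the temp dir."""
    # tempfile.gettempdir() honours TMPDIR and usually resolves to /tmp.
    for directory in (preferred_dir, tempfile.gettempdir()):
        if os.path.isdir(directory) and os.access(directory, os.W_OK):
            return os.path.join(directory, filename)
    raise OSError("no writable directory found for the build lock")


if __name__ == "__main__":
    # Hypothetical usage; in builder.py the preferred directory would be
    # os.path.dirname(cext_so_path).
    print(choose_lock_path("/usr/local/lib/python3.6/dist-packages"))
```

Moving the lock only changes where the lock file lives. The compiled extension must still exist in the read-only package directory, which is why the build step during image creation remains necessary.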

@tadashiK Thank you for your efforts into making it possible to compile/use MuJoCo in a Singularity container.
I have a question for you. I wanted to ask my question in your repo but your repo's issues are closed so I ask here. I wonder, do I have to change anything in your repo in order to compile MuJoCo in a container with CUDA 11.x? I'm asking this because your container seems to be using CUDA 10.x and I'm not sure if any of the modifications you have in your repo would only work on CUDA 10.x.

@tadashiK I just built mujoco_py using your fork of mujoco-py, following all of the steps you took to build your Singularity container. However, I am still getting the following errors. Do you know what I could be doing wrong?

Note that line 87 of /usr/local/lib/python3.7/dist-packages/mujoco_py-2.0.2.13-py3.7.egg/mujoco_py/builder.py has been changed to lockpath = os.path.join('/tmp', 'mujocopy-buildlock') since I am using your fork.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/dist-packages/mujoco_py-2.0.2.13-py3.7.egg/mujoco_py/__init__.py", line 3, in <module>
    from mujoco_py.builder import cymj, ignore_mujoco_warnings, functions, MujocoException
  File "/usr/local/lib/python3.7/dist-packages/mujoco_py-2.0.2.13-py3.7.egg/mujoco_py/builder.py", line 510, in <module>
    cymj = load_cython_ext(mujoco_path)
  File "/usr/local/lib/python3.7/dist-packages/mujoco_py-2.0.2.13-py3.7.egg/mujoco_py/builder.py", line 89, in load_cython_ext
    with fasteners.InterProcessLock(lockpath):
  File "/usr/local/lib/python3.7/dist-packages/fasteners/process_lock.py", line 158, in __enter__
    gotten = self.acquire()
  File "/usr/local/lib/python3.7/dist-packages/fasteners/process_lock.py", line 135, in acquire
    self._do_open()
  File "/usr/local/lib/python3.7/dist-packages/fasteners/process_lock.py", line 107, in _do_open
    self.lockfile = open(self.path, 'a')
PermissionError: [Errno 13] Permission denied: b'/tmp/mujocopy-buildlock'

CUDA version should not be a problem, I think. I am using a newer version of Tensorflow docker image as a base image and have not encountered any issue.

As for the "permission denied" issue, would you clean up /tmp? I remember encountering the same issue before, but I forgot how I managed to solve it... As far as I remember, the cause was something very simple, such as a mujocopy-buildlock file left in /tmp for some reason.
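Before deleting anything, it can help to see who owns a leftover lock file and whether the current user can open it. The following is a small diagnostic sketch, not part of mujoco-py; describe_lock_file is a hypothetical helper, and it opens the file in append mode because that is how fasteners opens it in the traceback above. Singularity typically bind-mounts the host's /tmp into the container, so a file created there by another user may be the culprit, although that is an assumption about this particular setup.

```python
import os
import stat


def describe_lock_file(path):
    """Report ownership, mode and append access for a lock file."""
    if not os.path.exists(path):
        return {"exists": False}
    info = os.stat(path)
    try:
        # Same open mode that fasteners uses for the lock file.
        with open(path, "a"):
            appendable = True
    except OSError:
        appendable = False
    return {
        "exists": True,
        "owner_uid": info.st_uid,
        "current_uid": os.getuid(),
        "mode": stat.filemode(info.st_mode),
        "appendable": appendable,
    }


if __name__ == "__main__":
    print(describe_lock_file("/tmp/mujocopy-buildlock"))
```

If owner_uid differs from current_uid and appendable is False, removing the stale file (or asking its owner to remove it) is the likely fix.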

I just tried the following definition file and did not encounter any issue.

BootStrap: docker
From: tensorflow/tensorflow:2.4.0-jupyter

%files
    # mjkey.txt is necessary for the initial build of mujoco_py.
    # Later, the copied key will be removed so that nobody
    # accidentally uploads the key to Singularity hub.
    mjkey.txt /opt/mjkey.txt

%environment
    export LD_LIBRARY_PATH=/opt/mujoco200/bin:${LD_LIBRARY_PATH}

    # For mujoco-py
    export MUJOCO_PY_MUJOCO_PATH=/opt/mujoco200
    export MUJOCO_PY_MJKEY_PATH=${HOME}/.mujoco/mjkey.txt

%post
    export WORKDIR=/opt

    # For mujoco-py
    export MUJOCO_PY_MUJOCO_PATH=$WORKDIR/mujoco200
    export MUJOCO_PY_MJKEY_PATH=$WORKDIR/mjkey.txt

    apt-get update -q && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y \
        git \
        libopenmpi-dev \
        net-tools \
        software-properties-common \
        unzip \
        wget \
        vim

    python3 -m pip install --upgrade pip

    # --- Clone a forked mujoco-py ---
    git clone https://github.com/tadashiK/mujoco-py.git $WORKDIR/mujoco-py

    # --- Install mujoco-py ---
    bash $WORKDIR/mujoco-py/singularity/installer.sh $WORKDIR
    python3 -m pip install -q gym

    # --- Cleaning up ---
    apt-get -y clean
    rm -rf /var/lib/apt/lists/*

@tadashiK Sorry, I forgot to mention that I am not using a TensorFlow docker image. I am building my own container from an NVIDIA CUDA image for Ubuntu 18.04. Apparently /tmp/mujocopy-buildlock is writable when you build the container from the TensorFlow docker image.

Update: I just learned that /tmp/mujocopy-buildlock, and basically any directory in /tmp, does not have write permissions in the container I've made. I changed the permissions of /tmp/mujocopy-buildlock and things seem to be working fine now, as import mujoco_py did not throw an error.

@tadashiK When I built mujoco-py to run on GPUs and ran import mujoco_py, I got the following errors, which are very similar to the original error you were getting before. However, fixing this one does not seem to be as simple as the lock-path change you proposed. Do you have an idea of how I should resolve this as well?

import mujoco_py
running build_ext
building 'mujoco_py.cymj' extension
creating /usr/local/lib/python3.7/dist-packages/mujoco_py-2.0.2.13-py3.7.egg/mujoco_py/generated/_pyxbld_2.0.2.13_37_linuxgpuextensionbuilder
Traceback (most recent call last):
  File "/usr/lib/python3.7/distutils/dir_util.py", line 70, in mkpath
    os.mkdir(head, mode)
OSError: [Errno 30] Read-only file system: '/usr/local/lib/python3.7/dist-packages/mujoco_py-2.0.2.13-py3.7.egg/mujoco_py/generated/_pyxbld_2.0.2.13_37_linuxgpuextensionbuilder'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/dist-packages/mujoco_py-2.0.2.13-py3.7.egg/mujoco_py/__init__.py", line 3, in <module>
    from mujoco_py.builder import cymj, ignore_mujoco_warnings, functions, MujocoException
  File "/usr/local/lib/python3.7/dist-packages/mujoco_py-2.0.2.13-py3.7.egg/mujoco_py/builder.py", line 510, in <module>
    cymj = load_cython_ext(mujoco_path)
  File "/usr/local/lib/python3.7/dist-packages/mujoco_py-2.0.2.13-py3.7.egg/mujoco_py/builder.py", line 105, in load_cython_ext
    cext_so_path = builder.build()
  File "/usr/local/lib/python3.7/dist-packages/mujoco_py-2.0.2.13-py3.7.egg/mujoco_py/builder.py", line 221, in build
    built_so_file_path = self._build_impl()
  File "/usr/local/lib/python3.7/dist-packages/mujoco_py-2.0.2.13-py3.7.egg/mujoco_py/builder.py", line 291, in _build_impl
    so_file_path = super()._build_impl()
  File "/usr/local/lib/python3.7/dist-packages/mujoco_py-2.0.2.13-py3.7.egg/mujoco_py/builder.py", line 244, in _build_impl
    dist.run_commands()
  File "/usr/lib/python3.7/distutils/dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "/usr/lib/python3.7/distutils/dist.py", line 985, in run_command
    cmd_obj.run()
  File "/usr/local/lib/python3.7/dist-packages/Cython/Distutils/old_build_ext.py", line 186, in run
    _build_ext.build_ext.run(self)
  File "/usr/lib/python3.7/distutils/command/build_ext.py", line 340, in run
    self.build_extensions()
  File "/usr/local/lib/python3.7/dist-packages/mujoco_py-2.0.2.13-py3.7.egg/mujoco_py/builder.py", line 144, in build_extensions
    build_ext.build_extensions(self)
  File "/usr/local/lib/python3.7/dist-packages/Cython/Distutils/old_build_ext.py", line 195, in build_extensions
    _build_ext.build_ext.build_extensions(self)
  File "/usr/lib/python3.7/distutils/command/build_ext.py", line 449, in build_extensions
    self._build_extensions_serial()
  File "/usr/lib/python3.7/distutils/command/build_ext.py", line 474, in _build_extensions_serial
    self.build_extension(ext)
  File "/usr/lib/python3.7/distutils/command/build_ext.py", line 534, in build_extension
    depends=ext.depends)
  File "/usr/lib/python3.7/distutils/ccompiler.py", line 566, in compile
    depends, extra_postargs)
  File "/usr/lib/python3.7/distutils/ccompiler.py", line 348, in _setup_compile
    self.mkpath(os.path.dirname(obj))
  File "/usr/lib/python3.7/distutils/ccompiler.py", line 916, in mkpath
    mkpath(name, mode, dry_run=self.dry_run)
  File "/usr/lib/python3.7/distutils/dir_util.py", line 74, in mkpath
    "could not create '%s': %s" % (head, exc.args[-1]))
distutils.errors.DistutilsFileError: could not create '/usr/local/lib/python3.7/dist-packages/mujoco_py-2.0.2.13-py3.7.egg/mujoco_py/generated/_pyxbld_2.0.2.13_37_linuxgpuextensionbuilder': Read-only file system

Would you give me more information? For example, would you share the Singularity definition file you used? Also, would you tell me exactly what you did? To understand what is going on, I need to reproduce your error on my local machine. Thank you in advance!

@tadashiK I basically took your Singularity definition file and added a subset of the solutions posted in #408 to enable headless GPU rendering with MuJoCo. Those changes include adding the environment variable export LD_PRELOAD=/opt/mujoco200/bin/libglewegl.so:${LD_PRELOAD}, running mkdir -p /usr/lib/nvidia-000, and adding export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib/nvidia-000 and export LD_PRELOAD=/usr/lib/nvidia-xxx/libEGL.so.1 (the path of libEGL.so.1 might be different for you).

Ah, I see. So, you are trying to render the env. Then, I am sorry that I cannot help you... I usually run experiments on a cluster using singularity, and check results locally. Therefore, I am not sure how to enable rendering on singularity.

Anyway, would you mind sharing the Singularity definition file you use? I would also like to try some tweaks to enable rendering.

@tadashiK As I mentioned I used your Singularity image with the changes above but I actually did not include export LD_PRELOAD=/usr/lib/nvidia-xxx/libEGL.so.1 when I built the container and got those errors I posted above.

What do you mean by "you are trying to render the env?" The errors I am getting have nothing to do with rendering as they are thrown when I do import mujoco_py. If you read the error messages you can see that those error messages are pretty much exactly like what you were getting before. However, I could not figure out how the path is being constructed and passed to os.mkdir() so that I can change that path and store the newly-created files somewhere on /tmp instead of /usr/local/lib/python3.7/dist-packages/mujoco_py-2.0.2.13-py3.7.egg/mujoco_py/generated/, which is read-only.
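The directory name in the traceback, _pyxbld_2.0.2.13_37_linuxgpuextensionbuilder, suggests that the import is attempting a fresh build of a GPU variant of the extension, which would happen if no matching prebuilt file exists in the read-only generated directory. One way to check this assumption is to list the compiled extension files that are actually present. The sketch below is not mujoco-py code: list_built_extensions is a hypothetical helper, and it assumes the built extensions are .so files stored directly in the generated directory.

```python
import glob
import os


def list_built_extensions(generated_dir):
    """Return the names of compiled .so files found in generated_dir."""
    pattern = os.path.join(generated_dir, "*.so")
    return sorted(os.path.basename(p) for p in glob.glob(pattern))


if __name__ == "__main__":
    # Path taken from the traceback; adjust it for your install.
    generated = ("/usr/local/lib/python3.7/dist-packages/"
                 "mujoco_py-2.0.2.13-py3.7.egg/mujoco_py/generated")
    for name in list_built_extensions(generated):
        print(name)
```

If no listed file mentions the GPU builder, the extension built at image build time does not match what the import looks for at run time, and the build would have to be repeated during image creation under the same settings used at run time.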
