Jax: Compiling JAX from inside a Dockerfile fails

Created on 6 Sep 2019 · 4 Comments · Source: google/jax

I was able to compile JAX interactively in a terminal. Now I am trying to install it from within a Dockerfile using the following commands.

RUN git clone https://github.com/google/jax.git /jax
WORKDIR "/jax"
RUN python build/build.py --enable_cuda
RUN pip install -e build
RUN pip install -e .

However, I run into the following error. It appears to be caused by some interaction between Docker and Python's subprocess module.

## STDOUT

Target //build:install_xla_in_source_tree up-to-date:
bazel-bin/build/install_xla_in_source_tree
INFO: Elapsed time: 518.641s, Critical Path: 151.85s
INFO: 5040 processes: 5040 local.
INFO: Build completed successfully, 6979 total actions
INFO: Running command line: bazel-bin/build/install_xla_in_source_tree /jax/build
INFO: Build completed successfully, 6979 total actions
WARNING: Waiting for server process to terminate (waited 5 seconds, waiting at most 60)
WARNING: Waiting for server process to terminate (waited 10 seconds, waiting at most 60)
WARNING: Waiting for server process to terminate (waited 30 seconds, waiting at most 60)
INFO: Waited 60 seconds for server process (pid=24) to terminate.
WARNING: Waiting for server process to terminate (waited 5 seconds, waiting at most 10)
WARNING: Waiting for server process to terminate (waited 10 seconds, waiting at most 10)
INFO: Waited 10 seconds for server process (pid=24) to terminate.
FATAL: Attempted to kill stale server process (pid=24) using SIGKILL, but it did not die in a timely fashion.
bazel-0.24.1-linux-x86_64 [########################################] 100%
Bazel binary path: ./bazel-0.24.1-linux-x86_64
Python binary path: /opt/conda/bin/python
MKL-DNN enabled: yes
-march=native: no
CUDA enabled: yes

Building XLA and installing it in the jaxlib source tree...
Traceback (most recent call last):
  File "build/build.py", line 342, in <module>
    main()
  File "build/build.py", line 338, in main
    shell([bazel_path, "shutdown"])
  File "build/build.py", line 50, in shell
    output = subprocess.check_output(cmd)
  File "/opt/conda/lib/python3.6/subprocess.py", line 356, in check_output
    **kwargs).stdout
  File "/opt/conda/lib/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['./bazel-0.24.1-linux-x86_64', 'shutdown']' returned non-zero exit status 36.
The command '/bin/sh -c python build/build.py --enable_cuda' returned a non-zero code: 1
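
The traceback shows that the failure comes from `subprocess.check_output`, which raises `CalledProcessError` whenever the child exits with a non-zero status. The log says the build itself had already completed by that point, and only the final `bazel shutdown` failed. The following is a minimal sketch of that mechanism, not the code in `build/build.py`: `shell_tolerant` and its `ok_codes` parameter are hypothetical names, a small Python child process stands in for `bazel shutdown`, and whether exit status 36 is safe to ignore is an assumption that would need checking.

```python
import subprocess
import sys


def shell_tolerant(cmd, ok_codes=(0,)):
    """Run cmd and return (exit_code, output).

    Hypothetical helper: re-raises CalledProcessError unless the exit
    status is listed in ok_codes.
    """
    try:
        output = subprocess.check_output(cmd)
        return 0, output.decode().strip()
    except subprocess.CalledProcessError as err:
        if err.returncode in ok_codes:
            return err.returncode, (err.output or b"").decode().strip()
        raise


# Stand-in for `bazel shutdown`: a child process that exits with status 36.
fake_shutdown = [sys.executable, "-c", "import sys; sys.exit(36)"]

# With the default ok_codes this raises, as in the traceback above.
# Listing 36 as tolerated lets the caller continue instead.
code, _ = shell_tolerant(fake_shutdown, ok_codes=(0, 36))
print(code)
```

This only hides the symptom. The Bazel server process inside the container is still not shut down cleanly, so it does not replace fixing the underlying cause.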

All 4 comments

I wonder if it would be helpful to look at the Dockerfile we use for building all the wheels on Cloud machines. At the very least, that one works!

Were you able to get your Dockerfile to work?

Since our Docker builds are currently working, I think it's possible that this bug you're seeing is not in JAX's scope. What do you think?

I used a workaround for now, i.e. compiling interactively and then doing a docker commit. I did not try modifying the Dockerfile itself. I looked at your build scripts, and maybe the trick is to use "ENTRYPOINT" and point it to a shell script where the compilation actually happens. For now, please go ahead and close the bug, as I don't believe it's in JAX's scope.

Hi, I ran into the same issue. The cause is that Docker isolates the container's process IDs from the host by default, so Bazel is unable to terminate its server process, as can be seen from the line

FATAL: Attempted to kill stale server process (pid=24) using SIGKILL ...

Therefore, to build JAX inside Docker, we need to share the host's PID namespace with the option --pid=host.
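
`--pid=host` is an option of `docker run`. The sketch below therefore assumes the compilation is moved out of the Dockerfile's `RUN` step and into a container started with `docker run`, which could then be saved with `docker commit` as in the workaround described earlier in the thread. The image name `my-jax-base`, the container name `jax-build`, and the helper function are hypothetical. The script only assembles and prints the command line; it does not invoke Docker.

```python
import shlex


def docker_run_build_command(image, share_host_pid=True):
    """Assemble a `docker run` command that builds JAX inside `image`.

    Hypothetical helper: `image` is assumed to already contain the
    cloned repository at /jax, as in the Dockerfile from the question.
    """
    cmd = ["docker", "run", "--name", "jax-build"]
    if share_host_pid:
        # Share the host's PID namespace with the container.
        cmd.append("--pid=host")
    cmd += [image, "sh", "-c", "cd /jax && python build/build.py --enable_cuda"]
    return cmd


cmd = docker_run_build_command("my-jax-base")
# Print a copy-pasteable shell command line.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Note that sharing the host's PID namespace reduces the isolation between the container and the host, so it is probably best limited to the build step.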
