Bazel: Bazel cannot build itself in arm64 docker container

Created on 16 Jan 2019 · 27 Comments · Source: bazelbuild/bazel

Description of the problem / feature request:

The bootstrap Bazel builds successfully, but compile.sh then proceeds to build Bazel with Bazel, and that step fails. This is likely a bug in qemu, but I don't know how to get enough detail out of Bazel to investigate it any further.

Feature requests: what underlying problem are you trying to solve with this feature?

ERROR: /root/bazel-0.21.0/src/BUILD:109:1: PythonZipper src/create_embedded_tools.zip failed (Exit 255): zipper failed: error executing command
  (cd /tmp/bazel_AM1PL7Q9/out/execroot/io_bazel && \
  exec env - \
    PATH=/bin:/usr/bin \
  bazel-out/host/bin/external/bazel_tools/third_party/ijar/zipper cC bazel-out/host/bin/src/create_embedded_tools.zip @bazel-out/host/bin/src/create_embedded_tools.zip-0.params)
Execution platform: @bazel_tools//platforms:host_platform
Target //src:bazel_nojdk failed to build
INFO: Elapsed time: 6149.498s, Critical Path: 1611.76s
INFO: 1641 processes: 1404 local, 237 worker.
FAILED: Build did NOT complete successfully

ERROR: Could not build Bazel

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

Dockerfile contents

FROM multiarch/ubuntu-core:arm64-bionic as base

RUN apt-get update -y
RUN DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
  python3 python3-dev python3-pip 

RUN pip3 install --upgrade pip

FROM base as bazel

RUN DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
  build-essential openjdk-8-jdk zip unzip wget python

WORKDIR /root

RUN wget https://github.com/bazelbuild/bazel/releases/download/0.21.0/bazel-0.21.0-dist.zip
RUN mkdir bazel-0.21.0
RUN unzip bazel-0.21.0-dist.zip -d bazel-0.21.0

WORKDIR /root/bazel-0.21.0

From a terminal in the directory containing the Dockerfile:

$ docker run --rm --privileged multiarch/qemu-user-static:register --reset
$ docker build -t multiarch-bld .
$ docker run -it multiarch-bld
# ./compile.sh

What operating system are you running Bazel on?

$ uname -a
Darwin Seans-MacBook-Pro.local 18.2.0 Darwin Kernel Version 18.2.0: Mon Nov 12 20:24:46 PST 2018; root:xnu-4903.231.4~2/RELEASE_X86_64 x86_64
$ docker --version
Docker version 18.09.1, build 4c52b90

What's the output of bazel info release?

None, it didn't build.

Any other information, logs, or outputs that you want to share?

When the above is run on native arm64 hardware (an NVIDIA Jetson TX2 in my case), you can skip running the qemu-user-static container, and it builds successfully.

P2 area-EngProd team-XProduct feature request

All 27 comments

Same on qemu 3.0.0

Any update on this? I was building from source in an x86 QEMU 3.0.0 environment. The target system is armv7l. The same problem happens for me with Bazel releases from 0.5.4 onward (0.5.0-0.5.3 work fine). It is exactly the same error related to PythonZipper; I don't know what that is.

Chiming in: same issue on Raspbian Stretch, running in qemu.

ERROR: /build/bazel/src/BUILD:109:1: PythonZipper src/create_embedded_tools.zip failed (Exit 255): zipper failed: error executing command 
  (cd /tmp/bazel_afRBZBxm/out/execroot/io_bazel && \
  exec env - \
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
  bazel-out/host/bin/external/bazel_tools/third_party/ijar/zipper cC bazel-out/host/bin/src/create_embedded_tools.zip @bazel-out/host/bin/src/create_embedded_tools.zip-0.params)
Target //src:bazel_nojdk failed to build

This seems to be a qemu issue. I bootstrapped on an Amazon Graviton and it builds fine.

@yunqu that's great news, would you mind sharing? :)

@AleksandarFilipov Check this: https://gist.github.com/yunqu/e7fe12f44953deb1e0cf7b2c68362357

My target platforms are ultra96 and zcu104, both of which are aarch64 with Ubuntu 18.04. This flow produces a bazel binary working nicely with both of my platforms.

@yunqu, thanks for sharing. However, I'm looking for an ARMv7 version. I'm not familiar with ultra96 or zcu104. My target is Raspbian.

@AleksandarFilipov There is no 32-bit platform on AWS, so there is almost no hope there. You can compile Bazel 0.5.3 and earlier using QEMU, but not the latest versions.

Hi all,

I think I got a little further by installing zipper with "pip3 install zipper".

ERROR: /build/bazel/src/BUILD:237:1: Executing genrule //src:embedded_tools_nojdk failed (Exit 127): bash failed: error executing command
  (cd /tmp/bazel_277rLldt/out/execroot/io_bazel && \
  exec env - \
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
  /bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; bazel-out/host/bin/src/create_embedded_tools "bazel-out/arm-opt/genfiles/src/embedded_tools_nojdk.zip" bazel-out/arm-opt/bin/src/embedded_tools_nojdk.params')
/usr/bin/env: 'python': No such file or directory
Target //src:bazel_nojdk failed to build

...but if I try running "/usr/bin/env python" it seems to be fine. What does Bazel expect in this case? Is it Python 3?

I got a little further by adding the line "ENV PYTHON_BIN_PATH=/usr/bin/python" to the Dockerfile:

ERROR: /build/bazel/src/BUILD:237:1: Executing genrule //src:embedded_tools_nojdk failed (Exit 1): bash failed: error executing command
  (cd /tmp/bazel_y2M0aR14/out/execroot/io_bazel && \
  exec env - \
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
  /bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; bazel-out/host/bin/src/create_embedded_tools "bazel-out/arm-opt/genfiles/src/embedded_tools_nojdk.zip" bazel-out/arm-opt/bin/src/embedded_tools_nojdk.params')
Traceback (most recent call last):
  File "bazel-out/host/bin/src/create_embedded_tools", line 213, in <module>
    Main()
  File "bazel-out/host/bin/src/create_embedded_tools", line 175, in Main
    'Cannot exec() %r: file not found.' % main_filename
AssertionError: Cannot exec() '/tmp/bazel_y2M0aR14/out/execroot/io_bazel/bazel-out/host/bin/src/create_embedded_tools.runfiles/io_bazel/src/create_embedded_tools.py': file not found.
Target //src:bazel_nojdk failed to build

Thanks for the comments here.

@ArielleA and I will try to get our ARM port effort rebooted soon and will see what we can do here!

Thanks @philwo @ArielleA - if you need any help/encouragement/resources for this task, I'm happy to do what I can!

QEMU 4.0.0 has been released since this was last tried, @yunqu.

I have tried QEMU 4.0.0; it is still the same error.

I just built bazel twice in docker containers using env EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk" bash ./compile.sh rather than just ./compile.sh

Can you try using env EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk" bash ./compile.sh please?

@ArielleA et al,
Please don't forget about arm32.

@ArielleA I updated the Dockerfile above to use 0.26.0 and used the command you mentioned and got the following:

root@5003c351198e:~/bazel-0.26.0# env EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk" bash ./compile.sh
🍃  Building Bazel from scratch......
🍃  Building Bazel with Bazel.
.WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
DEBUG: /tmp/bazel_wCbbyy8U/out/external/build_bazel_rules_nodejs/internal/common/check_bazel_version.bzl:49:5:
Current Bazel is not a release version, cannot check for compatibility.
DEBUG: /tmp/bazel_wCbbyy8U/out/external/build_bazel_rules_nodejs/internal/common/check_bazel_version.bzl:51:5: Make sure that you are running at least Bazel 0.17.1.
INFO: Analyzed target //src:bazel_nojdk (211 packages loaded, 10406 targets configured).
INFO: Found 1 target...
ERROR: /root/bazel-0.26.0/src/BUILD:113:1: PythonZipper src/create_embedded_tools.zip failed (Exit 255): zipper failed: error executing command
  (cd /tmp/bazel_wCbbyy8U/out/execroot/io_bazel && \
  exec env - \
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
  bazel-out/host/bin/external/bazel_tools/third_party/ijar/zipper cC bazel-out/host/bin/src/create_embedded_tools.zip @bazel-out/host/bin/src/create_embedded_tools.zip-0.params)
Execution platform: //:default_host_platform
Target //src:bazel_nojdk failed to build
INFO: Elapsed time: 1466.605s, Critical Path: 118.23s
INFO: 339 processes: 339 local.
FAILED: Build did NOT complete successfully

ERROR: Could not build Bazel

I didn't install zipper as @enmasse had suggested. I will try that next and see if it makes a difference.

No luck, appears to be the same error even with zipper installed.

root@a7ee8833b96c:~/bazel-0.26.0# pip3 install setuptools
Collecting setuptools
  Downloading https://files.pythonhosted.org/packages/ec/51/f45cea425fd5cb0b0380f5b0f048ebc1da5b417e48d304838c02d6288a1e/setuptools-41.0.1-py2.py3-none-any.whl (575kB)
     |################################| 583kB 57kB/s
Installing collected packages: setuptools
Successfully installed setuptools-41.0.1
root@a7ee8833b96c:~/bazel-0.26.0# pip3 install zipper
Collecting zipper
  Downloading https://files.pythonhosted.org/packages/61/2b/33247f55ec79f2805309d164f9fceb61c49dace3cdb6c528528a59ba9e3e/zipper-0.0.3.tar.gz
Installing collected packages: zipper
  Running setup.py install for zipper ... done
Successfully installed zipper-0.0.3
root@a7ee8833b96c:~/bazel-0.26.0# env EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk" bash ./compile.sh
🍃  Building Bazel from scratch......
🍃  Building Bazel with Bazel.
.WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
DEBUG: /tmp/bazel_EFqfvH2K/out/external/build_bazel_rules_nodejs/internal/common/check_bazel_version.bzl:49:5:
Current Bazel is not a release version, cannot check for compatibility.
DEBUG: /tmp/bazel_EFqfvH2K/out/external/build_bazel_rules_nodejs/internal/common/check_bazel_version.bzl:51:5: Make sure that you are running at least Bazel 0.17.1.
INFO: Analyzed target //src:bazel_nojdk (211 packages loaded, 10406 targets configured).
INFO: Found 1 target...
ERROR: /root/bazel-0.26.0/src/BUILD:113:1: PythonZipper src/create_embedded_tools.zip failed (Exit 255): zipper failed: error executing command
  (cd /tmp/bazel_EFqfvH2K/out/execroot/io_bazel && \
  exec env - \
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
  bazel-out/host/bin/external/bazel_tools/third_party/ijar/zipper cC bazel-out/host/bin/src/create_embedded_tools.zip @bazel-out/host/bin/src/create_embedded_tools.zip-0.params)
Execution platform: //:default_host_platform
Target //src:bazel_nojdk failed to build
INFO: Elapsed time: 300.092s, Critical Path: 16.63s
INFO: 37 processes: 37 local.
FAILED: Build did NOT complete successfully

ERROR: Could not build Bazel
root@a7ee8833b96c:~/bazel-0.26.0#

I succeeded in building Bazel 0.24.1 with Debian Buster aarch64 + QEMU 4.0.0 + openjdk-11-jdk.
https://github.com/PINTO0309/Bazel_bin.git

The procedure to build the compilation environment is described below.
https://qiita.com/PINTO/items/e117bb0389f2163e2ac8

I would be glad if I could help everyone.

I have been playing with this as well. The zipper in question is not from Python. It's a built-in third-party tool. I have been able to build zipper manually by doing the following:

source scripts/bootstrap/buildenv.sh
BAZEL_JAVAC_OPTS="-J-Xms384m -J-Xmx768m" source scripts/bootstrap/compile.sh
source scripts/bootstrap/bootstrap.sh
bazel_build third_party/ijar:zipper

compile.sh builds bazel_build, the bootstrapped Bazel that can build the release Bazel. This bazel_build can be used to build zipper, which lives at third_party/ijar:zipper.

It almost looks like the dependency is missing, so zipper is not built before it is actually needed. I don't know what this has to do with qemu. After building zipper I was able to run it. I'm not so great with Bazel syntax. Are we able to verify that it's actually building zipper before it tries to use it?

I think the PythonZipper issue only happens under QEMU or Docker because the zip step seems to be referring to /proc/self/exe or something like that. The same build works fine on AWS F1.

I have a binary at https://github.com/powderluv/bazel-bin/blob/master/bazel-aarch64-29 if anyone wants it (Ubuntu 18.04 aarch64 JDK8)
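Since the /proc/self/exe theory only matters under user-mode emulation, it can help to confirm whether a given build shell is emulated at all. The following is a minimal sketch, not something from Bazel or QEMU: is_qemu_user_path and describe_shell_exe are hypothetical helper names, and the check assumes that reading /proc/<pid>/exe for a process other than the caller is answered by the host kernel, so it names the emulator binary (for example qemu-aarch64-static) when one is in use.

```shell
#!/bin/sh
# Sketch: classify an executable path as a QEMU user-mode emulator or not.
# is_qemu_user_path is a hypothetical helper, not part of Bazel or QEMU.
is_qemu_user_path() {
  case "${1##*/}" in
    qemu-*) return 0 ;;   # e.g. qemu-aarch64-static, qemu-arm-static
    *)      return 1 ;;
  esac
}

# Report how the current shell appears to be executed.
# Assumption: for a PID other than the caller's own, /proc/<pid>/exe is
# resolved by the host kernel and so names the emulator when one is in use.
describe_shell_exe() {
  exe="$(readlink "/proc/$$/exe" 2>/dev/null)" || exe=""
  if [ -z "$exe" ]; then
    echo "unknown: /proc/$$/exe is not readable"
  elif is_qemu_user_path "$exe"; then
    echo "emulated: shell runs under $exe"
  else
    echo "native: shell is $exe"
  fi
}
```

Running describe_shell_exe inside the container before ./compile.sh gives a quick hint about which case applies; an "emulated" result would be consistent with the qemu-only failures reported in this thread, but it does not by itself prove the cause.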

Ugly fix, but you get the idea.

diff -rpu a/src/main/cpp/blaze_util_linux.cc b/src/main/cpp/blaze_util_linux.cc
--- a/src/main/cpp/blaze_util_linux.cc  1980-01-01 00:00:00.000000000 -0800
+++ b/src/main/cpp/blaze_util_linux.cc  2019-09-11 23:19:25.444000000 -0700
@@ -85,7 +85,8 @@ string GetSelfPath() {
   // The file to which this symlink points could change contents or go missing
   // concurrent with execution of the Bazel client, so we don't eagerly resolve
   // it.
-  return "/proc/self/exe";
+  static char path[PATH_MAX];
+  return realpath("/proc/self/exe", path);
 }

 uint64_t GetMillisecondsMonotonic() {

@3XX0 Did it work for you? Unfortunately it is still failing when using QEMU with docker buildx:
$ docker buildx build --platform=linux/arm64 -t tensorflow_tf2 . --load

Dockerfile:

FROM nvcr.io/nvidia/l4t-base:r32.2

ENV LD_LIBRARY_PATH=/usr/lib/aarch64-linux-gnu/tegra
ARG TF_VERSION=2.0
ARG BAZEL_VERSION=0.26.1

RUN apt-get -y update && \
    apt-get install -y --force-yes --no-install-recommends \
    curl git \
    wget file ca-certificates \
    vim cmake \
    zip \
    unzip \
    software-properties-common \
    gcc build-essential \
    python3.6 python3-dev python3-h5py build-essential libhdf5-serial-dev hdf5-tools python3-pip zlib1g-dev zip libjpeg8-dev libhdf5-dev \
    openjdk-8-jdk \
    && \
    apt-get clean

RUN wget https://github.com/bazelbuild/bazel/releases/download/${BAZEL_VERSION}/bazel-${BAZEL_VERSION}-dist.zip
RUN unzip bazel-${BAZEL_VERSION}-dist.zip -d bazel-${BAZEL_VERSION}-dist
WORKDIR /bazel-${BAZEL_VERSION}-dist
ADD ./*.patch ./
RUN git apply github_7135.patch
RUN chmod +x /bazel-${BAZEL_VERSION}-dist/compile.sh
ENV JAVA_VERSION=''
ENV EXTRA_BAZEL_ARGS="--host_javabase=@local_jdk//:jdk"
RUN bash /bazel-${BAZEL_VERSION}-dist/compile.sh

Hi, I just ran into this issue. I am building version 0.15.0 in particular. Has anyone managed to solve this? Or is this an issue with qemu?

@pigubaoza I think the two commits 4fb26b0 and 72e9822 mentioned above are able to resolve this issue.

Thanks, @yunqu. I applied the changes from those commits to the 0.15.0 code and followed the instructions you provided here: https://gist.github.com/yunqu/14c4e4a2f91ada05a2ed56043cbcc161

However, I am getting the below error.

ERROR: /opt/bazel/src/main/protobuf/BUILD:35:2: Building src/main/protobuf/libandroid_deploy_info_proto-speed.jar (1 source jar) failed: Worker process returned an unparseable WorkResponse!

Did you try to print something to stdout? Workers aren't allowed to do this, as it breaks the protocol between Bazel and the worker process.

---8<---8<--- Exception details ---8<---8<---
com.google.protobuf.InvalidProtocolBufferException$InvalidWireTypeException: Protocol message tag had invalid wire type.
    at com.google.protobuf.InvalidProtocolBufferException.invalidWireType(InvalidProtocolBufferException.java:115)
    at com.google.protobuf.CodedInputStream$StreamDecoder.skipField(CodedInputStream.java:2100)
    at com.google.protobuf.GeneratedMessageV3.parseUnknownFieldProto3(GeneratedMessageV3.java:303)
    at com.google.devtools.build.lib.worker.WorkerProtocol$WorkResponse.<init>(WorkerProtocol.java:1866)
    at com.google.devtools.build.lib.worker.WorkerProtocol$WorkResponse.<init>(WorkerProtocol.java:1830)
    at com.google.devtools.build.lib.worker.WorkerProtocol$WorkResponse$1.parsePartialFrom(WorkerProtocol.java:2420)
    at com.google.devtools.build.lib.worker.WorkerProtocol$WorkResponse$1.parsePartialFrom(WorkerProtocol.java:2415)
    at com.google.protobuf.AbstractParser.parsePartialFrom(AbstractParser.java:221)
    at com.google.protobuf.AbstractParser.parsePartialDelimitedFrom(AbstractParser.java:262)
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:275)
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:280)
    at com.google.protobuf.AbstractParser.parseDelimitedFrom(AbstractParser.java:49)
    at com.google.protobuf.GeneratedMessageV3.parseDelimitedWithIOException(GeneratedMessageV3.java:347)
    at com.google.devtools.build.lib.worker.WorkerProtocol$WorkResponse.parseDelimitedFrom(WorkerProtocol.java:2082)
    at com.google.devtools.build.lib.worker.WorkerSpawnRunner.execInWorker(WorkerSpawnRunner.java:313)
    at com.google.devtools.build.lib.worker.WorkerSpawnRunner.actuallyExec(WorkerSpawnRunner.java:154)
    at com.google.devtools.build.lib.worker.WorkerSpawnRunner.exec(WorkerSpawnRunner.java:112)
    at com.google.devtools.build.lib.exec.AbstractSpawnStrategy.exec(AbstractSpawnStrategy.java:95)
    at com.google.devtools.build.lib.exec.AbstractSpawnStrategy.exec(AbstractSpawnStrategy.java:63)
    at com.google.devtools.build.lib.exec.SpawnActionContextMaps$ProxySpawnActionContext.exec(SpawnActionContextMaps.java:362)
    at com.google.devtools.build.lib.analysis.actions.SpawnAction.internalExecute(SpawnAction.java:287)
    at com.google.devtools.build.lib.analysis.actions.SpawnAction.execute(SpawnAction.java:294)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.executeActionTask(SkyframeActionExecutor.java:960)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.prepareScheduleExecuteAndCompleteAction(SkyframeActionExecutor.java:891)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.access$900(SkyframeActionExecutor.java:115)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.call(SkyframeActionExecutor.java:746)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.call(SkyframeActionExecutor.java:700)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.executeAction(SkyframeActionExecutor.java:442)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.checkCacheAndExecuteIfNeeded(ActionExecutionFunction.java:503)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.compute(ActionExecutionFunction.java:224)
    at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:382)
    at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:355)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
---8<---8<--- End of exception details ---8<---8<---

---8<---8<--- Start of log ---8<---8<---
-Xbootclasspath/p is no longer a supported option.
ion.
---8<---8<--- End of log ---8<---8<---
Target //src:bazel failed to build
INFO: Elapsed time: 3344.130s, Critical Path: 1067.20s
INFO: 1053 processes: 1053 local.
FAILED: Build did NOT complete successfully

ERROR: Could not build Bazel
ERROR: executor failed running [/bin/sh -c ./compile.sh]: buildkit-runc did not terminate successfully
------
[base 12/13] RUN ./compile.sh:
------
failed to solve: rpc error: code = Unknown desc = executor failed running [/bin/sh -c ./compile.sh]: buildkit-runc did not terminate successfully

Do you know what might be the issue?

I have fixed the issue. My mistake was using openjdk-11-jdk rather than openjdk-8-jdk. Thanks for your help!
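Because the JDK mismatch only surfaced after nearly an hour of building, a pre-flight check before ./compile.sh may save time. This is a minimal sketch under stated assumptions: java_major_version and require_jdk8 are hypothetical helper names, and the only requirement encoded is the one reported above (the 0.15.0 build worked with openjdk-8-jdk and failed with openjdk-11-jdk). Other Bazel versions may need a different JDK.

```shell
#!/bin/sh
# Sketch: reduce a Java version string to its major version number.
# java_major_version is a hypothetical helper name.
#   "1.8.0_212" (legacy 1.x scheme) -> 8
#   "11.0.3"                        -> 11
java_major_version() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;          # drop the legacy "1." prefix
  esac
  printf '%s\n' "${v%%[._-]*}"   # keep everything before the first . _ or -
}

# Succeed only for JDK 8, the version that worked for the 0.15.0 build above.
require_jdk8() {
  [ "$(java_major_version "$1")" = "8" ]
}

# Example use before bootstrapping (assumes `java` is on PATH):
#   ver="$(java -version 2>&1 | sed -n 's/.* version "\([^"]*\)".*/\1/p' | head -n 1)"
#   require_jdk8 "$ver" || echo "Found JDK $ver; the 0.15.0 build in this thread needed JDK 8" >&2
```

Keeping the version parsing separate from the `java -version` call makes the check easy to adjust if a later Bazel release is built with a different JDK.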
