Bazel: Remote worker throwing exception on upload of docker_build cache

Created on 15 Mar 2017 · 18 Comments · Source: bazelbuild/bazel

Please provide the following information. The more we know about your system and use case, the more easily and likely we can help.

Description of the problem / feature request / question:

Remote worker throwing exception on upload of docker_build cache.

Genrule:

genrule(
    name = "java8",
    srcs = [],
    outs = ["java8.tar"],
    cmd = "docker pull openjdk:8; docker save openjdk:8 > $@",
    local = 1,
)

Docker build:

load("@bazel_tools//tools/build_defs/docker:docker.bzl", "docker_build")

docker_build(
    name = "image",
    base = ":java8",
    cmd = "echo 'hello world'",
)

Bazel build exception:

bazel build //third_party/docker:image
INFO: Reading 'startup' options from /home/preston/.bazelrc: --host_jvm_args=-Dbazel.DigestFunction=SHA1
INFO: Found 1 target...
Target //third_party/docker:image failed to build
Use --verbose_failures to see the command lines of failed build steps.
Unhandled exception thrown during build; message: Unrecoverable error while evaluating node 'ACTION_EXECUTION:action 'CreateImage third_party/docker/image-layer.tar' (CreateImage[[Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/local-fastbuild/bin]third_party/docker/image.config.sha256, Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/local-fastbuild/bin]third_party/docker/image.config, Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/local-fastbuild/bin]third_party/docker/image.layer.sha256, Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/local-fastbuild/bin]third_party/docker/image.layer, Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/local-fastbuild/bin]third_party/docker/image.metadata-name.sha256, Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/local-fastbuild/bin]third_party/docker/image.metadata, Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/local-fastbuild/genfiles]third_party/docker/java8.tar, Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/host/bin]external/bazel_tools/tools/build_defs/docker/create_image, Artifact:[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1[source]]external/bazel_tools/tools/build_defs/docker/create_image.py, Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/host/internal]_middlemen/external_Sbazel_Utools_Stools_Sbuild_Udefs_Sdocker_Screate_Uimage-runfiles] -> 
[Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/local-fastbuild/bin]third_party/docker/image-layer.tar]])' (requested by nodes 'ARTIFACT:third_party/docker/image-layer.tar //third_party/docker:image a1aa5b17a64171df0d4f40bcacc0492a (527716221 628672696)')
INFO: Elapsed time: 20.140s, Critical Path: 17.98s
java.lang.RuntimeException: Unrecoverable error while evaluating node 'ACTION_EXECUTION:action 'CreateImage third_party/docker/image-layer.tar' (CreateImage[[Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/local-fastbuild/bin]third_party/docker/image.config.sha256, Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/local-fastbuild/bin]third_party/docker/image.config, Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/local-fastbuild/bin]third_party/docker/image.layer.sha256, Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/local-fastbuild/bin]third_party/docker/image.layer, Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/local-fastbuild/bin]third_party/docker/image.metadata-name.sha256, Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/local-fastbuild/bin]third_party/docker/image.metadata, Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/local-fastbuild/genfiles]third_party/docker/java8.tar, Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/host/bin]external/bazel_tools/tools/build_defs/docker/create_image, Artifact:[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1[source]]external/bazel_tools/tools/build_defs/docker/create_image.py, Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/host/internal]_middlemen/external_Sbazel_Utools_Stools_Sbuild_Udefs_Sdocker_Screate_Uimage-runfiles] -> 
[Artifact:[[/home/preston/.cache/bazel/_bazel_preston/7fcd27cc7c3bb4755396ba74e5927be1/execroot/workspace]bazel-out/local-fastbuild/bin]third_party/docker/image-layer.tar]])' (requested by nodes 'ARTIFACT:third_party/docker/image-layer.tar //third_party/docker:image a1aa5b17a64171df0d4f40bcacc0492a (527716221 628672696)')
    at com.google.devtools.build.skyframe.ParallelEvaluator$Evaluate.run(ParallelEvaluator.java:438)
    at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:501)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: gRPC terminated prematurely: java.lang.RuntimeException: java.lang.IllegalArgumentException
    at com.google.devtools.build.lib.remote.GrpcActionCache.uploadChunks(GrpcActionCache.java:532)
    at com.google.devtools.build.lib.remote.GrpcActionCache.uploadAllResults(GrpcActionCache.java:413)
    at com.google.devtools.build.lib.remote.RemoteSpawnStrategy.execLocally(RemoteSpawnStrategy.java:121)
    at com.google.devtools.build.lib.remote.RemoteSpawnStrategy.exec(RemoteSpawnStrategy.java:228)
    at com.google.devtools.build.lib.analysis.actions.SpawnAction.internalExecute(SpawnAction.java:265)
    at com.google.devtools.build.lib.analysis.actions.SpawnAction.execute(SpawnAction.java:273)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.executeActionTask(SkyframeActionExecutor.java:783)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.prepareScheduleExecuteAndCompleteAction(SkyframeActionExecutor.java:723)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.access$800(SkyframeActionExecutor.java:102)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.call(SkyframeActionExecutor.java:613)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.call(SkyframeActionExecutor.java:575)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.executeAction(SkyframeActionExecutor.java:380)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.checkCacheAndExecuteIfNeeded(ActionExecutionFunction.java:444)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.compute(ActionExecutionFunction.java:197)
    at com.google.devtools.build.skyframe.ParallelEvaluator$Evaluate.run(ParallelEvaluator.java:373)
    ... 4 more

Build worker logged exception:

Mar 14, 2017 9:10:50 PM com.google.devtools.build.remote.RemoteWorker$CasServer$1 onNext
WARNING: Request failed: java.lang.IllegalArgumentException
Mar 14, 2017 9:10:50 PM com.google.devtools.build.remote.RemoteWorker$CasServer$1 onNext
WARNING: Request failed: java.lang.IllegalArgumentException: Missing input chunk for digest <digest: ccb4e763bd55ce04d008fbefb09d9be4432fbbfd, size: 699234304 bytes>
Mar 14, 2017 9:10:50 PM com.google.devtools.build.remote.RemoteWorker$CasServer$1 onNext
WARNING: Request failed: java.lang.IllegalArgumentException: Missing input chunk for digest <digest: ccb4e763bd55ce04d008fbefb09d9be4432fbbfd, size: 699234304 bytes>
Mar 14, 2017 9:10:50 PM com.google.devtools.build.remote.RemoteWorker$CasServer$1 onError
WARNING: Request errored remotely: io.grpc.StatusException: CANCELLED

If possible, provide a minimal example to reproduce the problem:

Remote worker running as such: bazel-bin/src/tools/remote_worker/remote_worker --work_path /tmp --listen_port=8080

Building that docker_build above with build --spawn_strategy=remote --remote_cache=172.17.0.3:8080 in ~/.bazelrc

Environment info

  • Operating System: Ubuntu 16.04

  • Bazel version (output of bazel info release):
    release 0.4.4

  • If bazel info release returns "development version" or "(@non-git)", please tell us what source tree you compiled Bazel from; git commit hash is appreciated (git rev-parse HEAD):
    The worker is running from commit 2046bb480075a8f412cb51882e64e31324fc57de

Have you found anything relevant by searching the web? (e.g. GitHub issues, email threads in the [email protected] archive)

Nope.

Anything else, information or logs or outputs that would be helpful?

(If they are large, please upload as attachment or provide link).

Labels: bug, under investigation

All 18 comments

@philwo Can you help with this?

Pinging @ola-rozenfeld and @hhclam, our experts on remote execution.

I'll investigate, will update.

It seems you are running into a limitation of the cache. The blob that is being set for upload is too large:
https://github.com/bazelbuild/bazel/blame/bb5901ba0474eb2ddd035502663026bcb0c05b7c/src/main/java/com/google/devtools/build/lib/remote/ConcurrentMapActionCache.java#L146

I'm looking into what can be done here. At the very least I'll include a more informative error message in the IllegalArgumentException in the worker log.

Good catch! The Hazelcast / REST cache implementation does not support objects larger than 500MB, for memory consumption reasons; I wanted to keep the implementation simple. The gRPC caching API does support chunking and will solve this problem.
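
To make the size limit concrete, here is a minimal Python sketch of a guard that rejects oversized blobs with a descriptive message instead of a bare IllegalArgumentException. It is not Bazel's code: the function name, the constant, and the reading of "500MB" as 500 * 1024 * 1024 bytes are all assumptions for illustration. The digest and size are the ones from the worker log above.

```python
# Hypothetical sketch, not the actual Bazel implementation.
# Assumption: "500MB" is read as 500 * 1024 * 1024 bytes.
MAX_BLOB_SIZE_BYTES = 500 * 1024 * 1024


def check_blob_size(digest: str, size_bytes: int,
                    limit: int = MAX_BLOB_SIZE_BYTES) -> None:
    """Raise ValueError with a descriptive message if the blob is too large."""
    if size_bytes > limit:
        raise ValueError(
            f"Blob {digest} is {size_bytes} bytes, which exceeds the "
            f"cache limit of {limit} bytes"
        )


# Digest and size copied from the remote worker log in this issue.
try:
    check_blob_size("ccb4e763bd55ce04d008fbefb09d9be4432fbbfd", 699234304)
except ValueError as err:
    print(err)
```

The 699234304-byte layer tarball is roughly 667 MiB, so it is over the limit whether "500MB" is read as decimal or binary megabytes.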

Has there been any progress on this? I'm running into the same problem - remote caching with docker appears to be a no-go...

No, sorry, we haven't invested in improving the Hazelcast implementation,
because that was supposed to be only a prototype/proof of concept (there is
even a TODO somewhere to remove it...). We are working on the gRPC
implementation.

On Fri, Apr 21, 2017 at 9:14 AM, Ryan Michael notifications@github.com
wrote:

Has there been any progress on this? I'm running into the same problem -
remote caching with docker appears to be a no-go...


I happened to notice this thread, and wanted to chime in (if only to hijack a bit).

You should consider using the new bazelbuild/rules_docker for pulling your image down (it supports pushing too).

Are you just looking for a java8 base to overlay your deploy jar on?

There is no plan to add chunking to support objects larger than 500MB for the REST / Hazelcast backend. Like Ola said, we will remove the Hazelcast backend in the future.

I'm actually building a base image for golang binaries with the go toolchain. I've been planning to change from dumping a docker tarball to using docker_pull now that that's available. Is there a remote caching implementation that doesn't rely on Hazelcast?

With ToT you can use REST for remote caching.

https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/remote/README.md

The same limitation of 500MB applies though.

FWIW, if you are just looking for a java8 base, there are potential alternatives to the official image.

I prototyped this a little while back.

This approach may be viable for you as the images are (often dramatically) smaller (at least under 500MB). If it is, feel free to reach out to me directly (GitHub handle at google.com).

Just to follow up, you can try: github.com/googlecloudplatform/distroless

@ola-rozenfeld Can the gRPC implementation handle large files now?

I could be convinced to add chunking to the REST implementation.

Yes, there is no limit on the size from the gRPC side. We chunk everything
into 16K chunks. The 500MB limit is hardcoded as a memory limit in
SimpleBlobStoreActionCache.

On Fri, Jun 16, 2017 at 9:21 AM, Ulf Adams notifications@github.com wrote:

I could be convinced to add chunking to the REST implementation.


Yeah I think chunking makes sense. The hardcoded limit is in place to avoid OOM of the JVM, which will be solved ultimately with chunking.
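
For readers wondering why chunking sidesteps the OOM concern, the idea can be sketched in a few lines of Python. This is only an illustration of the technique, not the gRPC protocol or Bazel's implementation: `iter_chunks` and `digest_stream` are hypothetical helpers, the 16 KiB chunk size follows the "16K chunks" figure mentioned in the thread, and SHA-1 is used because the reporter's .bazelrc sets the digest function to SHA1.

```python
import hashlib
import io
from typing import BinaryIO, Iterator, Tuple

CHUNK_SIZE = 16 * 1024  # "16K chunks", per the thread


def iter_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield the stream piece by piece so only one chunk is in memory at a time."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def digest_stream(stream: BinaryIO) -> Tuple[str, int]:
    """Compute the SHA-1 hex digest and total size without loading the whole blob."""
    hasher = hashlib.sha1()
    total = 0
    for chunk in iter_chunks(stream):
        hasher.update(chunk)
        total += len(chunk)
    return hasher.hexdigest(), total


# A 40000-byte in-memory blob stands in for a large file opened in binary mode.
blob = io.BytesIO(b"x" * 40000)
print([len(c) for c in iter_chunks(blob)])  # [16384, 16384, 7232]
```

Because each chunk is processed and dropped before the next is read, peak memory depends on the chunk size rather than on the blob size, which is what makes a hardcoded blob-size cap unnecessary.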

Closing this since the gRPC-based protocol can handle large files now. I filed #3250 for the test cache.
