Describe the bug
The build fails because Bazel rejects the --incompatible_no_support_tools_in_action_inputs command line flag. This flag existed in Bazel 0.27 but no longer exists in Bazel 2.0, which is what the nixpkgs tensorflow package currently uses. The tensorflow derivation sets it here:
bazelFlags = [
# temporary fixes to make the build work with bazel 0.27
"--incompatible_no_support_tools_in_action_inputs=false"
"--incompatible_use_native_patch=false"
];
To Reproduce
Steps to reproduce the behavior:
nix-shell -p python37 -p "pkgs.lib.callPackageWith pkgs python37Packages.tensorflowWithCuda { cudaSupport = true; }" -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/60d72d12dea298599f497865fa83b5188ff7e467.tar.gz
Expected behavior
Tensorflow build works.
Screenshots
Configuration finished
building
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
ERROR: Unrecognized option: --incompatible_use_native_patch=false
builder for '/nix/store/0ab1jbzhm43zi2rkwkbipr949qr7vm9d-tensorflow-gpu-1.15.0-deps.drv' failed with exit code 2
cannot build derivation '/nix/store/nwirkpa9xz6xpnv5ac7d4z2kmmkpfz33-tensorflow-gpu-1.15.0.drv': 1 dep
Maintainer information:
# a list of nixpkgs attributes affected by the problem
attribute:
- python37Packages.tensorflowWithCuda
# a list of nixos modules affected by the problem
module:
@GeorgeFlerovsky-Canada could you try filing a PR fixing this issue?
Sorry, I've got a brutal week at work. Perhaps I'll try next week
When I attempted to simply remove the flag --incompatible_use_native_patch=false I received the following error in the middle of the build:
ERROR: Evaluation of query "deps((//tensorflow/tools/pip_package:build_pip_package union //tensorflow/tools/lib_package:libtensorflow))" failed: errors were encountered while computing transitive closure
Well, then I'm out of ideas :)
A somewhat orthogonal solution is to keep a separate version of Bazel around just for TensorFlow; see the discussion in https://github.com/NixOS/nixpkgs/pull/76851#issuecomment-580036845
Officially the supported bazel versions for 1.15.2 are
_TF_MIN_BAZEL_VERSION = '0.24.1'
_TF_MAX_BAZEL_VERSION = '0.26.1'
That said, Bazel 1.2.1 seems to work. On Bazel 2 I'm getting failures pulling the MKL dependencies, which is strange given that I'm not enabling MKL during the build:
ERROR: /build/output/external/mkl_dnn/BUILD.bazel:53:1: no such package '@mkl_dnn//third_party/mkl_dnn': BUILD file not found in directory 'third_party/mkl_dnn' of external repository @mkl_dnn. Add a BUILD file to a directory to mark it as a package. and referenced by '@mkl_dnn//:mkl_dnn'
Loading: 312 packages loaded
Maybe related? https://github.com/tensorflow/tensorflow/issues/36717
I have a branch (currently not working) where I've added back bazel 1 https://github.com/mjlbach/nixpkgs/tree/bazel_1
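To make the version mismatch concrete, here is a minimal sketch of a range check against the bounds quoted above. The two constants are copied from the thread; `parse_version` and `is_supported` are hypothetical helpers written for illustration and are not TensorFlow's own configure logic.

```python
# Bounds quoted in the thread for TensorFlow 1.15.2.
_TF_MIN_BAZEL_VERSION = '0.24.1'
_TF_MAX_BAZEL_VERSION = '0.26.1'


def parse_version(version):
    """Turn '1.2.1' or '2.0.0- (@non-git)' into a tuple of ints, e.g. (2, 0, 0)."""
    # Keep only the leading dotted-number part; the Nix-built Bazel in the
    # logs reports its version with a '- (@non-git)' suffix.
    head = version.strip().split()[0].rstrip('-')
    return tuple(int(part) for part in head.split('.'))


def is_supported(version,
                 minimum=_TF_MIN_BAZEL_VERSION,
                 maximum=_TF_MAX_BAZEL_VERSION):
    """Return True if `version` lies within [minimum, maximum], inclusive."""
    return parse_version(minimum) <= parse_version(version) <= parse_version(maximum)
```

By this check both Bazel 1.2.1 and the 2.0.0 build that nixpkgs ships fall outside the officially supported range, even though 1.2.1 appears to work in practice.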
I am running into a similar issue. I am using nixpkgs revision d097edb4bbee4d3efc96481f473f38819e85f421, and the build error is:
configure: interpreter directive changed from "/usr/bin/env bash" to "/nix/store/x7hj5hp24flxbx6jnnws5rnjs31g0ybw-bash-4.4-p23/bin/bash"
Extracting Bazel installation...
You have bazel 2.0.0- (@non-git) installed.
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]: No OpenCL SYCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with ROCm support? [y/N]: No ROCm support will be enabled for TensorFlow.
Do you wish to build TensorFlow with TensorRT support? [y/N]: No TensorRT support will be enabled for TensorFlow.
Found CUDA 10.2 in:
/nix/store/qf92vhg199g1hiz7snargyvgdx3ric3i-cudatoolkit-10.2.89-merged/lib
/nix/store/qf92vhg199g1hiz7snargyvgdx3ric3i-cudatoolkit-10.2.89-merged/include
Found cuDNN 7 in:
/nix/store/v0f99k56xzvgnc183vjq9p9whlqkj4fh-cudatoolkit-10.1-cudnn-7.6.3/lib64
/nix/store/v0f99k56xzvgnc183vjq9p9whlqkj4fh-cudatoolkit-10.1-cudnn-7.6.3/include
Do you want to use clang as CUDA compiler? [y/N]: nvcc will be used as CUDA compiler.
Please specify the MPI toolkit folder. [Default is /nix/store/f1c9zf9j4731jha01zxbh3rwiiyv405z-openmpi-4.0.2]:
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native -Wno-sign-compare]:
Would you like to interactively configure ./WORKSPACE for Android builds? [y/N]: Not configuring the WORKSPACE for Android builds.
Preconfigured Bazel build configs. You can use any of the below by adding "--config=<>" to your build command. See .bazelrc for more details.
--config=mkl # Build with MKL support.
--config=monolithic # Config for mostly static monolithic build.
--config=gdr # Build with GDR support.
--config=verbs # Build with libverbs support.
--config=ngraph # Build with Intel nGraph support.
--config=numa # Build with NUMA support.
--config=dynamic_kernels # (Experimental) Build kernels into separate shared objects.
--config=v2 # Build TensorFlow 2.x instead of 1.x.
Preconfigured Bazel build configs to DISABLE default on features:
--config=noaws # Disable AWS S3 filesystem support.
--config=nogcp # Disable GCP support.
--config=nohdfs # Disable HDFS support.
--config=noignite # Disable Apache Ignite support.
--config=nokafka # Disable Apache Kafka support.
--config=nonccl # Disable NVIDIA NCCL support.
Configuration finished
building
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
ERROR: Unrecognized option: --incompatible_use_native_patch=false
builder for '/nix/store/w08zmr8isfmc179nqkp6c6c5vyqcw8rd-tensorflow-gpu-1.15.0-deps.drv' failed with exit code 2
cannot build derivation '/nix/store/1c0jy172k32hny3lglv54garga4yrbmh-tensorflow-gpu-1.15.0.drv': 1 dependencies couldn't be built
cannot build derivation '/nix/store/cnhlklvjnrwnhpkc1fxbzjsvppml93vc-python3.7-tensorflow-gpu-1.15.0.drv': 1 dependencies couldn't be built
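When comparing several failing logs like the ones in this thread, it can help to pull out exactly which flags Bazel rejected. The following is a rough sketch that assumes the error line always has the form shown above ("ERROR: Unrecognized option: ..."); `rejected_flags` is a hypothetical helper, not part of Bazel or nixpkgs.

```python
import re

# Matches lines such as:
#   ERROR: Unrecognized option: --incompatible_use_native_patch=false
_UNRECOGNIZED = re.compile(r"^ERROR: Unrecognized option: (\S+)", re.MULTILINE)


def rejected_flags(log_text):
    """Return the flag names Bazel rejected, in order, without '=value'."""
    names = []
    for option in _UNRECOGNIZED.findall(log_text):
        name = option.split("=", 1)[0]
        if name not in names:  # report each flag once
            names.append(name)
    return names
```

The returned names can then be checked against the entries in bazelFlags to see which ones need to be dropped or made conditional for a given Bazel version.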
@mjlbach does adding back Bazel 1 fix the problem? If so, we should forward-port Bazel 1 for TF 1.
@CMCDragonkai I'm currently bisecting the bazel build issue and will update when I figure out which dependency in master is causing the bazel 1 build to fail.
I wrote this to test bazel build failures with git bisect: set NIXPKGS to your local repo and be sure that there aren't any uncommitted changes (it calls git reset after replacing the bazel with 1.2.1). https://github.com/mjlbach/nix-bisect-bazel
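For readers who have not used `git bisect run`, the general shape of such a test step is sketched below. This is not the code from the linked repository: `bisect_step`, `run_command`, and the `apply_override` callback are hypothetical names, and the use of `git reset --hard` is an assumption about what "calls git reset" means. The exit codes are the ones `git bisect run` understands: 0 marks the commit good, 125 skips it, and other codes from 1 to 127 mark it bad.

```python
import subprocess

# Exit codes interpreted by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def run_command(cmd, cwd):
    """Run a command in `cwd` and return its exit status."""
    return subprocess.run(cmd, cwd=cwd).returncode


def bisect_step(repo, apply_override, build_cmd, runner=run_command):
    """Test one commit: patch the tree, build, then restore the tree.

    `apply_override(repo)` is a caller-supplied function that swaps in the
    Bazel version under test and returns False if it could not be applied.
    """
    try:
        if not apply_override(repo):
            return SKIP  # commit cannot be tested, let bisect move on
        return GOOD if runner(build_cmd, repo) == 0 else BAD
    finally:
        # Discard the override so the next checkout starts from a clean tree.
        # Assumption: a hard reset; this is why uncommitted work would be lost.
        runner(["git", "reset", "--hard"], repo)
```

A small wrapper script would call `sys.exit(bisect_step(...))` with the path from NIXPKGS and a build command of your choice, and that script is what gets passed to `git bisect run`.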
It seems this update is causing the build failure:
https://github.com/NixOS/nixpkgs/commit/e80a85a0579ed226698d34f6388dc1cf9e7924bb
I applied a patch and bazel 1.2.1 now builds (see #80970)
Tensorflow will also fail until glibc is patched, see:
https://github.com/tensorflow/tensorflow/issues/33758
This is the actual patch that needs to be applied: https://github.com/tensorflow/tensorflow/compare/master...hi-ogawa:grpc-backport-pr-18950
This branch has all of the necessary tensorflow and bazel patches applied: