Bazel: Can py_runtime use a label cc_binary target as the interpreter?

Created on 13 Dec 2017 · 19Comments · Source: bazelbuild/bazel

Please provide the following information. The more we know about your system and use case, the more easily and likely we can help.

Description of the problem / feature request / question:

I want to

build the cpython project with Bazel, and
use the Bazel-built target to create my own py_runtime.

I have completely built the python binary target (a cc_binary target defined in a BUILD file) from the cpython source tree using Bazel. Check.
Now I am trying to reference the python binary target from the py_runtime rule.

I want to define py_runtime as the follows:

py_runtime(
    name = 'py-3.7',
    visibility = [
        '//visibility:public',
    ],
    files = glob([
        '//third_party/cpython:runtime',
    ]),
    interpreter = '//third_party/cpython:python',
)

The //third_party/cpython:python is the cc_binary.
The //third_party/cpython:runtime is the standard python runtime library. This is irrelevant to this issue.

If possible, provide a minimal example to reproduce the problem:

The snippet shown above is good enough for this purpose.

Environment info

Operating System:
Ubuntu 16.04
Bazel version (output of bazel info release):
release 0.7.0
If bazel info release returns "development version" or "(@non-git)", please tell us what source tree you compiled Bazel from; git commit hash is appreciated (git rev-parse HEAD):

Have you found anything relevant by searching the web?

Not really. I looked at the py_runtime doc (https://docs.bazel.build/versions/master/be/python.html#py_runtime).

Looks like the interpreter argument must point to an actual binary file. But I do not want to check in the python binary, yet. Instead, I want to build the python binary from my source tree.

Anything else, information or logs or outputs that would be helpful?

(If they are large, please upload as attachment or provide link).

P3 team-Rules-Python feature request

Source

qzmfranklin

👍6 👀2

Most helpful comment

Hi @erain ,

I am aware of how to use py_runtime the same way, i.e., the interpreter attribute points to an absolute path.

What I want is to extend the capability of the interpreter attribute of py_runtime to be able to point to a label.

qzmfranklin on 14 Dec 2017

👍7

All 19 comments

To clarify, are you requesting that py_runtime.interpreter be supporting labels, not just absolute paths?

laszlocsomor on 13 Dec 2017

/cc @erain

laszlocsomor on 13 Dec 2017

Hi qzmfranklin,

Haven't worked on py_runtime for a while, and currently it only supports either absolute path or a checked-in python interpreter. I don't think label is supported now.

For a checked in py_runtime, you can refer to my example project:
https://github.com/erain/bazel-python-example

erain on 14 Dec 2017

@laszlocsomor
Sorry for the late reply.

Per 'support labels': Yes, exactly.

qzmfranklin on 14 Dec 2017

Hi @erain ,

I am aware of how to use py_runtime the same way, i.e., the interpreter attribute points to an absolute path.

What I want is to extend the capability of the interpreter attribute of py_runtime to be able to point to a label.

qzmfranklin on 14 Dec 2017

👍7

Reviewing this, I'm surprised you're not able to do this already. py_runtime.interpreter_path is a string for the path of a platform-installed runtime, and py_runtime.interpreter is for the label of an in-build runtime, which should be able to be either checked-in or compiled with Bazel.

Can you confirm that this is still a problem, and explain the failure you see?

brandjon on 14 Mar 2019

@brandjon

Thanks for checking this issue. I just re-tested using bazel 0.23.1. Here is a brief of the result:

tools/python/BUILD:

py_runtime(
    name = 'py-3',
    files = [],
    interpreter = ':python3', # Error
    visibility = [
        '//visibility:public',
    ],
)

sh_binary(
    name = 'python3',
    srcs = ['python.sh'],
)

tools/python/python.sh:

#!/usr/bin/env python3

Using bazel --python_top=tools/python:py-3 to build and run py_library/py_binary targets yields the following error:

ERROR: /home/zhongming/git/LogiOcean/tools/python/BUILD:20:19: in interpreter attribute of py_runtime rule //tools/python:py-3: '//tools/python:python3' must produce a single file

The only way I can get py_runtime.interpreter to work is by making the string to point to a file checked into Bazel. For example:

py_runtime(
    name = 'py-3',
    files = [],
    interpreter = 'python.sh', # OK
    visibility = [
        '//visibility:public',
    ],
)

qzmfranklin on 22 Mar 2019

The way I wanted it to work is that I can compile the python interpreter from within Bazel, and then make py_runtime.interpreter to point to the cc_binary target of the python interpreter.

qzmfranklin on 22 Mar 2019

Ok, I understand better now. The issue is that the py_runtime rule operates in a non-standard way for the case where it references an in-workspace interpreter (as opposed to the absolute path of a platform interpreter).

On consultation with @aragos, the way it should work is that it should have interpreter point to the label of an executable target, which can be a sh_binary, cc_binary, etc., or even a checked-in binary that would be a source file from Bazel's POV. It should then invoke the executable file associated with this target, supported by the target's runfiles. (For the case of a checked-in binary, we'd probably have to still use files since it wouldn't have any runfiles that Bazel knows about.)

But what it's actually doing is assuming that interpreter points to a target that produces only a single file, and taking that as the executable. This is wrong for virtually all binary rules and really only works for the case of a checked-in interpreter.

This will require redesigning the py_runtime rule, and possibly splitting it into separate rules for the three cases of a platform runtime (absolute system path), a checked-in binary, and Bazel-built binary. (The latter two would be split because a Bazel-built binary shouldn't need a files attribute.)

In the meantime, I have a horribly hacky proof of concept for a workaround: Define two Starlark adaptor rules that take in an executable target and yield as their files-to-build 1) just the executable and nothing else, and 2) all of the runfiles. Then feed the adapted targets as the interpreter and files of the py_runtime. Example:

pkg/script.sh

#!/bin/bash

###############################################################################
# This is a simulated Python runtime. All it does is load a runfile           #
# (helper.sh) and emit some dummy text. The boilerplate is to support loading #
# runfiles.                                                                   #
###############################################################################

# --- begin runfiles.bash initialization ---
# Copy-pasted from Bazel's Bash runfiles library (tools/bash/runfiles/runfiles.bash).
set -euo pipefail
if [[ ! -d "${RUNFILES_DIR:-/dev/null}" && ! -f "${RUNFILES_MANIFEST_FILE:-/dev/null}" ]]; then
  if [[ -f "$0.runfiles_manifest" ]]; then
    export RUNFILES_MANIFEST_FILE="$0.runfiles_manifest"
  elif [[ -f "$0.runfiles/MANIFEST" ]]; then
    export RUNFILES_MANIFEST_FILE="$0.runfiles/MANIFEST"
  elif [[ -f "$0.runfiles/bazel_tools/tools/bash/runfiles/runfiles.bash" ]]; then
    export RUNFILES_DIR="$0.runfiles"
  fi
fi
if [[ -f "${RUNFILES_DIR:-/dev/null}/bazel_tools/tools/bash/runfiles/runfiles.bash" ]]; then
  source "${RUNFILES_DIR}/bazel_tools/tools/bash/runfiles/runfiles.bash"
elif [[ -f "${RUNFILES_MANIFEST_FILE:-/dev/null}" ]]; then
  source "$(grep -m1 "^bazel_tools/tools/bash/runfiles/runfiles.bash " \
            "$RUNFILES_MANIFEST_FILE" | cut -d ' ' -f 2-)"
else
  echo >&2 "ERROR: cannot find @bazel_tools//tools/bash/runfiles:runfiles.bash"
  exit 1
fi
# --- end runfiles.bash initialization ---

# Runfiles are located using workspace name (default "__main__") and package
# path.
source $(rlocation __main__/pkg/helper.sh)

echo "I am script!"

pkg/helper.sh

echo "I am helper!"

pkg/BUILD

load(":adaptors.bzl", "get_executable", "get_runfiles")

# :helper and :script define a mock Bazel-built Python runtime.

sh_library(
    name = "helper",
    srcs = ["helper.sh"],
)

sh_binary(
    name = "script",
    srcs = ["script.sh"],
    deps = ["@bazel_tools//tools/bash/runfiles", ":helper"],
)

# The adaptors export just the pieces of :script that py_runtime expects
# to see in its `interpreter` and `files` attributes.

get_executable(
    name = "script_executable",
    target = ":script",
)

get_runfiles(
    name = "script_runfiles",
    target = ":script",
)

py_runtime(
    name = "runtime",
    interpreter = ":script_executable",
    files = [":script_runfiles"],
)

py_binary(
    name = "pybin",
    # pybin.py can just be an empty file, it isn't run by the mock runtime.
    srcs = [":pybin.py"],
)

pkg/adaptors.bzl

def _get_executable_impl(ctx):
    return [DefaultInfo(files=depset([ctx.executable.target]))]

get_executable = rule(
    implementation = _get_executable_impl,
    attrs = {"target": attr.label(executable=True, cfg="target")},
)

def _get_runfiles_impl(ctx):
    info = ctx.attr.target[DefaultInfo]
    all_files = depset(transitive=[
        info.default_runfiles.files,
        info.data_runfiles.files,
        info.files,
    ])
    return [DefaultInfo(files=all_files)]

get_runfiles = rule(
    implementation = _get_runfiles_impl,
    attrs = {"target": attr.label(executable=True, cfg="target")},
)

Console

bazel run //pkg:pybin --python_top="//pkg:runtime"
[...]
I am helper!
I am script!

brandjon on 22 Mar 2019

👍2

@brandjon

Cool! I am very happy this issue finally gets some serious attention.

I think over time, the python-related aspects of Bazel need to be almost completely revamped. This could potentially be as significant as the rework of the cc_toolchain, and maybe reusing some of the platform-related work that is about to be implemented later this year.

For now, I am happy enough without a fix to this issue so long as
a) this issue is being addressed by someone with deep domain expertise in the py_runtime business, and
b) fixing this issue becomes part of the consideration for future python-related work in the long time evolution plan of Bazel.

Close it if you wish. We can always re-open this later.

qzmfranklin on 22 Mar 2019

@brandjon

Forgot to say: I completely agree with everything you mentioned in your last comment. What I wrote in the comment above is just stating my stance on it assuming a complete agreement with your statements.

qzmfranklin on 22 Mar 2019

@qzmfranklin Indeed we have been revamping Python support in Bazel, mainly the built-in support as it relates to the Python version, sandwiching with Starlark rules, and specifying the interpreter to use. See #7375 for Python-related toolchain work.

We may as well leave this bug open to track the inability to depend on binary rules via interpreter, which was news to me. When we start work on a fix (which likely will come after Bazel 1.0 work is done) we can move this to another bug.

brandjon on 22 Mar 2019

@brandjon sgtm

qzmfranklin on 22 Mar 2019

@brandjon Any update on this? I can confirm that this would be a big deal for the gRPC project and Google cloud libraries. Should we view the P3 label as a "not going to get priority" label?

gnossen on 13 Dec 2019

I think this is what I wanted to achieve with Bazel + Python:

Build the python interpreter and standard library from source.
Make building cpython/cython libraries Bazel-idiomatic (no more pip-wheel stuff).
Sandwich custom python rules and Bazel-native python rules.

qzmfranklin on 23 Dec 2019

Another use case for this is to enable building the fast cpp backend for grpc-python Bazel idiomatic. That is, without needing to resort various repository rules to create external repos such as @local_python_config.

qzmfranklin on 23 Dec 2019

I think @kku1993 has implemented something like this with rules_python: https://github.com/kku1993/bazel-hermetic-python, https://github.com/bazelbuild/rules_python/pull/312.

From what I see it was done by compiling python in patch_cmds of http_archive, which seems like a workaround, but still if it works it is better than what we have currently.
Also in the PR for rules_python support is added for specifying a custom python_interpreter_target (that you could also use with a precompiled python interpreter).

gergelyfabian on 20 May 2020

Thanks @gergelyfabian for pointing me to this thread!

My approach is to build the python interpreter from source using make instead of cc_binary. Incidentally, this produces a single binary that can be used for py_runtime.interpreter (I was not aware of this issue at the time).

I think using cc_binary is definitely the more preferred method for platform independence but as @gergelyfabian pointed out, the make method could be a workaround for now.

In https://github.com/bazelbuild/rules_python/pull/312, my goal is to specify a bazel target as the python interpreter for pip_import rule. As currently implemented, I believe it behaves like py_runtime.interpreter in that it only accepts labels that point to a single file (allow_single_file = True).