Bazel: Provide tempdir in genrule()

Created on 30 Oct 2019  ·  7 Comments  ·  Source: bazelbuild/bazel

Description of the problem / feature request:

A genrule is sometimes used to wrap legacy build tooling. Such tooling sometimes demands a temp directory in which it can write scratch files.

A workaround is suggested on StackOverflow (link below): “(1) wrap the entire cmd body in a shell function, and call it (2) create a temp directory as the first thing in the function, store its path in some variable (3) add a trap to clean up the temp directory.”
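
For anyone who wants to see the three quoted steps spelled out, here is a minimal sketch in plain shell (bash assumed). The name run_with_tempdir and the variable used_dir are hypothetical, and the sh -c command stands in for the legacy tool. Inside a genrule cmd, each `$` would have to be written as `$$`.

```shell
# (1) Wrap the whole command body in a function. The body is in parentheses,
# so it runs in a subshell and the EXIT trap fires when that subshell ends.
run_with_tempdir() (
    # (2) Create the temp directory first and store its path.
    TMP="$(mktemp -d)" || exit 1
    export TMP
    # (3) Remove the directory on any exit, success or failure.
    trap 'rm -rf "$TMP"' EXIT
    # Run the wrapped command; a real genrule would invoke the legacy tool here.
    "$@"
    exit $?
)

# Usage: the wrapped command writes a scratch file under $TMP and prints the path.
used_dir="$(run_with_tempdir sh -c 'touch "$TMP/scratch.txt" && echo "$TMP"')"
if [ -e "$used_dir" ]; then
    echo "still present: $used_dir"
else
    echo "cleaned up: $used_dir"
fi
```

This still has to be repeated (or wrapped in a macro) at every point of use, and it takes whatever location mktemp picks, which is the limitation the request is about.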

That's not a great thing to reinvent at each point of use, and it may sometimes put the tempdir in a non-optimal place (like an expensive sandboxfs rather than fast cheap tmpfs).

Thus this feature request: provide a tempdir mechanism in genrule to:

  • obtain a temp dir, either via $(temp) in the cmd or via an environment variable
  • automatically clean up tempdir contents (on success or failure)
  • enable other machinery (remote build executors, for example) to control where these temps land (i.e. the right filesystem etc for the situation).

Feature requests: what underlying problem are you trying to solve with this feature?

See above. Some tools demand a temp directory, and sometimes we need to wrap old things in a genrule() rather than create new Bazel rules for every part of a sprawling project.

Have you found anything relevant by searching the web?

https://stackoverflow.com/questions/55001748/do-bazel-genrules-offer-a-temp-directory

P3 team-Remote-Exec feature request

All 7 comments

An extra wrinkle: some tools may demand an absolute path of a tempdir, which is unfortunate as it violates the general Bazel guidance to use only relative paths.

Assigning to RemoteExec team, but this kind of thing is really a collaboration with LocalExec as well.
It is a pretty big design space.
I think the concept might apply to all rules, not just genrule. There are often helpers that will use scratch space at the path specified in $TMPDIR. If you know the behavior of the helper, you could tune the action to put its temp space in memory or on disk.
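
As a rough illustration of the $TMPDIR idea, the sketch below points a single command at a memory-backed directory when one exists. It assumes Linux, where /dev/shm is conventionally a tmpfs mount, and a mktemp that honors TMPDIR when no template is given (GNU coreutils does). The variable names fast_base and scratch are made up for this example, and nothing here is an existing Bazel feature.

```shell
# Assumption: prefer a memory-backed filesystem when one is available.
# /dev/shm is a Linux convention; elsewhere fall back to the usual default.
if [ -d /dev/shm ] && [ -w /dev/shm ]; then
    fast_base=/dev/shm
else
    fast_base="${TMPDIR:-/tmp}"
fi

# TMPDIR is overridden for this one command only, so any helper that
# honors it creates its scratch files under $fast_base.
scratch="$(TMPDIR="$fast_base" mktemp -d)"
echo "scratch space: $scratch"

# Clean up the demo directory.
rm -rf "$scratch"
```

Whether memory or disk is the better choice depends on how much the helper writes, which is why a per-action setting is being discussed.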

A couple more idea variants:

  • Provide two tempdirs to each rule: one that is small and fast, the other large and slow.
  • Provide one tempdir, but have a rule setting to state the desired size of the temp storage - which the strategy code would use to choose what kind of tempdir fs to provide.

The Gerrit Code Review project extracted its genrule2 Starlark macro into the bazlets repository. It defines ROOT and TMP for you, and you can use it like this:

load("//tools/bzl:genrule2.bzl", "genrule2")

genrule2(
    name = "%s-static" % name,
    cmd = " && ".join([
        "mkdir -p $$TMP/static",
        "unzip -qd $$TMP/static $(location %s__gwt_application)" % name,
        "cd $$TMP",
        "zip -qr $$ROOT/$@ .",
    ]),
    tools = [":%s__gwt_application" % name],
    outs = ["%s-static.jar" % name],
)

The definition is here: [1].

def genrule2(cmd, **kwargs):
    cmd = " && ".join([
        "ROOT=$$PWD",
        "TMP=$$(mktemp -d || mktemp -d -t bazel-tmp)",
        "(" + cmd + ")",
    ])
    native.genrule(
        cmd = cmd,
        **kwargs
    )

[1] https://github.com/GerritCodeReview/bazlets/blob/master/tools/genrule2.bzl#L19-L28
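
To make the macro's effect concrete, this is roughly what the shell ends up running once Bazel has turned each `$$` into `$`. It is a sketch only: the commands inside the parentheses are stand-ins for the user's cmd, and the final rm is added for this demo and is not part of the macro.

```shell
# ROOT remembers the directory the command started in, TMP is a fresh
# scratch directory, and the user's cmd runs in a subshell.
ROOT=$PWD &&
TMP=$(mktemp -d || mktemp -d -t bazel-tmp) &&
(
    # Stand-in for the user's cmd: work inside $TMP, refer back to $ROOT.
    mkdir -p "$TMP/static" &&
    cd "$TMP" &&
    echo "working in $PWD, outputs go under $ROOT"
)

# The macro as quoted stops here, so the scratch directory is still on disk.
[ -d "$TMP" ] && echo "left behind: $TMP"

# Not part of the macro: remove it by hand for this demo.
rm -rf "$TMP"
```

Because the cd happens inside the subshell, the outer shell's working directory is unchanged, which is why ROOT still matches $PWD afterwards.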

@davido Excellent. What do you think of the advice from the StackOverflow page commenter, "add a trap to clean up the temp directory"? Without that, and unless Bazel itself hooks mktemp (which as far as I know, it does not), it seems genrule2() as implemented above will leave tempfiles to accumulate until eventual OS-level cleanup (at restart, etc.)?

As a matter of fact, the quoted Starlark macro could be improved:

diff --git a/tools/bzl/genrule2.bzl b/tools/bzl/genrule2.bzl
index 3113022104..d0b0969438 100644
--- a/tools/bzl/genrule2.bzl
+++ b/tools/bzl/genrule2.bzl
@@ -21,6 +21,7 @@ def genrule2(cmd, **kwargs):
         "ROOT=$$PWD",
         "TMP=$$(mktemp -d || mktemp -d -t bazel-tmp)",
         "(" + cmd + ")",
+        "rm -rf $$TMP",
     ])
     native.genrule(
         cmd = cmd,

In case anyone comes along and needs this, it's also pretty easy to use trap to ensure the tempdir is always cleaned up. Here is a particularly helpful page about it.

https://www.putorius.net/using-trap-to-exit-bash-scripts-cleanly.html
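
A small sketch of why the trap matters, assuming bash: a cleanup step chained with `&&`, as in the diff above, is skipped when an earlier command fails, while an EXIT trap still runs. Here `false` stands in for a failing tool, and the variable names are made up for the demo.

```shell
# Variant 1: cleanup chained with &&. The failing command short-circuits
# the chain, so rm never runs and the directory is left behind.
leaked="$(mktemp -d)"
( cd "$leaked" && false ) && rm -rf "$leaked"
chained_left_behind=no
[ -d "$leaked" ] && chained_left_behind=yes
rm -rf "$leaked"   # tidy up by hand for this demo

# Variant 2: cleanup registered with trap inside a subshell.
path_file="$(mktemp)"   # lets the subshell report which directory it used
(
    TMP="$(mktemp -d)" || exit 1
    trap 'rm -rf "$TMP"' EXIT
    echo "$TMP" > "$path_file"
    false               # stand-in for a failing tool
    exit $?
) || echo "tool failed, as expected"
trapped="$(cat "$path_file")"
rm -f "$path_file"
trap_left_behind=no
[ -d "$trapped" ] && trap_left_behind=yes

echo "chained with &&: left behind = $chained_left_behind"
echo "trap on EXIT:    left behind = $trap_left_behind"
```

In a genrule2-style macro the same trap line could be added right after the mktemp line, with `$` written as `$$`.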
