Bazel: API for Skylark rules and genrules to declare ResourceSets

Created on 23 Oct 2018  ·  17Comments  ·  Source: bazelbuild/bazel

Description of the problem / feature request:

My project uses multithreaded code emitters written in Java. They may eat CPU and RAM, and when running concurrently may overwhelm the build host. It'd be great if I could somehow signal to Bazel's scheduler that these processes are heavier so that it could adjust accordingly.

Feature requests: what underlying problem are you trying to solve with this feature?

When porting a project from Maven to Bazel, I observe longer build times with Bazel. Looking at the critical path, I see that a few actions--most notably one of the code emitters--takes significantly longer under Bazel (23s) than when run from the command line (15s) or Maven.

If I understand the code correctly (GenRuleAction.java), Bazel treats all genrules the same. They're assumed to consume ~300M of RAM and a single CPU. My tasks, however, typically occupy 2-4 CPUs, so I'd like Bazel to take this into account when scheduling actions running on my build host.

I'm still too new to Bazel to suggest a concrete API, but maybe something like

genrule(
    ...,
    resources=(512, 3.0, 0.0),  # Corresponds to ResourceSet: 512M, 3 CPUs, 0/default for I/O
)

# or

genrule(
    ...,
    resources_ram=512,
    resources_cpu=3.0,
)    

would be feasible? (Ditto for plain rules, too!)

What operating system are you running Bazel on?

Linux (CentOS 6.6)

What's the output of bazel info release?

$ bazel info release
release 0.16.1- (@non-git)

(This is 0.18.0rc8.)

If bazel info release returns "development version" or "(@non-git)", tell us how you built Bazel.

n/a. This is just a feature request.

What's the output of git remote get-url origin ; git rev-parse master ; git rev-parse HEAD ?

n/a. This is just a feature request.

Have you found anything relevant by searching the web?

988 touches on a similar topic.

Any other information, logs, or outputs that you want to share?

P3 team-Local-Exec feature request

Most helpful comment

I think I wasn't clear in my previous emails with @ulfjack: exec_properties isn't only for remote execution, it's for any execution strategy that is interested in the execution platform (and with the Execution Platforms vs Strategies proposal in development, that's all of them).

However, Bazel does not currently set any meanings for the keys in exec_properties: that is up to the execution strategy. Currently the keys used by Bazel's Remote Execution API are defined in the API description, and are only meaningful in the context of that strategy.

We could have the local execution strategies use some of these keys as well, but this raises the problem that Google's internal remote execution uses a completely different set of keys. We'd need to figure out a way to parameterize those keys to work with the existing internal/external remote execution systems. This makes things a bit harder for those of us working with both systems, and I'm sure is frustrating to people only using Bazel, but it does need to be considered for us.

All 17 comments

cc @ulfjack @jmmv

I think this makes a lot of sense. (The need for this came up as well during the Bazel hackathon but I cannot remember the details right now.)

I don't like the first proposal of using a tuple: unnamed parameters are very hard to read. I prefer either a dictionary with named quantities, the separate attributes, or maybe better, a higher-level data type to represent the resource set with named fields (a la collections.namedtuple).

I'd also try to avoid floats even if the backing ResourceSet supports them (because I think we should get rid of them). For CPU reservations, floats are very hard to deal with, and for I/O, we can just represent them as an integer in the [0,100] range.

Should this plug into the same/similar configuration of configuring
“requirements” from remote workers? Sounds related to configurability but I
might be wrong (cc @katre)
On Wed, 24 Oct 2018 at 4:44 Julio Merino notifications@github.com wrote:

I think this makes a lot of sense. (The need for this came up as well
during the Bazel hackathon but I cannot remember the details right now.)

I don't like the first proposal of using a tuple: unnamed parameters are
very hard to read. I prefer either a dictionary with named quantities, the
separate attributes, or maybe better, a higher-level data type to represent
the resource set with named fields (a la collections.namedtuple).

I'd also try to avoid floats even if the backing ResourceSet supports
them (because I think we should get rid of them). For CPU reservations,
floats are very hard to deal with, and for I/O, we can just represent them
as an integer in the [0,100] range.


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
https://github.com/bazelbuild/bazel/issues/6477#issuecomment-432481790,
or mute the thread
https://github.com/notifications/unsubscribe-auth/ABUIF_TrjabaiQc1wv7QkV6FvDLCLECqks5un8XvgaJpZM4X10Sq
.

We already have a mechanism to configure cpu, but not one to configure ram. I don't think adding a mechanism to configure I/O makes any sense, given that the use of the "IO" resource in Bazel is inconsistent and more to address one-off restriction cases than as any sort of general I/O model. Basically, what I/O is is completely undefined and people have been abusing the setting to implement random restrictions. I think those are basically all wrong because they inherently don't scale. Bazel assumes a machine has "1.0 I/O", so a setting of 0.3 means "don't run more than 3 of these at the same time", regardless of whether I have an uber-powerful 52 core workstation with x-point high throughput memory in a raid-0 configuration for disk or a puny 2 core laptop with a spinning platter.

The way to configure cpu is to set tags = ["cpu:3"]. It's possible that this isn't properly processed everywhere, but that would be easy to fix. Arguably, using tags isn't ideal because it's not typesafe. However, adding this to all sorts of rules and actions as custom type-safe fields isn't ideal either, and I think this solution is a reasonable compromise. Certainly, if someone were to add "mem:" or similar, I would insist that the user _must_ provide a unit at the end, e.g., "mem:5g" or "mem:3kb".

Please do not assign issues to more than one team.

@beasleyr-vmw I think this is great, but is it also possible to specify cpu and ram in a dynamic way, similar to how local_ram_resources and local_cpu_resources can be set dynamically? I.e. something like HOST_CPUS-1 etc. As I am often used to setting multi-threaded apps based on the available host cpus available rather than a fully static amount.

Note that the cpu:n feature works only for tests, not other actions. From Ulf's comment above, I thought it was generally available, but it's not.

To do some local testing, I hacked in cpu:n support for Starlark actions, but in my initial testing of swift builds, the total build time was not better _for our build_ (which has many small modules). I'm not sure if I'll continue pursuing it. https://github.com/kastiglione/bazel/commit/0a1e78b34fd25205102600b7cdf7d47c2dd6652a

We have a big project with heavy compilations and heavy tests. Tests also require some resources which are not CPU or RAM (for example, GPU memory or special CPU mode).
For RAM-heavy tests we initially set exclusive tag. I've implemented ram:<int> tag just like cpu:<int> and it works like a charm. But there are some issues:

  • this option isn't flexible, I'd like to be able to multiply RAM requirements for all targets globally (for example, for sanitizers or code coverage runs)
  • this way of setting requirements by tags tends to become more complex if we'll add some special resources (like GPU mem)

So I guess we need two things:

  • adjustable memory estimations for test sizes instead of hardcoded ones (by option like --test_sizes=100M,500M,1G,5G)
  • some mechanism of describing resources/requirements in more flexible way

As for latter, I guess it may look something like this in rules:

cc_test(
  ...
  resources: {
    "cpu": "2",
    "ram": "2G",        # not an ideal solution, but may be useful 
    "gpu_ram": "2",
    ...
  }
  ...
)

As for new types of resources like gpu_ram we can use some new workspace rule like this:

resource(
  name = "gpu_ram",
  capacity = 16,
)

I think we can contribute adjustable memory estimations for now. A mechanism of describing custom resources looks much bigger.

I haven't looked too too deeply into this comment thread but the new Action Groups proposal might be of interest to you specifically the bit on exec_properties

@kastiglione

To do some local testing, I hacked in cpu:n support for Starlark actions, but in my initial testing of swift builds, the total build time was not better for our build (which has many small modules). I'm not sure if I'll continue pursuing it.

Were you setting it to mimic what was being passed in for -j? I assume that having that value match what swift is using would be best. I also assume that we would want the value to be dynamic, to reflect that some compiles will use fewer jobs.

@juliexxia
Sorry for the late response. Yes, exec_properties would cover the case of "custom resources" like GPU memory (actually, even without the proposal you mentioned), but AFAIK Bazel's local ResourceManager doesn't respect exec_properties. Maybe you know about any plans/proposals of supporting exec_properties in ResourceManager?

BTW is it a good idea to do that?

Ah, You are correct. Apologies for not more deeply understanding the needs of this post earlier. exec_properties is meant to be read by remote execution machinery and isn't supported by the local ResourceManager. Bazel is general doesn't do any real parsing of exec_properties, just forwards it along. I am not aware of any plans to make ResourceManager respect exec_properties and agreed, I'm not sure it's a great idea to do that.

Asking for the underinformed, but why is it a problem if the local ResourceManager respects certain exec_properties? If this is a needed piece of a functionality (and I'd argue that it is, bazel habitually underestimates my static CppLink actions to the point I might have to run a fork to make the memory estimate configurable), we'd need some way of specifying this. Is have a local_exec_properties and a remote_exec_properties better than declaring a single exec_properties which is interpreted to the best of a ResourceManager's abilities?

From the discussion so far it sounds like having ResourceManager interpret exec_properties would be exactly what we're looking for. @juliexxia what are the reasons why it might not be a good idea?

I'm in favor of making ResourceManager obey exec_properties where appropriate. It seems like a good way to unify local and remote execution, rather than treating them as inherently separate.

IIRC, @katre said in another thread that exec_properties is purely for remote execution; adding him here for visibility.

I think I wasn't clear in my previous emails with @ulfjack: exec_properties isn't only for remote execution, it's for any execution strategy that is interested in the execution platform (and with the Execution Platforms vs Strategies proposal in development, that's all of them).

However, Bazel does not currently set any meanings for the keys in exec_properties: that is up to the execution strategy. Currently the keys used by Bazel's Remote Execution API are defined in the API description, and are only meaningful in the context of that strategy.

We could have the local execution strategies use some of these keys as well, but this raises the problem that Google's internal remote execution uses a completely different set of keys. We'd need to figure out a way to parameterize those keys to work with the existing internal/external remote execution systems. This makes things a bit harder for those of us working with both systems, and I'm sure is frustrating to people only using Bazel, but it does need to be considered for us.

As far as I can see, we can go further in one of two ways:

  1. Design and support a predefined set of properties. In that case local or remote execution systems may support a subset of it, and each property should have some certain meaning.
  2. Let users describe resources in BUILD/WORKSPACE/bzl files. These descriptions may, for example, contain some executable scripts to interact with hardware or smth.
Was this page helpful?
0 / 5 - 0 ratings