Pants: `./pants dependees` should be deprecated in favor of a "root selection" option

Created on 13 Sep 2018 · 10 Comments · Source: pantsbuild/pants

./pants changed and ./pants list-owners were converted away from being pants goals and toward being "root selection" options, and we should likely do the same for dependees.

So:

# rather than...
./pants dependees --transitive $target
# ...you'd run something like
./pants --dependees=transitive list $target

There are a few reasons for this:
1) it's more flexible, because you could use it with other goals, e.g. ./pants --dependees=direct test $target.
2) there is a possible alignment-with/replacement-for the --changed-include-dependees flag, which represents a parallel implementation of the concept in TargetRootsCalculator
3) it would be cacheable in pantsd sooner than if we ported ./pants dependees into a --v2 task (as in this sketch)

One open question though is whether the --dependees option would include/exclude the original roots. That might also affect naming: if modeled after --changed-include-dependees, it might be a ternary option:

--changed-include-dependees=<str> (one of: [none, direct, transitive] default: 'none')
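To make the open question concrete, here is a minimal sketch of a ternary dependees computation over a toy dependency graph. It is not the TargetRootsCalculator implementation: the `dependees` function, the `include_roots` parameter, and the dict-based graph shape are all assumptions made for illustration.

```python
from collections import defaultdict, deque


def dependees(deps, roots, mode="none", include_roots=True):
    """Return the dependees of `roots` (hypothetical helper, not a Pants API).

    `deps` maps each target to the targets it depends on.
    `mode` mirrors the ternary option: none, direct, or transitive.
    """
    if mode not in ("none", "direct", "transitive"):
        raise ValueError(f"unknown mode: {mode}")

    # Invert the edges so we can walk from a target to the targets that use it.
    reverse = defaultdict(set)
    for target, target_deps in deps.items():
        for dep in target_deps:
            reverse[dep].add(target)

    found = set()
    if mode != "none":
        queue = deque(roots)
        while queue:
            target = queue.popleft()
            for dependee in reverse[target]:
                if dependee in found:
                    continue
                found.add(dependee)
                # Only keep walking outward when transitive dependees are wanted.
                if mode == "transitive":
                    queue.append(dependee)

    root_set = set(roots)
    # The open question: keep the original roots in the result, or drop them.
    return found | root_set if include_roots else found - root_set
```

With `deps = {"app": ["lib"], "lib": ["util"], "util": []}`, `mode="direct"` on `util` reaches `lib` only, while `mode="transitive"` also reaches `app`. Whether `util` itself appears in the result is exactly the include/exclude decision raised above.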

All 10 comments

Personally, I find the root selection flags pretty confusing as a user interface. I find it much easier to understand what's going to happen with:

./pants dependees --transitive foo:: | xargs pants test
than
./pants --dependees=transitive test foo::

This becomes much more true when we talk about chaining them:
./pants changed | xargs ./pants dependees --transitive | xargs ./pants test
is pretty obvious, easy to reason about, and clear where to get --help for each part;
./pants --changed-include-dependees=transitive test
much less so.

The work should be cached in the daemon, so these should be equivalent work. The only performance difference would be that the separate commands force the commands to run sequentially, so we can't start using spare threads towards the end of the first invocation to be running work for the second...

I agree with @illicitonion here, although I wouldn't be opposed to a unified target selection method either. Some ideas:

  1. ./pants --select 'dependees(src/python/pants/base::) owners_of(src/resources/org/pantsbuild/ini.txt)' <goals>
  2. ./pants select <rest>

For either 1 or 2, we have a small query language that is already populated with 3 selectors, and we'd fold some current address-selecting tasks into the language, as demonstrated in 1 above:

  1. exact address
  2. siblings selector
  3. descendants selector
  4. dependees(..., direct|transitive)
  5. owners_of(...)
  6. changed(, ...?)

For option 1, when the --select option is given without a value, and as the only way option 2 works, you could imagine dropping into a local line editor or even a REPL that allows composing the query, perhaps with assists like the ones we currently get for free from the shell for the path portions of addresses.

I'd be in favour of building a more general purpose graph query language, but if we're going to do so, I would be interested in re-using the one in Bazel (formally specified in https://docs.bazel.build/versions/master/query.html and with a how-to in https://docs.bazel.build/versions/master/query-how-to.html) which was also adopted by Buck (https://buckbuild.com/command/query.html). It doesn't support changed, but I believe has support for everything in @jsirois's list :)

(I'll also note that this was implemented in Buck as an intern project, and is generally a pretty perfect intern project for the right intern - about the right size, well defined, well bounded, and some exciting computer science, if anyone happens to have any interns coming up in the near future...)

Yeah - that would be wonderful.

While I agree that a query language would be a good area to explore, it is closer to the description of this ticket than the "unix pipes between processes" approach that pants started from.

By which I mean: adding a query language is out of scope for this ticket, but this ticket heads in the right direction to replace the root selection flags with a query language.

If we're looking to go in the right direction, I'd suggest we add hard-coded support for a small number of graph query language functions, rather than some extra flags. e.g. we could manually support: ./pants query 'rdeps(::, my:target)' for transitive dependees and ./pants query 'rdeps(::, my:target, 1)' for intransitive dependees, without supporting a whole query language (i.e. with no nesting support or other operators).
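As a rough illustration of that hard-coded approach, the sketch below recognizes only the flat form rdeps(universe, target[, depth]) with a single regular expression and rejects everything else, including nested calls. The names `RDepsQuery` and `parse_rdeps` are hypothetical and nothing here is an existing Pants API.

```python
import re
from typing import NamedTuple, Optional


class RDepsQuery(NamedTuple):
    universe: str
    target: str
    depth: Optional[int]  # None means unbounded, i.e. transitive


# Accepts exactly one flat rdeps(...) call; arguments may not contain
# commas, parentheses, or whitespace, which rules out nesting.
_RDEPS = re.compile(
    r"^\s*rdeps\(\s*([^,()\s]+)\s*,\s*([^,()\s]+)\s*(?:,\s*(\d+)\s*)?\)\s*$"
)


def parse_rdeps(query: str) -> RDepsQuery:
    match = _RDEPS.match(query)
    if match is None:
        raise ValueError(f"unsupported query: {query!r}")
    universe, target, depth = match.groups()
    return RDepsQuery(universe, target, int(depth) if depth is not None else None)
```

Because the accepted strings would be a subset of the Bazel-style syntax, a fuller parser could presumably replace this later without changing the queries users have already written.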

I think a more general-purpose query language would be fantastic. I would really love it if it was linked to a specific type of graph traversal rule we could register which would be interpreted as a query string very similarly to e.g. jq, so we could do ./pants --query 'rdeps(::, my:target, 1)' list instead of making it a separate goal, and then, in my wildest dreams, ./pants --query 'rdeps(::, my:target, 1) | filter(type=junit_tests)' test (again, very similarly to jq) (where rdeps and filter are both registered as e.g. @query_rules somewhere -- this allows us to incrementally implement parts of e.g. the bazel query spec). I'm looking through the bazel query language spec, but I think this could feasibly be interpreted as a strict extension of it.

Again I would recommend looking at the jq manual, as it describes a language that I suspect more people are familiar with offhand (via jq) than the bazel query language (for example), and may not be that incompatible with it. jq also accepts e.g. an input file with filters, and has functions, etc. I'm not saying it should be used as a reference over the bazel query language, just that it's another strong contender and happens to be written with interactive command-line usage in mind from the start.

It might also be Really Nice if you could do e.g.

get_my_deps = QueryPipeline([
  RDeps('::', 1),
  Filter(type=JunitTests),
])

@rule(A, [HydratedTargets])
def f(targets):
  selected_targets = yield Get(TransitiveHydratedTargets, QueryPipeline, get_my_deps(targets))
  ...

where QueryPipeline has a __call__() override, and RDeps and Filter were registered as e.g. @query_rules. These are all just wild ideas, but I think this level of integration is something we can start with and that it will lead to much less code, and much more maintainable code, which needs less integration testing later on. I don't think it will be hard at all to generate use cases that actually can make use of these, and I think the integration into @rules would be extremely useful.
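The registration idea can be sketched outside the engine as a plain Python toy. Everything below is an assumption made for illustration: the `query_rule` decorator, the `QUERY_RULES` registry, the dict-based graph, and the choice that `RDeps` keeps its input targets in the result (as Bazel's rdeps does). It does not use `@rule`, `Get`, or any real Pants type.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set, Tuple

# Toy graph: target name -> (target type, dependencies). Hypothetical shape.
Graph = Dict[str, Tuple[str, Tuple[str, ...]]]

# Registry mapping a stage class to the function that evaluates it.
QUERY_RULES: Dict[type, Callable] = {}


def query_rule(stage_type):
    """Register `func` as the evaluator for `stage_type` (hypothetical decorator)."""
    def register(func):
        QUERY_RULES[stage_type] = func
        return func
    return register


@dataclass(frozen=True)
class RDeps:
    depth: int


@dataclass(frozen=True)
class Filter:
    target_type: str


@query_rule(RDeps)
def run_rdeps(stage: RDeps, graph: Graph, targets: Set[str]) -> Set[str]:
    selected = set(targets)
    frontier = set(targets)
    for _ in range(stage.depth):
        # Anything depending on the current frontier is one step further out.
        frontier = {
            name for name, (_kind, deps) in graph.items() if frontier & set(deps)
        } - selected
        selected |= frontier
    return selected


@query_rule(Filter)
def run_filter(stage: Filter, graph: Graph, targets: Set[str]) -> Set[str]:
    return {name for name in targets if graph[name][0] == stage.target_type}


class QueryPipeline:
    def __init__(self, stages):
        self.stages = list(stages)

    def __call__(self, graph: Graph, targets: Set[str]) -> Set[str]:
        # Feed each stage's output into the next, like a jq-style pipe.
        selected = set(targets)
        for stage in self.stages:
            selected = QUERY_RULES[type(stage)](stage, graph, selected)
        return selected
```

A pipeline such as `QueryPipeline([RDeps(1), Filter("junit_tests")])` then plays the role of the string query 'rdeps(::, my:target, 1) | filter(type=junit_tests)', with each stage looked up in the registry rather than hard-coded.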

We can perhaps make the yield Get(...) return something besides TransitiveHydratedTargets so that you can't make queries in multiple calls to the engine (nudging users towards more efficiently using a single QueryPipeline when possible) (but maybe we do want to allow that, this is all supposition).

Resolving in favor of #7346.
