Pants: Allow tags to be applied to targets from a single source

Created on 2 Mar 2019  路  12Comments  路  Source: pantsbuild/pants

Now that task targets can be filtered by tags (prior work from https://github.com/pantsbuild/pants/issues/6902), having a way of defining a single source for which tags to apply to which targets would be helpful to avoid needing to set the same tag across multiple BUILD files.

There are other use cases as well, such as grouping all targets with a particular suffix (or prefix) under a single tag (as noted for integration targets in pants being tagged with integration below), or others (fsqio, for instance, uses tags for rule enforcement on the build graph)

All 12 comments

Proposed Solution

Using https://github.com/pantsbuild/pants/issues/6902#issuecomment-446348080 as a starting point, a (global) subsystem would fit well for this purpose - taking a single option (the path to the source file) and providing a helper function for identifying which tags to apply to which targets. This would act as a supplement to any tags defined in the BUILD files themselves, and would not be coupled to tag filtering so that it could apply to a broader selection of use cases.

Plugging this into the LegacyPythonCallbacksParser would allow us to apply tags to targets during target initialization, though injecting an instance of a subsystem there has proven to be a challenge due to the problem of calling a global instance of a subsystem _before_ the Scheduler has been created (i.e. it fails on an UninitializedSubsystemError), which led me to the following TODO in global options:

https://github.com/pantsbuild/pants/blob/dea627097d8f4fbfdbb4ea48b282be2a4d5a60ce/src/python/pants/option/global_options.py#L46-L47

Not sure if there's a related issue for that, or if that may be resolved through anything currently in flight.

Which leads me here - looking for feedback on the problem / approach defined 鈽濓笍- in the meantime, I'm looking into other ways of (lazily) initializing the subsystem. There may be alternative approaches that would make sense for this problem as well (from what I can tell, possibly exposing a global option for the source file / path that could be fed to a service object).

For reference, WIP branch for the general approach I've taken so far here

Is it the case that these options need to be applied "across" scopes (ie, to multiple Tasks at once)? The example that you referenced here: https://github.com/pantsbuild/pants/issues/6902#issuecomment-446348080 ... looks like the configuration is still "per Task" or per scope.

If this is per scope, rather than using a global subsystem, it feels like you should be able to use the Subsystem that you added in #7275 to add regex support. That would invert the model from the linked comment... so the config (as it would be specified in pants.ini) would look more like:

[target-filter]
target_restrictions: {
    'blacklist': [<target regexes>], # Targets ignored by all tasks that support this subsystem
    ..,
  }

[target-filter.lint.scalastyle]
target_restrictions: {
    'blacklist': [<target regexes>], # Targets ignored by just the scalastyle task
    'whitelist': # etc
  }

@stuhood I think this question is orthogonal to the specific use-case of #7275 .

That change's subsystem provides the answer to "which target tags should cause me to skip the target". Whereas this change provides a way of tagging targets in a centralized way, without having to laboriously edit every relevant BUILD file.

True, #7275 is the motivating use-case, but there could easily be others. For example, a single mapping that says "all targets whose name ends with _integration should have the tag integration" could replace all our tags = {'integration'} boilerplate.

@benjyw : Ok, I think I see what you're getting at. Expanding the description with more of what we're trying to accomplish would be helpful then I think, because the goal was not clear to me. It sounds like it's approximately: "there is a global subsystem that can _add_ tags to targets based on regexes (?)".

Describing one or two usecases for this would be good too, because the _integration usecase can be satisfied in multiple ways.

True, specific use-cases can be satisfied in multiple ways. But I think it's better if we don't have to solve each case ad-hoc.

The general design here is: there can be K ways of adding tags to a target (e.g., inline, or in a central .json file, or ), and M uses of target tags (e.g., blacklisting unlintable targets, not running integration tests, or ).

Then we do K+M work instead of K*M work, yet get the added benefit of uniformity.

Kubernetes does something similar. You can apply arbitrary key-value labels to pods, and then select on those labels for various actions. https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/

@codealchemy can you edit your posts on this issue to clarify?

But back to the technical question: We'd like to have a subsystem that provides a .json file mapping tags to target addresses (or target address regexes). So we're trying to figure out how to get that into LegacyPythonCallbacksParser's ctor, because that's created before we have full options.

I'm hoping that we can just create it lazily, since it's not needed in the bootstrap sequence itself.

So, including these in the parser is one approach. I think that if you make lookup of the Subsystem instance lazy, that would work.

A few approaches

  1. Make lookup of the global subsystem lazy (as you mentioned), and then inject at parse time.
  2. Inject tags one level up in the @rules that hydrate "structs": https://github.com/pantsbuild/pants/blob/master/src/python/pants/engine/build_files.py#L138-L187 .. as I said, the naming is terrible. But: a struct "always" represents the un-instantiated placeholder for a target, and so you could include additional kwargs there.
  3. Inject tags when we're actually instantiating v1 Target objects here: https://github.com/pantsbuild/pants/blob/master/src/python/pants/engine/legacy/graph.py#L147-L172

Of these options, the first and second ones are cleanest, and more forwards compatible (they work across both v1 and v2). The third option (injecting in BuildGraph) will _only_ work for v1, because v2 does not create a BuildGraph.

Unfortunately, it looks like there are some minor blockers to taking the second approach (related to #6880: we don't actually inject the OptionsBootstrapper in the codepath that hydrates targets for v1). So I think either one or three would be paths forward for now (possibly with a TODO mentioning consuming the tags via optionable_rule).

@stuhood it looks like making the lookup lazy runs into the same issue in option 1 and 3 (seen running ./pants2 test tests/python/pants_test/build_graph:build_graph):


Ex. test error from option 1

self = <pants_test.build_graph.test_build_graph.BuildGraphTest testMethod=test_transitive_subgraph_of_addresses_bfs_predicate>

    def test_transitive_subgraph_of_addresses_bfs_predicate(self):
      root = self.inject_graph('a', {
        'a': ['b', 'c'],
        'b': ['d', 'e'],
>       'c': [], 'd': [], 'e': [],
      })

.pants.d/pyprep/sources/3842a10025edd42c67b413781e754c8e6620a39d/pants_test/build_graph/test_build_graph.py:257:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
.pants.d/pyprep/sources/3842a10025edd42c67b413781e754c8e6620a39d/pants_test/build_graph/test_build_graph.py:44: in inject_graph
    self.build_graph.inject_address_closure(root_address)
.pants.d/pyprep/sources/3842a10025edd42c67b413781e754c8e6620a39d/pants/engine/legacy/graph.py:206: in inject_address_closure
    self.inject_addresses_closure([address])
.pants.d/pyprep/sources/3842a10025edd42c67b413781e754c8e6620a39d/pants/engine/legacy/graph.py:214: in inject_addresses_closure
    for _ in self._inject_specs(specs):
.pants.d/pyprep/sources/3842a10025edd42c67b413781e754c8e6620a39d/pants/engine/legacy/graph.py:271: in _inject_specs
    subjects)
/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py:35: in __exit__
    self.gen.throw(type, value, traceback)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <pants.engine.legacy.graph.LegacyBuildGraph object at 0x1042aba90>

    @contextmanager
    def _resolve_context(self):
      try:
        yield
      except Exception as e:
        raise AddressLookupError(
>         'Build graph construction failed: {} {}'.format(type(e).__name__, str(e))
        )
E       AddressLookupError: Build graph construction failed: ExecutionError 1 Exception encountered:
E       Computing Select(Specs(dependencies<Exactly(tuple)>=(SingleAddress(directory=u'a', name=u'a'),), matcher<Exactly(SpecsMatcher)>=SpecsMatcher(tags<Exactly(tuple)>=(), exclude_patterns<Exactly(tuple)>=())), Exactly(TransitiveHydratedTargets))
E         Computing Task(transitive_hydrated_targets, Specs(dependencies<Exactly(tuple)>=(SingleAddress(directory=u'a', name=u'a'),), matcher<Exactly(SpecsMatcher)>=SpecsMatcher(tags<Exactly(tuple)>=(), exclude_patterns<Exactly(tuple)>=())), Exactly(TransitiveHydratedTargets), true)
E           Computing Task(addresses_from_address_families, Specs(dependencies<Exactly(tuple)>=(SingleAddress(directory=u'a', name=u'a'),), matcher<Exactly(SpecsMatcher)>=SpecsMatcher(tags<Exactly(tuple)>=(), exclude_patterns<Exactly(tuple)>=())), Exactly(BuildFileAddresses), true)
E             Computing Task(parse_address_family, Dir(path=a), Exactly(AddressFamily), true)
E               Throw(Failed to parse a/BUILD:
E       Subsystem "TargetTagDefinitions" not initialized for scope "target-tag-definitions". Is subsystem missing from subsystem_dependencies() in a task? )
E                 Traceback (most recent call last):
E                   File "/Users/schmitt/Workspace/pants/.pants.d/pyprep/sources/3842a10025edd42c67b413781e754c8e6620a39d/pants/engine/native.py", line 422, in extern_generator_send
E                     res = c.from_value(func[0]).send(c.from_value(arg[0]))
E                   File "/Users/schmitt/Workspace/pants/.pants.d/pyprep/sources/3842a10025edd42c67b413781e754c8e6620a39d/pants/engine/build_files.py", line 61, in parse_address_family
E                     address_mapper.parser))
E                   File "/Users/schmitt/Workspace/pants/.pants.d/pyprep/sources/3842a10025edd42c67b413781e754c8e6620a39d/pants/engine/mapper.py", line 55, in parse
E                     raise MappingError('Failed to parse {}:\n{}'.format(filepath, e))
E                 MappingError: Failed to parse a/BUILD:
E                 Subsystem "TargetTagDefinitions" not initialized for scope "target-tag-definitions". Is subsystem missing from subsystem_dependencies() in a task?

.pants.d/pyprep/sources/3842a10025edd42c67b413781e754c8e6620a39d/pants/engine/legacy/graph.py:238: AddressLookupError


Ex. test error from option 3

self = <pants_test.build_graph.test_build_graph.BuildGraphTest testMethod=test_transitive_subgraph_of_addresses_bfs_predicate>

    def test_transitive_subgraph_of_addresses_bfs_predicate(self):
      root = self.inject_graph('a', {
        'a': ['b', 'c'],
        'b': ['d', 'e'],
>       'c': [], 'd': [], 'e': [],
      })

.pants.d/pyprep/sources/405972c62321192a42f786a306a3b8089243e7b4/pants_test/build_graph/test_build_graph.py:257:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
.pants.d/pyprep/sources/405972c62321192a42f786a306a3b8089243e7b4/pants_test/build_graph/test_build_graph.py:44: in inject_graph
    self.build_graph.inject_address_closure(root_address)
.pants.d/pyprep/sources/405972c62321192a42f786a306a3b8089243e7b4/pants/engine/legacy/graph.py:211: in inject_address_closure
    self.inject_addresses_closure([address])
.pants.d/pyprep/sources/405972c62321192a42f786a306a3b8089243e7b4/pants/engine/legacy/graph.py:219: in inject_addresses_closure
    for _ in self._inject_specs(specs):
.pants.d/pyprep/sources/405972c62321192a42f786a306a3b8089243e7b4/pants/engine/legacy/graph.py:278: in _inject_specs
    self._index(thts.closure)
.pants.d/pyprep/sources/405972c62321192a42f786a306a3b8089243e7b4/pants/engine/legacy/graph.py:100: in _index
    new_targets.append(self._index_target(target_adaptor))
.pants.d/pyprep/sources/405972c62321192a42f786a306a3b8089243e7b4/pants/engine/legacy/graph.py:133: in _index_target
    target = self._instantiate_target(target_adaptor)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <pants.engine.legacy.graph.LegacyBuildGraph object at 0x112407b10>
target_adaptor = TargetAdaptor(address=a)

    def _instantiate_target(self, target_adaptor):
      """Given a TargetAdaptor struct previously parsed from a BUILD file, instantiate a Target.

        TODO: This assumes that the SymbolTable used for parsing matches the SymbolTable passed
        to this graph. Would be good to make that more explicit, but it might be better to nuke
        the Target subclassing pattern instead, and lean further into the "configuration composition"
        model explored in the `exp` package.
        """
      target_cls = self._target_types[target_adaptor.type_alias]
      try:
        # Pop dependencies, which were already consumed during construction.
        kwargs = target_adaptor.kwargs()
        kwargs.pop('dependencies')

        # Append any globally defined tags to the target
        kwargs.setdefault('tags', set())
        kwargs['tags'].update(self._target_tag_definitions().tags_for(target_adaptor.address.spec))

        # Instantiate.
        if issubclass(target_cls, AppBase):
          return self._instantiate_app(target_cls, kwargs)
        elif target_cls is RemoteSources:
          return self._instantiate_remote_sources(kwargs)
        return target_cls(build_graph=self, **kwargs)
      except TargetDefinitionException:
        raise
      except Exception as e:
        raise TargetDefinitionException(
            target_adaptor.address,
>           'Failed to instantiate Target with type {}: {}'.format(target_cls, e))
E       TargetDefinitionException: Invalid target BuildFileAddress(a/BUILD, a): Failed to instantiate Target with type <class 'pants.build_graph.target.Target'>: Subsystem "TargetTagDefinitions" not initialized for scope "target-tag-definitions". Is subsystem missing from subsystem_dependencies() in a task?

.pants.d/pyprep/sources/405972c62321192a42f786a306a3b8089243e7b4/pants/engine/legacy/graph.py:177: TargetDefinitionException

For reference, same working branch (https://github.com/codealchemy/pants/commit/3f7c1086e3670178ceaaf0bcf7f109030707a9b4 and https://github.com/codealchemy/pants/commit/495b2a594a72158ad4e94a9b58f9e0ad96366c15, respectively)

IIUC it looks like the build graph is parsed there before we have subsystems as well, is that right?

@codealchemy : It occurs to me that there is an existing subsystem that you can either:

  1. model after
  2. just use!

https://github.com/pantsbuild/pants/blob/3a81e86eca33bcd16ca68838052ffeb25fdcc803/src/python/pants/build_graph/target.py#L88-L101

But actually, it's possible that that subsystem is only consumed within BuildGraph parsing... I think that to guarantee that a Subsystem is loaded, as early as possible, you'd need to include it here: https://github.com/pantsbuild/pants/blob/3a81e86eca33bcd16ca68838052ffeb25fdcc803/src/python/pants/init/options_initializer.py#L85-L92

The new subsystem was added to GlobalSubsystems but still wasn't available in the parser. Are we sure Arguments actually works...? :)

Was this page helpful?
0 / 5 - 0 ratings