Pants: Re-enable concurrent runs for pantsd in v2

Created on 3 May 2019  ·  11 Comments  ·  Source: pantsbuild/pants

pantsd currently prevents concurrent runs because of global mutable singletons that were pervasive in v1. In v2, there are none of these left, and so only concurrent access to stdio needs to be managed.

We would like to allow concurrent runs under pantsd without the use of the PANTS_CONCURRENT=True flag (i.e., that flag would become a no-op and then be deprecated).


A sketch of what this will likely involve, in at least four PRs:

  1. ~Move to the rust nailgun client, to avoid needing to implement any client-side logic twice.~ Fixed in #11147.

    • Will involve re-landing #10865.
  2. ~cancellation needs to move to heartbeat-based~

    • ~Add a "canceled" bool/sync primitive (probably a watch) to Session, and propagate it in from the read half of the connection closing.~
    • ~Client closes the write half of the socket, then waits for the server to close the other half (might require additional support in nails).~
    • ~Cancellation bool consumed in all relevant places:~

      • ~InteractiveProcess: should move to spawning the process and then asynchronously waiting for completion.~

      • ~Graph: should cancel ongoing work if the client/Session that started it goes away (and let any existing clients restart it).~

    • Document this new behavior of the pantsd lifecycle somewhere.
  3. remaining singletons need to be located and fixed

    • ~All access to stdio in both the rust code and the python code should be replaced with access to Session-specific files, and the sys.std* file handles should be closed, poisoned, or replaced with synthetic thread-local files (à la).~

      • This will also require fixing #11398.

    • ~All remaining usages of Subsystem.global_instance should be removed.~
  4. allow concurrent runs in DaemonPantsRunner

    • Make PantsDaemonCore a container for multiple Schedulers (rather than the current singular Scheduler) to allow for concurrent access with different options. Also, consider porting the concurrent Scheduler management to Rust?
    • Remove the DaemonPantsRunner._one_run_at_a_time lock.
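
A minimal sketch of the idea in item 4, assuming Schedulers can be keyed by some fingerprint of their options. `Scheduler` and `SchedulerPool` here are hypothetical stand-ins for illustration, not the actual Pants classes. The lock guards only the lookup table, so separate runs are free to proceed concurrently once each has its Scheduler.

```python
import threading
from dataclasses import dataclass
from typing import Callable, Dict, Hashable


@dataclass
class Scheduler:
    """Hypothetical stand-in for a Pants Scheduler; it only records its options key."""

    options_fingerprint: Hashable


class SchedulerPool:
    """Hypothetical container holding one Scheduler per distinct options fingerprint."""

    def __init__(self, factory: Callable[[Hashable], Scheduler]) -> None:
        self._factory = factory
        self._schedulers: Dict[Hashable, Scheduler] = {}
        # Guards only the dict, not the runs themselves, so runs can overlap.
        self._lock = threading.Lock()

    def get(self, options_fingerprint: Hashable) -> Scheduler:
        """Return the Scheduler for these options, creating it on first use."""
        with self._lock:
            scheduler = self._schedulers.get(options_fingerprint)
            if scheduler is None:
                scheduler = self._factory(options_fingerprint)
                self._schedulers[options_fingerprint] = scheduler
            return scheduler


pool = SchedulerPool(Scheduler)
a = pool.get("opts-a")  # first run with one set of options
b = pool.get("opts-b")  # concurrent run with different options
```

Runs with identical options would share a Scheduler (and its warm state), while runs with different options get their own. A real implementation would presumably also need eviction, which this sketch leaves out.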
Q42020-idea pantsd

All 11 comments

Got another request for this. Possibly worth tackling pre-2.0 in case there are API changes necessary? Unclear.

Getting ready to pick this back up! Updated the description with a sketch of the likely implementation steps.

This is still a priority, but it is paused until the impacts of #11223 have stabilized: in particular, until #11252 is fixed.

#11241 is a blocker here, but it will need a deprecation for Subsystem.global_instance(), and a replacement for Subsystem.get_streaming_workunit_callbacks.

v1 Subsystem facilities were removed in #11424. Next up are the Python-level sys globals, including stdio. This is still on track to make it into a release before the end of the month.

I'll be resuming this tomorrow morning.

I'm exploring potential strategies for stdio replacement, but the most promising is probably lifting and solidifying the existing NativeWriter implementation to use it as pantsd's sys.stdout/stderr with a Session-set threadlocal variable underneath. env and args we can likely just poison, so that no one tries to consume them directly.
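
To illustrate the thread-local approach described above, here is a rough Python-only sketch. It is not the NativeWriter implementation; `ThreadLocalStdout`, `set_destination`, and `run` are hypothetical names. The object installed as `sys.stdout` forwards each write to whatever destination the current thread has set, so two runs on different threads do not interleave output.

```python
import io
import sys
import threading


class ThreadLocalStdout:
    """Hypothetical file-like object that forwards writes to a per-thread destination."""

    def __init__(self, fallback):
        self._fallback = fallback
        self._local = threading.local()

    def set_destination(self, dest):
        # Called by whatever starts a run on this thread (a "Session" in the issue's terms).
        self._local.dest = dest

    def write(self, data):
        # Threads that never set a destination fall back to the original stream.
        return getattr(self._local, "dest", self._fallback).write(data)

    def flush(self):
        getattr(self._local, "dest", self._fallback).flush()


def run(router, name, results):
    """Simulate one run: bind this thread's output, print, and record what was captured."""
    buf = io.StringIO()
    router.set_destination(buf)
    print(f"hello from {name}")  # lands in this thread's buffer only
    results[name] = buf.getvalue()


original = sys.stdout
router = ThreadLocalStdout(fallback=original)
sys.stdout = router
results = {}
try:
    threads = [
        threading.Thread(target=run, args=(router, name, results))
        for name in ("run-1", "run-2")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
finally:
    # Always restore the real stdout, even if a run fails.
    sys.stdout = original
```

This only covers Python-level writes through `sys.stdout`. Code that writes to the underlying file descriptor directly, including forked child processes, bypasses it.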

Have (finally! sorry!) picked this back up again today. Have begun extracting thread/task-local IO destinations into a crate independent of logging so that raw stdio can be sent to a thread/task-local replaceable destination.

I've also determined that I'll need to follow up on https://github.com/mitsuhiko/console/issues/34 in order to allow the UI to write to a specific Write handle rather than to a hardcoded std::io::{stdout,stderr}.

I'm hoping to be able to finish item 2 from the description this week, and to tackle the final work of item 3 next week.

Thread-local stdio should now be fully implemented, but my draft #11536 managed to (very early in the build process!) find one more place where we fork processes: the PluginResolver code.

I spent today preparing to use a "bootstrap" Scheduler (as described in #10360) to replace the in-process usage of Pex: https://github.com/pantsbuild/pants/blob/945c010d577a506c3b455e04a0b6165d3d5ec78f/src/python/pants/init/plugin_resolver.py#L100-L110 ... and should be able to get a PR out early next week to do that.

More progress on this. A draft of the plugin resolution change is out at #11568. Unfortunately, I'm out for the next 24 hours, and will have some planning work to do next week: am hoping to get both #11568 and #11536 green before the end of the week, but if not this might slip a bit further.

Sorry for all of the delays!

Hey folks! In the homestretch on #11536, with two test failures and a console-width-reporting issue to look into. Very optimistic that it will land next week. Once it does, I'm optimistic that the final PR for this change will be significantly smaller. Thanks for your patience.

#11536 landed yesterday, and appears stable so far. I've drafted the change to enable concurrent runs, and so far it looks good! Very optimistic that this will be able to land before the end of the week.
