TPOT: distributed environment

Created on 21 Jun 2016 · 6 comments · Source: EpistasisLab/tpot

Are there any plans to allow TPOT to be used in a distributed environment?

question

All 6 comments

Eventually, yes. There's been discussion of using Dask to parallelize TPOT (cc @tonyfast). We've also been thinking about PySpark for parallel cloud computing. However, we're still focused on getting the core algorithm and tool finished before we really work our way into scaling to distributed environments.

My common use case is parallel cloud computing, and I think TPOT has to scale before it can be useful on any interesting dataset.
I would consider leaving the choice between Dask and PySpark to the user, who knows their own use case best.

I'd love better parallel processing on a single machine. I feel sad when I see 3 cores at 0% and 1 at 100%.

@minimumnz: it's fairly easy to make that change - https://github.com/teaearlgraycold/tpot/tree/parallelize

But TPOT itself likely won't have local parallelization until cluster support is also added, since it'd be much nicer to have both cases covered by one library.
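To illustrate what "both cases covered by one library" could look like, here is a minimal sketch that scores a population through the standard `concurrent.futures.Executor` interface. It is not TPOT code: `evaluate_pipeline`, `evaluate_population`, `evaluate_locally`, and the `n_steps` field are hypothetical names made up for this example, and the scoring function is toy CPU-bound work standing in for a real fit-and-score step.

```python
from concurrent.futures import Executor, ProcessPoolExecutor


def evaluate_pipeline(candidate):
    """Hypothetical stand-in for fitting and scoring one candidate pipeline."""
    # Toy CPU-bound work; a real evaluation would run cross-validation here.
    return sum(i * i for i in range(candidate["n_steps"] * 1000)) % 97


def evaluate_population(population, executor: Executor):
    """Score every candidate using whatever executor the caller supplies."""
    # Executor.map preserves input order, so scores line up with candidates.
    return list(executor.map(evaluate_pipeline, population))


def evaluate_locally(population, workers=4):
    """Single-machine case: spread the evaluations over worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return evaluate_population(population, pool)
```

Because `evaluate_population` only depends on the `Executor` interface, the same call could in principle be handed a cluster-backed executor instead of a local process pool, which is the design the comment above argues for.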

May I ask what the current priority level is for using distributed computing libraries (ideally Dask, which comes with caching) in TPOT? I think that's vital for such a project to be usable in the real world, and it should be orthogonal to the "core" branch development.

I think that representing the whole population of pipelines as one large Dask graph would be a good start. Then caching of intermediate results (with the current _development_ branch I'm spending most of the computing time recalculating the same xgboost!), multicore, and multi-server support would go hand in hand.
Any chance of reopening this issue?
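The caching idea can be sketched without Dask. The following is a plain-Python illustration, not TPOT or Dask code: pipelines are written as tuples of step names, `run_step` and `evaluate_with_prefix_cache` are hypothetical helpers, and steps are assumed to be deterministic so that a shared prefix can safely be computed once and reused. In a task graph the same sharing would show up as common nodes between pipelines.

```python
def run_step(name, data):
    """Hypothetical stand-in for fitting/applying one pipeline step."""
    # Returns a new list so cached intermediate results are never mutated.
    return data + [name]


def evaluate_with_prefix_cache(pipelines, data):
    """Evaluate pipelines (tuples of step names), reusing shared prefixes."""
    cache = {(): data}  # prefix of steps -> intermediate result
    calls = 0           # number of steps actually executed
    results = {}
    for pipeline in pipelines:
        for end in range(1, len(pipeline) + 1):
            prefix = pipeline[:end]
            if prefix not in cache:
                # Build on the cached result of the prefix one step shorter.
                cache[prefix] = run_step(prefix[-1], cache[prefix[:-1]])
                calls += 1
        results[pipeline] = cache[pipeline]
    return results, calls
```

With the three pipelines `("scale", "pca", "xgboost")`, `("scale", "pca", "logreg")`, and `("scale", "xgboost")`, this executes 5 steps instead of the 8 that independent evaluation would need, because `scale` and `scale -> pca` are each computed only once.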

I agree. Can you file a separate issue and list the possible options?
