Following up on a conversation in https://github.com/mrdoob/three.js/pull/18123, this is a proposal for a utility class for managing tasks distributed to Web Workers. Currently BasisTextureLoader, DRACOLoader, and OBJLoader2 implement redundant logic for that purpose.
A full-featured library for this purpose could easily get pretty complex. I'm hoping that we can write something fairly lightweight for internal use by threejs examples. A more robust version could probably be a standalone library, or part of ECSY, but that's beyond the scope I personally want to attempt.
Proposed API:
interface TaskManager {
  /** Returns true if support for the given task type is available. */
  supportsType( type: string ): boolean;
  /** Registers functionality for a new task type. */
  registerType( type: string, init: Function, execute: Function ): TaskManager;
  /** Provides initialization configuration and dependencies for all tasks of given type. */
  initType( type: string, config: object, transfer: Transferable[] ): TaskManager;
  /** Queues a new task of the given type. Task will not execute until initialization completes. */
  addTask( type: string, cost: number, config: object, transfer: Transferable[] ): Promise<any>;
  /** Destroys all workers and associated resources. */
  dispose(): TaskManager;
}
Use in DRACOLoader:
const DRACO_DECODE = 'draco/decode';
class DRACOLoader {
  constructor ( loadingManager: LoadingManager, taskManager: TaskManager ) {
    this.loadingManager = loadingManager || DefaultLoadingManager;
    this.taskManager = taskManager || DefaultTaskManager;
    // Register and initialize the Draco task type only once per TaskManager.
    if ( ! this.taskManager.supportsType( DRACO_DECODE ) ) {
      this.taskManager
        .registerType( DRACO_DECODE, taskInit, taskExecute )
        .initType( DRACO_DECODE, decoderConfig, decoderTransfers );
    }
  }
  // async: the body awaits the fetch before queueing the task.
  async load ( url, onLoad, onProgress, onError ) {
    const data = await fetch( url ).then( r => r.arrayBuffer() );
    const cost = data.byteLength;
    const config = { data, ... };
    this.taskManager
      .addTask( DRACO_DECODE, cost, config, [ data ] )
      .then( onLoad ).catch( onError );
  }
}
These two functions are passed to the TaskManager, and their function bodies are copied into the Web Worker (without surrounding context).
// Sets up state before a worker begins taking Draco tasks.
function taskInit ( config: object ): Promise<void> {
  // do async setup
  return Promise.resolve();
}
// Executes on a worker for each Draco task.
function taskExecute ( taskConfig: object ): Promise<any> {
  // do expensive processing
  return Promise.resolve( result );
}
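A minimal sketch of how function bodies could be copied into a worker, assuming the TaskManager serializes both functions with `Function.prototype.toString()` and wraps them in a small message handler. The helper name `buildWorkerSource` and the message shape are hypothetical, not part of the proposal.

```typescript
// Hypothetical helper: serializes two functions into standalone worker source.
// Only the function text is copied, so neither function may reference
// variables from its enclosing scope.
type TaskFn = (config: object) => Promise<unknown>;

function buildWorkerSource(init: TaskFn, execute: TaskFn): string {
  return [
    `const init = ${init.toString()};`,
    `const execute = ${execute.toString()};`,
    // Assumed message shape: { cmd: 'init' | 'execute', id, config }.
    `self.onmessage = async (event) => {`,
    `  const { cmd, id, config } = event.data;`,
    `  const result = await (cmd === 'init' ? init(config) : execute(config));`,
    `  self.postMessage({ id, result });`,
    `};`,
  ].join('\n');
}

function taskInit(_config: object): Promise<void> {
  return Promise.resolve();
}

function taskExecute(taskConfig: object): Promise<unknown> {
  // Stand-in for expensive processing.
  return Promise.resolve(Object.keys(taskConfig).length);
}

const workerSource = buildWorkerSource(taskInit, taskExecute);

// In a browser, the source could then be turned into a worker:
//   const blob = new Blob([workerSource], { type: 'application/javascript' });
//   const worker = new Worker(URL.createObjectURL(blob));
```

This is also why the two functions cannot rely on closures or imports from the loader module: whatever they need has to arrive through `config` or be loaded inside the worker.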
The TaskManager class then takes on the responsibilities of:
/cc @kaisalmen
@donmccurdy regarding the question of ES6 module support in web workers:
Some preliminary thoughts:
TaskManager should cache a Worker of a specific type, so it can be directly re-used
[...] OBJLoader2)
Demo idea:
I feel it's not a task "manager" as such, it's more like a task factory. Or some kind of execution framework. But it does seem nice to be able to unify this kind of functionality.
Personally I would go for something that allowed a bit more functionality, such as dependencies for instance. Things like being able to abort the task. Not necessarily right now, but to have a provision to be able to add those things in the future. As things stand - I don't see that being possible, except via config, but that is kinda messy.
Maybe have a minimalist abstraction "Task" with something like "promise" or "completion" property as a Promise. You can then tack things like dependencies and various task-specific functionalities onto it.
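Something along these lines, as a rough sketch only. The `Task` class, its `dependsOn`, `ready` and `abort` members, and the `Completion` interface are all made-up names for illustration; nothing here exists in the proposal yet.

```typescript
// Anything that exposes a promise can act as a dependency.
interface Completion {
  readonly promise: Promise<unknown>;
}

// Hypothetical minimal Task: a promise plus room for extras such as
// dependencies and aborting.
class Task<T> implements Completion {
  readonly type: string;
  readonly cost: number;
  readonly promise: Promise<T>;
  readonly dependencies: Completion[] = [];
  aborted = false;

  private resolveFn!: (value: T) => void;
  private rejectFn!: (reason: Error) => void;

  constructor(type: string, cost: number) {
    this.type = type;
    this.cost = cost;
    // The executor runs synchronously, so both callbacks are set here.
    this.promise = new Promise<T>((resolve, reject) => {
      this.resolveFn = resolve;
      this.rejectFn = reject;
    });
  }

  dependsOn(other: Completion): this {
    this.dependencies.push(other);
    return this;
  }

  // Resolves once every dependency has completed.
  ready(): Promise<void> {
    return Promise.all(this.dependencies.map((d) => d.promise)).then(() => undefined);
  }

  complete(value: T): void {
    if (!this.aborted) this.resolveFn(value);
  }

  abort(): void {
    this.aborted = true;
    this.rejectFn(new Error(`Task "${this.type}" aborted`));
  }
}

const decode = new Task<ArrayBuffer>('draco/decode', 1024);
const build = new Task<string>('example/build', 1).dependsOn(decode);
```

A manager could then hand out `Task` objects from `addTask` instead of bare promises, and callers who only care about the result would keep using `task.promise`.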
> Personally I would go for something that allowed a bit more functionality, such as dependencies for instance. Things like being able to abort the task.
With regard to dependencies and the management of more complex code or execution functions, support for jsm in workers will make things easier in the future. Chrome 80+ (https://www.chromestatus.com/feature/5761300827209728) is supposed to support this without flags, but it is broken again in Chrome Dev 81 / Canary 81. I saw this working nicely on Windows already. Basically, you can now run the same code in the same way, as long as you obey the general limitations of workers. Porting existing code (like other loaders) to workers will be a lot easier, I think.
@donmccurdy and I agreed to start development on a branch here and not in a separate repo. DRACOLoader and OBJLoader2 are the first candidates for testing this new feature, but it is of course not limited to them.
> TaskManager should cache a Worker of a specific type, so it can be directly re-used
I was hoping TaskManager could be pre-initialized with a set of possible tasks, then spin up N workers each capable of doing any of them. This would make it easier to balance work across many task types, within the browser's worker count limits. It may be a little tricky to set up.
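To make the balancing idea concrete, here is a rough sketch that assumes every worker can run every task type and that the `cost` passed to `addTask` is the only scheduling signal. `CostBalancer` and `WorkerSlot` are hypothetical names, and the real Worker handles are left out.

```typescript
// Hypothetical bookkeeping for one worker; a real pool would also hold the Worker.
interface WorkerSlot {
  id: number;
  load: number; // sum of the costs of tasks currently assigned
}

class CostBalancer {
  private slots: WorkerSlot[];

  constructor(workerCount: number) {
    if (workerCount < 1) throw new RangeError('workerCount must be at least 1');
    this.slots = Array.from({ length: workerCount }, (_, id) => ({ id, load: 0 }));
  }

  // Assigns a task to the worker with the lowest outstanding cost.
  // Ties go to the worker with the lower id.
  assign(cost: number): WorkerSlot {
    const slot = this.slots.reduce((a, b) => (b.load < a.load ? b : a));
    slot.load += cost;
    return slot;
  }

  // Called when a task finishes, releasing its cost.
  release(slot: WorkerSlot, cost: number): void {
    slot.load -= cost;
  }
}

const balancer = new CostBalancer(2);
const first = balancer.assign(100); // both idle, so worker 0
const second = balancer.assign(10); // worker 1 is idle
const third = balancer.assign(10);  // worker 1 again: load 10 < 100
```

With one large Draco decode and several small tasks, the small ones would pile onto the second worker instead of waiting behind the large one.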
> With regard to dependencies and management of more complex code or execution functions support for jsm in workers will make things easier in the future
I hope so, but I worry it will be hard to make this compatible with bundlers. Maybe we can start simply and build up dependency support when needed. In any case there will need to be some way to ship (1) the core JS logic of the task, and (2) WASM dependencies. It won't be hard to _allow_ ES module imports, but loaders provided by threejs may need to avoid using that feature, for portability.
> DRACOLoader and OBJLoader2 are first candidates for testing this new feature
I'm also in the process of writing KTX2Loader (https://github.com/mrdoob/three.js/pull/18490), which will need it.
> Maybe have a minimalist abstraction "Task" with something like "promise" or "completion" property as a Promise
I like this. 👍
I'd also like to allow the possibility of setting .maxWorkers = 0 to run tasks in the main thread, if we can. This will make it easier for users on NodeJS, or with a Content Security Policy that prevents us from creating workers, to still use these loaders (more slowly).
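Roughly what I have in mind for the fallback, as a sketch only. `shouldRunOnMainThread`, `runTask` and the `dispatchToWorker` callback are hypothetical names; the only assumption is that the same `execute` function can be called directly when no worker is used.

```typescript
type Runner<C, R> = (config: C) => Promise<R>;

// True when the user opted out of workers or the environment has none.
function shouldRunOnMainThread(maxWorkers: number): boolean {
  const workerCtor = (globalThis as { Worker?: unknown }).Worker;
  return maxWorkers === 0 || typeof workerCtor === 'undefined';
}

// Runs the task inline on the main thread, or hands it to a worker.
function runTask<C, R>(
  maxWorkers: number,
  config: C,
  execute: Runner<C, R>,
  dispatchToWorker: Runner<C, R>,
): Promise<R> {
  return shouldRunOnMainThread(maxWorkers) ? execute(config) : dispatchToWorker(config);
}

// With maxWorkers = 0 the worker path is never taken.
const pending = runTask(
  0,
  { n: 21 },
  (c: { n: number }) => Promise.resolve(c.n * 2),
  () => Promise.reject(new Error('no workers in this sketch')),
);
```

The caller still gets a Promise either way, so loaders would not need a separate code path for the fallback.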
I'll try to start a branch fairly soon, based mostly on a bit more abstraction around what DRACOLoader does today...
> I hope so, but I worry it will be hard to make this compatible with bundlers. Maybe we can start simply and build up dependency support when needed.
Yes, jsm in workers should not be a requirement. I consider it an extra option.
> I'd also like to allow the possibility of setting .maxWorkers = 0 to run tasks in the main thread, if we can.
Yes, a fallback option is a good idea.
I am in the process of establishing a three.js-tuesday-evening, so my response time/progress reporting becomes a little more predictable.
Life sign from me. I understood why I broke the obj2 jsm worker exec. A PR is now available: #18886. I want to have both code paths (legacy worker and jsm worker) supported here from the beginning, if possible. I thought there was a bigger issue, but there is none: relative locations must be expressed correctly. 😳
Hey, I finally started working on the implementation.
What do you think about adding a third function to registerType that handles communication in the worker, i.e. calling init and execute and afterwards messaging the result back:
registerType( type: string, init: Function, execute: Function, manageCom: Function ): TaskManager;
This way you can separate the execution logic from the worker communication needs. In OBJLoader2Parallel the com-implementation is very heavy, but the obj parser is completely free of worker specific code. In DRACOLoader the worker feedback code and parsing logic is mixed. This could be the middle way.
The first goal is a simple prototype example that captures a complete tour through all functions of TaskManager.
I did some initial work a bit ago but haven't looked at it in a month or so, see https://github.com/mrdoob/three.js/compare/dev...donmccurdy:feat-taskmanager. It works with a simple test case (see the unit test) and with DRACOLoader so far, based on init and execute functions attached to the task object.
The word "manage" doesn't mean much to me... when is manageCom called, with what? If OBJ parsing requires sending incremental progress updates, perhaps the execute function should have access to something that can send those?
> The word "manage" doesn't mean much to me... when is manageCom called, with what?
Sorry, this was too unspecific and the wrong term. My idea is to (optionally) encapsulate all Worker <-> Main communication in this "comRouter" function. The OBJLoader2Parser has callback functions (mesh ready, parse done), which allow the same unaltered code to be used inside and outside the worker. The "comRouter" delegates the "init" and "execute" messages to the corresponding functions and transports (intermediate) results back.
But, this is optional and should not contradict your initial proposal. Hope this made my idea clearer.
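A rough sketch of the idea, not the prototype code. The message shapes, `createComRouter` and the `report` callback are assumptions made for illustration; the point is only that `init` and `execute` stay free of worker-specific code.

```typescript
// Assumed message shapes; none of these names come from the prototype.
type Incoming = { cmd: 'init' | 'execute'; id: number; config: object };
type Outgoing = { id: number; kind: 'progress' | 'done'; payload: unknown };

type Init = (config: object) => Promise<void>;
type Execute = (config: object, report: (partial: unknown) => void) => Promise<unknown>;

// Builds the router: it owns all messaging and delegates the actual work.
function createComRouter(init: Init, execute: Execute, post: (msg: Outgoing) => void) {
  return async (msg: Incoming): Promise<void> => {
    if (msg.cmd === 'init') {
      await init(msg.config);
      post({ id: msg.id, kind: 'done', payload: null });
      return;
    }
    // Intermediate results (e.g. "mesh ready") go back through report().
    const report = (partial: unknown) =>
      post({ id: msg.id, kind: 'progress', payload: partial });
    const result = await execute(msg.config, report);
    post({ id: msg.id, kind: 'done', payload: result });
  };
}

// Usage with a parser-like execute function that knows nothing about workers.
const posted: Outgoing[] = [];
const route = createComRouter(
  () => Promise.resolve(),
  (_config, report) => {
    report('mesh ready');
    return Promise.resolve('parse done');
  },
  (msg) => { posted.push(msg); },
);
void route({ cmd: 'execute', id: 1, config: {} });
// Inside a worker, post would be self.postMessage and route would be
// called from self.onmessage with event.data.
```

Outside a worker, the same `execute` function can be called directly with a plain callback as `report`, which is how the parser code stays unaltered in both cases.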
Edit: Just had a look at your code. Your TaskWorker is handling the communication (what I call "comRouter"), correct? You glue all the code of the different tasks into one big piece that becomes the Worker.
I am able to make all eight logical cores 100% busy: one simple worker (10^8 additions in a for loop), 8 instances, 1000 added tasks. I have not modified any loader code, because I wanted to get the concept straight first. This is still WIP. Sorry, things are moving very slowly, but spare time is rarer than usual for me these days.
There is finally something to look at:
https://raw.githack.com/kaisalmen/three.js/TaskManagerProto/examples/webgl_loader_taskmanager.html
All basic functionality of TaskManager is implemented. I also added dependency loading for non-jsm workers. In the above example, eight workers of the same kind are created. They contain three.js and, on execution, produce spheres with randomly transformed vertices. The buffers are transported to the main thread, where meshes are created from them and added to the scene. 25000 executions are triggered. Only the last 500 objects are kept and previous ones are deleted, but there is still a memory leak. Beware, it eats 4GB of memory over time during execution.
Of course, this is still work in progress...
Here we are. This is what should come next. Feedback requested: 😄
initType and addTask config transferables are properly processed. Costs need to be covered as well.
OBJLoader2 works heavily enforced the design of the worker init and execution in the past.
PR is not yet ready. I needed to clean up util functions and update jsdoc, which took longer than expected. Will get there soon (status: https://github.com/kaisalmen/three.js/tree/TaskManagerProto) and let you know...
Legacy + library dependency embedding, jsm workers, plus to-main fallback with legacy workers are now all working. I fixed and enhanced more things than I thought I would. The TaskManager doc is not fully complete.
https://raw.githack.com/kaisalmen/three.js/TaskManagerProto/examples/webgl_loader_taskmanager.html (The rendering gets stuck because of the 250 executions of the fake worker on main.)
> or part of ECSY,
Just curious, how does worker management relate to ECSY? I thought that was a component-entity system, and nothing about workers?
An Entity-Component System is (among other things) a way of structuring and scheduling work: Systems define the order in which their logic runs. Events no longer fire arbitrarily, but instead are queued and processed in a priority order.
That sort of opinionated structure could (in my opinion) make it easier to move work into other threads. See Data Structures for Entity Systems. That's not to say it's easy, or that ECSY should necessarily do this. But it's possible, and ECSY wouldn't be the first ECS to go that direction.
But to clarify — I'm not interested in building anything that complex in the TaskManager proposal here. Just a minimal task runner that supports dependencies, and little else.