Meteor-feature-requests: Multi Core support with Worker Threads

Created on 3 Sep 2019 · 20 Comments · Source: meteor/meteor-feature-requests

Node.js v12 supports worker_threads with SharedArrayBuffer.

A SharedArrayBuffer can serve as a common data cache across worker_threads, so an app can use all CPU cores of the machine instead of always relying on a single-core instance or on multiple Docker instances/pods.

Ref Link :- https://nodejs.org/api/worker_threads.html#worker_threads_worker_threads
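As a rough illustration of the idea (not Meteor code), the sketch below shares one SharedArrayBuffer between several worker threads and lets each of them increment the same counter with Atomics. The function name countWithWorkers and the inline worker source are assumptions made for this example; a real cache would need a layout and locking scheme on top of this.

```javascript
const { Worker } = require('worker_threads');

// Worker source is passed as a string (eval: true) so the sketch fits in one file.
const workerSource = `
  const { workerData, parentPort } = require('worker_threads');
  // View over the memory shared with the main thread and the other workers.
  const counter = new Int32Array(workerData.shared);
  for (let i = 0; i < workerData.increments; i++) {
    Atomics.add(counter, 0, 1); // atomic read-modify-write, so no update is lost
  }
  parentPort.postMessage('done');
`;

// Hypothetical helper: starts `workerCount` threads that all bump one shared counter.
function countWithWorkers(workerCount, increments) {
  const shared = new SharedArrayBuffer(Int32Array.BYTES_PER_ELEMENT);
  const counter = new Int32Array(shared);
  const runs = [];
  for (let i = 0; i < workerCount; i++) {
    runs.push(new Promise((resolve, reject) => {
      const worker = new Worker(workerSource, {
        eval: true,
        workerData: { shared, increments }, // the buffer is shared, not copied
      });
      worker.once('message', resolve);
      worker.once('error', reject);
    }));
  }
  // Resolve with the final value once every worker has reported back.
  return Promise.all(runs).then(() => Atomics.load(counter, 0));
}
```

Calling countWithWorkers with one worker per core (for example os.cpus().length) would spread the loop across all cores while every thread reads and writes the same memory.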

In Discussion

corporatepiyush

👍13

All 20 comments

This kind of improvement would really help with scalability, a widely known challenge with Meteor apps. I've opened another issue along similar lines: https://github.com/meteor/meteor/issues/10677

chrisbobbe on 5 Sep 2019

👍3

please reopen this... stale bot is a pain...

armellarcier on 11 Jan 2020

Hopefully putting the issue in a milestone will keep it from being marked as stale…

benjamn on 11 Jan 2020

Any news about this feature?

nathanschwarz on 4 Nov 2020

👍1

👍

kakadais on 12 Nov 2020

Hi @nathanschwarz, I believe the first discussion here should be about which part of Meteor we should try this feature on first.

Meteor is a huge code base, so we would need to introduce Worker Threads in baby steps.

Maybe a good criterion would be which part could benefit most from this feature.

filipenevola on 12 Nov 2020

👍3

@filipenevola Fully agree.
I believe Meteor is one of the best platforms for microservices, and multi-core support such as Worker Threads would be a good place to start.
The meteorhacks:cluster package's strategy was good and it worked.
But I think this should be treated as base support because of its importance.

kakadais on 12 Nov 2020

@kakadais meteorhacks:cluster is based on the native Node.js cluster module, I believe. It's basically forking (multi-core but single-threaded), not worker threads.
meteorhacks:cluster also works on the client using web workers.

I finished implementing multi-core on the server using cluster a few days ago. It's quite straightforward.
The main downside right now is that we can't start Meteor as a serverless process.
But you can still use a single port to communicate between the processes when forking.
I've also made a serverless fork of Meteor that uses an environment flag to prevent the HTTP server from starting up.

I'm currently running 2 types of multi-core processes, both backed by a simple MongoDB job queue:

A mailer task to send HTML emails generated through React.
A cache engine to pre-render 25 000 * 9 languages / regions * 3 different devices in React (super heavy work).

I'm planning to add a third one for automated DB backups to an external FTP server.

@filipenevola Obviously, for me the best place to start would be on the server.

Multi-core on the server would be "relatively simple" to build with the cluster module.

We could also leverage the native worker_threads module (with shared memory built in), but it's less straightforward because it would require including the worker code in the build phase and replacing the file path in the master.

We would need new Worker('...worker_path.js') to become new Worker('...worker_path_after_build.js').

nathanschwarz on 13 Nov 2020

@nathanschwarz Could you explain a bit more what you mean by "can't start Meteor as a serverless process"?
I think forking is enough to build a multi-core server, and the key is that each process works independently on a specific service.
Could you expand on your comment? What should we do, or what do you suggest?
Thanks.

kakadais on 13 Nov 2020

Forking should be enough for a start.

Depending on your usage and the implementation, the workers don't need to be built with an HTTP server. Right now, forking a Meteor process starts an HTTP server each time (because Meteor starts an HTTP server).
With my implementation I don't need the children to communicate with each other, so it's basically a waste of resources.
That's what I meant by "serverless process".

Concerning how we should build multi-core, I think the best way to go is to implement a worker pool:

There should be a MongoDB-backed task queue with an associated task map to keep track of the jobs.
There should be at least 5 mandatory fields in the task model: taskType: String, data: Object, priority: Number, onGoing: Boolean, createdAt: Date.
taskType, priority, onGoing, and createdAt should be indexed.
The job ID should be settable by the user.
A job should be removable / updatable only if onGoing is false.
The master should regularly check for jobs; if there are jobs in the queue and some workers are available, it starts forking.
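To make the task model above concrete, here is a minimal sketch in plain JavaScript. Only the five field names come from the comment; the helper names (makeTask, canModify, taskIndexes) and the default values are assumptions for illustration, not the API of any package.

```javascript
// Hypothetical helper: builds a task document with the five mandatory fields.
function makeTask({ taskType, data = {}, priority = 0, _id }) {
  if (typeof taskType !== 'string' || taskType.length === 0) {
    throw new TypeError('taskType must be a non-empty string');
  }
  if (typeof priority !== 'number' || Number.isNaN(priority)) {
    throw new TypeError('priority must be a number');
  }
  const task = {
    taskType,
    data,
    priority,
    onGoing: false, // a new job has not been picked up by any worker yet
    createdAt: new Date(),
  };
  if (_id !== undefined) task._id = _id; // the job ID is settable by the user
  return task;
}

// A job may be removed or updated only while no worker is processing it.
function canModify(task) {
  return task.onGoing === false;
}

// Index key patterns for the four indexed fields, in MongoDB's { field: 1 } form.
const taskIndexes = [
  { taskType: 1 },
  { priority: 1 },
  { onGoing: 1 },
  { createdAt: 1 },
];
```

Each entry of taskIndexes could be passed to the collection's createIndex call; whether single-field or compound indexes work better depends on the queries the pool actually runs.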

Then you can have 2 types of routine.

Either:

The worker starts by pulling a job with onGoing: false, sorted by priority and createdAt.
The worker starts the task defined at TaskMap[taskType].
To avoid conflicts between workers, the job should be pulled with findOneAndUpdate to set onGoing: true.
Once a worker is done, it removes the task from the queue and calls process.exit(0).
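The claim step of this first routine can be sketched as follows. To keep the example runnable without a database, an in-memory array stands in for the MongoDB collection; the function names and the sort direction (highest priority first, then oldest first) are assumptions, since the comment does not say which way the sort goes.

```javascript
// In-memory stand-in for the queue collection. With the MongoDB Node.js driver,
// the equivalent atomic claim would be something like:
//   collection.findOneAndUpdate(
//     { onGoing: false },
//     { $set: { onGoing: true } },
//     { sort: { priority: -1, createdAt: 1 } }
//   )
// Doing the find and the update in one operation is what keeps two workers
// from claiming the same job.
function claimNextTask(queue) {
  const next = queue
    .filter((task) => !task.onGoing)
    // Assumed order: higher priority first, then older jobs first.
    .sort((a, b) => b.priority - a.priority || a.createdAt - b.createdAt)[0];
  if (!next) return null;
  next.onGoing = true;
  return next;
}

// One pass of the worker routine: claim a job, run it, remove it from the queue.
// Returns false when there is nothing left to do.
function runNextTask(queue, taskMap) {
  const task = claimNextTask(queue);
  if (!task) return false;
  taskMap[task.taskType](task.data);
  queue.splice(queue.indexOf(task), 1);
  return true;
}
```

In a forked worker, the process would call process.exit(0) after runNextTask returns, as described in the last step above.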

Or:

The master keeps track of its workers' statuses: WORKING || IDLE.
For each IDLE worker, it pulls a task with onGoing: false sorted by priority and createdAt, updates it with onGoing: true, sends it to the worker, and sets the worker's status to WORKING.
The worker receives the task and starts it at TaskMap[taskType].
Once a worker is done, it sends a message to the master saying that its work is done.
The master receives the message and removes the task from the queue.
If there are no jobs left for this worker, it closes the worker; otherwise it sets its status to IDLE.

The first routine is faster to implement, but because of Meteor's heavy startup routine it will be slower and will take more resources.
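The bookkeeping of the second routine can be sketched without any real processes. In the example below, the Dispatcher class, its method names, and the send callback (a stand-in for whatever IPC the master uses) are all hypothetical; the queue is again a plain array, and the sort direction is an assumption.

```javascript
const IDLE = 'IDLE';
const WORKING = 'WORKING';

// Hypothetical master-side state machine for a pool of long-lived workers.
class Dispatcher {
  constructor(queue, workerIds) {
    this.queue = queue;
    this.status = new Map(workerIds.map((id) => [id, IDLE]));
    this.assigned = new Map(); // workerId -> task currently being processed
  }

  // Hand one pending task to each idle worker; `send` stands in for IPC.
  dispatch(send) {
    for (const [workerId, status] of this.status) {
      if (status !== IDLE) continue;
      const task = this.queue
        .filter((t) => !t.onGoing)
        .sort((a, b) => b.priority - a.priority || a.createdAt - b.createdAt)[0];
      if (!task) break; // nothing left to hand out
      task.onGoing = true;
      this.status.set(workerId, WORKING);
      this.assigned.set(workerId, task);
      send(workerId, task);
    }
  }

  // Called when a worker reports that its task is finished.
  onDone(workerId) {
    const task = this.assigned.get(workerId);
    if (!task) return null; // unknown worker or nothing assigned
    this.assigned.delete(workerId);
    this.queue.splice(this.queue.indexOf(task), 1);
    if (this.queue.some((t) => !t.onGoing)) {
      this.status.set(workerId, IDLE); // more work is waiting
      return IDLE;
    }
    this.status.delete(workerId); // no jobs left: the master closes this worker
    return 'CLOSED';
  }
}
```

Because workers stay alive between tasks, this variant pays Meteor's startup cost once per worker rather than once per job, which is the trade-off the closing remark above points at.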

nathanschwarz on 15 Nov 2020

@kakadais @filipenevola I made a working Worker Pool package here if you want to look at it.
I can eventually put a PR together if you want.
I still think the package needs some tweaks for modularity, logs, and tests anyway.

Edit: you can now add the package directly from Atmosphere: meteor add nschwarz:cluster

nathanschwarz on 19 Nov 2020

Hi @nathanschwarz, sorry for the delay, I was on vacation, I just read your code and it looks great.

What do you mean by a PR? Do you need any changes in the core?

Are you using your package in production already?

We could promote it in the Meteor community and start to have usage in production if that is not the case yet.

filipenevola on 9 Dec 2020

"@filipenevola obviously for me the best place to start would be on the server."

I was thinking about Meteor core features and not between server and client.

Which features of the Meteor core runtime or builder could benefit most from multi-core support?

For example, I was talking with @renanccastro that maybe we could take advantage of that on tree-shaking build analysis to analyze the sub-trees in different cores.

filipenevola on 9 Dec 2020

@filipenevola, no worries about the delay !

Yes, I'm using it in production and it's fully working. It still lacks a few minor tweaks:

the possibility to modify the logging behavior
some basic tests
event listeners (when a task is done / an error occurs)
tasks bound to a specific date.

But we could add these incrementally.

No, there's no change to add to the core as it is now.
I was talking about a minor change to the startup behavior of the core:

"The main downside right now is that we can't start Meteor as a serverless process."

Since this package doesn't require the workers to communicate with each other, we could pass a flag to skip the HTTP server and avoid wasting resources (it's a few lines of code in the core, but it's not that important).

I can make an abstraction of the worker pool for the tree-shaking feature if you wish, since it's bound to MongoDB via the TaskQueue right now.

nathanschwarz on 9 Dec 2020

@nathanschwarz great. Yes, these are nice to have, but I believe your package is complete enough already. How could we promote it? A post on our official blog?

"since this package doesn't requires the workers to communicate together we could pass a flag to skip the http server to avoid the waste of resources"

I'm OK with having this flag to prevent the HTTP server from starting up, feel free to start a PR.

"I can make an abstraction of the worker pool if you wish for the tree-shaking feature since it's bound to mongoDB via the TaskQueue right now."

I believe this is a good idea. We could have a mode option in your package: in-memory and persistent (your current version with MongoDB). Is that your idea?

filipenevola on 11 Dec 2020

@filipenevola great!

A blog post would be nice 👍.

"I'm ok having this flag to avoid http server starting up, feel free to start a PR."

I'll work on a PR soon and tag you when it's done.

"we could have a mode option in your package"

I was thinking of adding an optional inMemory: Boolean field to the TaskQueue.addTask prototype (defaulting to false).
This way you can have both persistent and in-memory jobs with the same Cluster instance!
I'll work on it ASAP.

nathanschwarz on 11 Dec 2020

@filipenevola I've just updated nschwarz:cluster.

The in-memory jobs are working, and an inMemoryOnly field is settable in the Cluster options.
It should do the trick!
Tell me if you encounter any issues.

Update: 1.1.0 is now available.

nathanschwarz on 12 Dec 2020

@filipenevola, my bad, no PR required.

I thought the cluster module would provide a socket / file stream between the master and the children for IPC, but that's not the case, so the HTTP server is still needed.

I've added eventListeners in 1.1.0, so you can handle the results in the master process if needed.

I'll try to find a way to get a random free port number for the IPC, to avoid potential conflicts between multiple apps running / building at the same time.
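One common way to obtain a free port in Node.js is to listen on port 0 and let the operating system pick one. The sketch below is not taken from nschwarz:cluster; the helper name getFreePort and the default host are assumptions for illustration.

```javascript
const net = require('net');

// Hypothetical helper: asks the OS for an unused TCP port by binding to port 0,
// reads the port that was assigned, then releases it.
function getFreePort(host = '127.0.0.1') {
  return new Promise((resolve, reject) => {
    const server = net.createServer();
    server.once('error', reject);
    server.listen(0, host, () => {
      const { port } = server.address();
      // Close the probe server before handing the port number back.
      server.close(() => resolve(port));
    });
  });
}
```

There is a small window between closing the probe server and the application binding to the port in which another process could take it, so a retry on EADDRINUSE is a sensible addition. Where possible, having the real server listen on port 0 directly avoids that race.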

nathanschwarz on 14 Dec 2020

Consider a shared-nothing architecture at the OS level, using process instances instead of Worker Threads:

Scaling computation across CPU cores - Pin each Node.js process launched through the cluster module to an individual CPU core. This greatly reduces L1 and L2 cache misses and the OS-level cost of multiprocessing and fair scheduling, because any process launched on Linux is allowed to run on all CPU cores by default. Modern Xeon and AMD EPYC server CPUs have great single-threaded performance and large on-die caches. https://gist.github.com/corporatepiyush/55ecca29999cebbad2d58880cd376c90
Scaling network IO across CPU cores - The cluster module (or the sticky-session module) provides a way to pin a socket session, upstream and downstream, to a particular Node.js process. https://stackoverflow.com/a/51418575/3282642
Cache memory redundancy vs. centralized cache server trade-off - If you do the 2 things above, it will definitely outperform a Worker Thread implementation with SharedArrayBuffer when the application's cache hit rate is driven more by the client than by the application. By this I mean that you use the cache mostly for client-session-oriented data, which avoids a trip to the DB, rather than for lots of other things for which you can't foresee a good cache hit rate. Of course there might be some common data that is worth caching in all Node.js processes, and in that case we have to weigh cache memory redundancy against a centralized cache server. Either replicate it across all the Node.js OS-level processes if the amount is low, say less than 64MB (this depends entirely on your application and the latency you want to achieve), or, if it's larger, consider an in-memory key-value server such as Redis or Memcached for central storage.
Failover and GC - Separate Node.js OS processes also guarantee that an exception that crashes one process, whether caused by an external intrusion or by a programmer's mistake, does not take the whole multi-core machine out of service; the affected processes can be restarted individually and easily. Also, because the processes share the machine's RAM between them, each one has a smaller heap, so GC activity is more effective and lightweight than handling one large heap, if you are targeting a beefy machine in the cloud.

corporatepiyush on 16 Dec 2020

👀1

Hi @nathanschwarz, this is really great. Thank you.

Please ping me on the Community Slack so we can work together on the blog post.

filipenevola on 20 Jan 2021
