Node: console.log failure while working with worker threads.

Created on 15 Nov 2019  Â·  23Comments  Â·  Source: nodejs/node

  • Version: 12.4.0
  • Platform: Docker which runs on mac (Darwin Kernel Version 18.7.0, xnu-4903.278.12~1/RELEASE_X86_64 x86_64)
  • Subsystem:

Hello everyone.

Thank you for your hard work on NodeJS.

Now, i developed a small service, which takes 4 csv files, parses them, maps them together and imports them into elasticsearch.

Each file is being parsed on a different thread.

The parsed content of one of the files is being send via an Event to a different file, this file spawns for each set of data, a new thread that will import that set into ES.

In parallel on the main thread, i send the content of one of the files in chunks via an Event with the contents of the remaining 2 files, to a different script again.

Which will spawn a new thread for that chunk of data. That thread will map the given data to the chunk provided, if they match. Send the mapped data back to the main thread, which again will spawn a new thread who will import the mapped data into ES.

The issue i have here is, that once everything is working at the same time, the only console.logs i get are the ones from the main thread. Everything that is being logged on a worker thread, is being lost somewhere, while the main thread is under load.

Note: The actual code is being processed as it should, it is just the console.logs who do not care.

This makes debugging on worker threads really difficult. Maybe i am missing something.

console worker

Most helpful comment

Sorry it took so long to get around to this repro. Life's been busy...

consoleissue.js:

const path = require('path');
const wt = require('worker_threads');

const worker = new wt.Worker(path.resolve(path.join(__dirname, 'consoleissue-worker.js')));
worker.on('message', msg => {
    console.log('worker response: ' + JSON.stringify(msg));
});

console.log('posting');
worker.postMessage({ dostuff: true });
console.log('posted');

consoleissue-worker.js

const wt = require('worker_threads');

wt.parentPort.on('message', (msg) => {
    console.log('starting');
    let total = 0;
    for (let i=0; i<50000000; i++) {
        total += Math.sin(i);
        if ((i % 5000000) === 0) {
            console.log(i);
            wt.parentPort.postMessage({ msg: i.toString() });
        }
    }
    console.log('finished - ' + total);
    wt.parentPort.postMessage({ done: true });
});

Run node consoleissue.js and you shall see:

posted
starting
worker response: {"msg":"0"}
worker response: {"msg":"5000000"}
worker response: {"msg":"10000000"}
worker response: {"msg":"15000000"}
worker response: {"msg":"20000000"}
worker response: {"msg":"25000000"}
worker response: {"msg":"30000000"}
worker response: {"msg":"35000000"}
worker response: {"msg":"40000000"}
worker response: {"msg":"45000000"}
worker response: {"done":true}
0
5000000
10000000
15000000
20000000
25000000
30000000
35000000
40000000
45000000
finished - -0.013948591772423392

If the console log worked as theoretically stated in here, the worker responses should be right alongside the normal console.logs of the numbers, but instead the console.logs from the worker thread all show up at the same time when the worker thread returns control to the event loop.

All 23 comments

As a temporary solution, send logs to the main and log them there.

As a temporary solution, send logs to the main and log them there.

Thanks for answering.
Yes that's what i intended to do for the QA version.

As a temporary solution, send logs to the main and log them there.

Fwiw, this is how console.log(), and more generally process.stdout and process.stderr, in Workers currently work – they send their data for the main thread and wait for that to process it.

The issue happens because when the main thread is busy, then it also won’t process such data coming from the Workers.

As a temporary solution, send logs to the main and log them there.

Fwiw, this is how console.log(), and more generally process.stdout and process.stderr, in Workers currently work – they send their data for the main thread and wait for that to process it.

The issue happens because when the main thread is busy, then it also won’t process such data coming from the Workers.

I can confirm that. Sending the logs to the main thread does nothing while the main thread is busy.
The only thing that worked for me, was off loading more work from the main thread to the worker threads / event based scripts and even that is tricky to do.

The only thing that worked for me, was off loading more work from the main thread to the worker threads / event based scripts

That sounds like a good idea. I usually try not to block the main thread at all in those scenarios, to make sure it remains responsive.

In theory, we could allow threads to do I/O directly, but it will be tricky and might lead to unwanted side effects.

how about new Worker(x, { synchronizedStdio: false })?

@devsnek What would that do and how would it work?

@addaleax instead of sending internal messages to the main thread, it would just directly write to stdout/err. messages might be garbled on occasion, but sometimes that isn't a problem.

@devsnek Libuv won’t let you create multiple handles for the same fd, plus writing synchronously would be problematic for the same reasons for which it is on the main thread (and fs.writeSync() for writing data fails on Windows for non-ASCII characters).

hmm, aren't they separate event loops?

They are separete event loops – see https://github.com/nodejs/node/issues/30507 for example crashes when sharing fds between threads.

but it isn't inherently impossible... you just have to make sure you aren't renumbering, closing, etc.

@devsnek You have to close the handle when the thread stops :upside_down_face: But feel free to work on something if you have ideas for how to approach this, or more generally the issues in #30507.

I have a question, i don't know a lot when it comes to the inner workings of nodejs.

But couldn't a console.log from a worker thread be pushed into the event loop of the main thread?

That way it should be processed at some point since its in the Q, even if it's late or at the end, still better than not getting them at all.

instead of sending internal messages to the main thread, it would just directly write to stdout/err. messages might be garbled on occasion, but sometimes that isn't a problem.

If we really want to do that, it might make more sense to have some mutex/locking mechanism to prevent concurrent writes. That would ensure that ouput from console.log is never "garbled".

But couldn't a console.log from a worker thread be pushed into the event loop of the main thread?

I think that is the current behavior, isn't it?

mutex is an interesting idea. I guess we'd have to specialise it to our stdio handles?

But couldn't a console.log from a worker thread be pushed into the event loop of the main thread?

I think that is the current behavior, isn't it?

I don't know, that's why i'm asking. But if it is the current behavior, than there is an issue no? If it gets pushed into the event loop, than it has to be processed at some point. It can't be freed from the event loop without execution else this would mean the event loop is totally unreliable.

mutex is an interesting idea. I guess we'd have to specialise it to our stdio handles?

Yes, but that is assuming that we would prefer allowing worker threads to use I/O directly instead of the current behavior. I hope that @addaleax can shine some light on this.

If it gets pushed into the event loop, than it has to be processed at some point

It depends on what the main thread is doing, if it is busy running synchronous JavaScript code, then it will never have a chance to process messages. If you have some code for us to reproduce the problem, that would be helpful.

@tniessen I’m not entirely sure how the mutex approach would work… it seems like that’s something that might still require moving stdio to a separate thread if we really want it to always be available, and then post messages to that thread (which can be just a C++ thread, not a Worker)?

If it's just about writing to stdout/stderr we could just write to the file descriptors "directly" (unless libuv prevents that?) and have one mutex per file descriptor, which prevents two threads from writing at the same time (mostly to prevent console.log outputs to interfere with each other).

On the other hand, I think that the current behavior is reasonable, and it is a fair restriction to only write from the main thread. I guess worker threads shouldn't really use console.log in production anyway.

FWIW, I don't believe this is actually the behavior being used, at least not as of 12.9.0 (on MacOS). I just ran a test app that ran an interval on the main thread to output a console.log, and then put console.log lines on the onmessage handler inside the worker thread from the parentport, and on the onmessge handler in the main thread from the worker thread, and then some console.logs inside the worker thread as it does some work, and reports progress out via the port along the way.

The interval regularly reported (expected -- basically nothing's going on on the main thread here).
Anything that was reported over the port was console.logged immediately on the main thread as well. However, both the request from the worker thread and all progress called in console.log on the worker thread wasn't shown in the console until the worker thread entirely finished all work and relinquished control.

If it was just a matter of keeping the main thread unblocked, I'd totally understand that limitation in behavior, but that doesn't appear to be the case at all here -- you need both the worker thread message pump to be free before it can send the console messages back to the main thread, who then ALSO needs its message pump to be free before it can process them. So, I'm going to end up having to basically trap console.log on my worker thread and pipe it back over the message port to the parent thread if I want real-time console reporting of progress (while debugging, etc.)

@deregtd I can’t reproduce that behaviour based on your description – would you mind sharing the code you used?

Sorry it took so long to get around to this repro. Life's been busy...

consoleissue.js:

const path = require('path');
const wt = require('worker_threads');

const worker = new wt.Worker(path.resolve(path.join(__dirname, 'consoleissue-worker.js')));
worker.on('message', msg => {
    console.log('worker response: ' + JSON.stringify(msg));
});

console.log('posting');
worker.postMessage({ dostuff: true });
console.log('posted');

consoleissue-worker.js

const wt = require('worker_threads');

wt.parentPort.on('message', (msg) => {
    console.log('starting');
    let total = 0;
    for (let i=0; i<50000000; i++) {
        total += Math.sin(i);
        if ((i % 5000000) === 0) {
            console.log(i);
            wt.parentPort.postMessage({ msg: i.toString() });
        }
    }
    console.log('finished - ' + total);
    wt.parentPort.postMessage({ done: true });
});

Run node consoleissue.js and you shall see:

posted
starting
worker response: {"msg":"0"}
worker response: {"msg":"5000000"}
worker response: {"msg":"10000000"}
worker response: {"msg":"15000000"}
worker response: {"msg":"20000000"}
worker response: {"msg":"25000000"}
worker response: {"msg":"30000000"}
worker response: {"msg":"35000000"}
worker response: {"msg":"40000000"}
worker response: {"msg":"45000000"}
worker response: {"done":true}
0
5000000
10000000
15000000
20000000
25000000
30000000
35000000
40000000
45000000
finished - -0.013948591772423392

If the console log worked as theoretically stated in here, the worker responses should be right alongside the normal console.logs of the numbers, but instead the console.logs from the worker thread all show up at the same time when the worker thread returns control to the event loop.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

stevenvachon picture stevenvachon  Â·  3Comments

danielstaleiny picture danielstaleiny  Â·  3Comments

ksushilmaurya picture ksushilmaurya  Â·  3Comments

dfahlander picture dfahlander  Â·  3Comments

seishun picture seishun  Â·  3Comments