The timeout property on many of the vm module's functions isn't completely foolproof. In the following example, some sandboxed code executed by vm.runInNewContext with a timeout of 5ms schedules an infinite loop to run after a promise resolves, and then synchronously executes an infinite loop. The synchronous infinite loop is killed by the timeout, but then the scheduled loop fires and never ends.
vm.runInNewContext(
'Promise.resolve().then(()=>{while(1)console.log("foo", Date.now());}); while(1)console.log(Date.now())',
{console:{log(){console.log.apply(console,arguments);}}},
{timeout:5}
);
Some output:
...
1442966735710
1442966735710
1442966735710
1442966735710
1442966735710
Error: Script execution timed out.
at Error (native)
at ContextifyScript.Script.runInNewContext (vm.js:18:15)
at Object.exports.runInNewContext (vm.js:49:17)
at repl:1:4
at REPLServer.defaultEval (repl.js:164:27)
at bound (domain.js:250:14)
at REPLServer.runBound [as eval] (domain.js:263:12)
at REPLServer.<anonymous> (repl.js:392:12)
at emitOne (events.js:82:20)
at REPLServer.emit (events.js:169:7)
> foo 1442966735715
foo 1442966735716
foo 1442966735716
foo 1442966735716
foo 1442966735716
... [continues forever]
I'm not really sure what a good solution for this would be, if it's possible, and whether the vm module intends to stand up to this sort of thing. At the very least I think vm's docs should have a note that the timeout property works by best-effort or something and isn't completely foolproof, so no one thinks it's safe to try to use it to sandbox user code in a guaranteed timely manner.
I added the original vm timeout capability. This gets pretty nasty.. the watchdog timer can only cover the execution time of the actual script passed in. If that script, during its execution, interacts with the Node/uv event queue and queues up more asynchronous callbacks (process.nextTick, setTimeout, setInterval) then those will execute later and the watchdog object will have already been destructed if the script executed in less time than the timeout.
One possible way this could be implemented is if the libuv could support something like mentioned in the comments interposed into the actual vm code:
if (timeout != -1) {
// query and save uv main loop reference count
Watchdog wd(env, timeout);
result = script->Run();
// keep spinning uv event loop until reference count equals saved value (soft blocking)
} else {
If the main loop ref count is equal to begin with, it will not block, the watchdog will destruct and everything will be fine. If the ref count is not equal, then that should indicate outstanding events have a ref on the loop and blocking should allow the watchdog to work as intended.
A side effect from this change would be that code setting up an indefinite listener would always hit the timeout. You would only be able to execute code that completely and fully finishes executing, including any interaction(s) with the event loop.
@bnoordhuis can you comment on the possibility of doing something like what I mentioned above in uv? I'm just throwing out that idea, but there may be other reasons why that is not possible. The only other way I could think of is very nasty and involves injecting wrappers for any functions that can add to the event queue and force a check against some pre-computed timestamp.
We can probably handle this by triggering microtask queue flush after vm invocation, though it uses C++ APIs so I'm not sure it can be interrupted properly.
All contexts share the same microtask queue (it's a per-isolate property) so I don't think this can be easily fixed, the promise callbacks can always enqueue new callbacks, ad infinitum.
@bnoordhuis right, there is now way to flush only microtask added inside of the new context. didn't think of that.
@bnoordhuis what I was thinking was that outstanding callbacks would ref the uv event loop, no? So if a way was added to block until ref count reached a desired value, then it would allow the main thread to block, thus allowing the watchdog to do its thing. If the code kept enqueuing callbacks and the count never reached the initial value then it would rightfully timeout, but behavior outside of the vm timeout case would not be affected in any way.
I don't think that could work. Contexts share the event loop. You could wait1 until the event loop's reference count drops below a certain threshold but that doesn't tell you to what contexts events were delivered, unless you add bookkeeping to every call into the VM.
I don't think we want to go down that road and I'm not sure if it would even be enough. Microtasks, for example, are mostly outside of our control - they're driven by V8 - so there would still be loopholes.
1. In reality you can't right now because uv_run() is not re-entrant.
Closing, there's nothing that can be done about it
If there is no reasonable way to fix this, then the documentation should have a notice about timeout not beeing reliable in some cases (and about the shared microtask queue, perhaps).
@ChALkeR oh, ok. saying that it's not reliable is not correct though, we should try to explain what it does and does not
@nodejs/documentation
we will put it on the docs todo list :page_facing_up:
We need to fix this. cc @chrisdickinson
We can wrap .then calls for example which would still be than doing nothing.
doesnt even need a while loop. throwing an exception in .then with no catch is enough.
Wouldn't this work?
1) Flush microtasks
2) Set up watchdogs
3) Run user script
4) Run microtasks again
5) Disable watchdogs
RunMicrotasks() will run until the job queue is 0, even if more jobs are scheduled or TerminateExecution() is called. In the TerminateExecution() case the jobs will just be canceled.
Obviously if you had any resolved promises their callbacks will be triggered by the vm library, I'm not sure how bad of a side effect that would be considered.
This seems to fix timeout for runaway promises:
diff --git a/src/node_contextify.cc b/src/node_contextify.cc
index ff66ffdaaa..c19d8e242e 100644
--- a/src/node_contextify.cc
+++ b/src/node_contextify.cc
@@ -857,22 +857,27 @@ class ContextifyScript : public BaseObject {
Local<Value> result;
bool timed_out = false;
bool received_signal = false;
+ env->isolate()->RunMicrotasks();
if (break_on_sigint && timeout != -1) {
Watchdog wd(env->isolate(), timeout);
SigintWatchdog swd(env->isolate());
result = script->Run();
+ env->isolate()->RunMicrotasks();
timed_out = wd.HasTimedOut();
received_signal = swd.HasReceivedSignal();
} else if (break_on_sigint) {
SigintWatchdog swd(env->isolate());
result = script->Run();
+ env->isolate()->RunMicrotasks();
received_signal = swd.HasReceivedSignal();
} else if (timeout != -1) {
Watchdog wd(env->isolate(), timeout);
result = script->Run();
+ env->isolate()->RunMicrotasks();
timed_out = wd.HasTimedOut();
} else {
result = script->Run();
+ env->isolate()->RunMicrotasks();
}
if (try_catch->HasCaught()) {
@@ -896,6 +901,16 @@ class ContextifyScript : public BaseObject {
try_catch->ReThrow();
return false;
+ } else if (timed_out || received_signal) {
+ // `isolate->IsExecutionTerminating()` returns false which I suspect is a
+ // bug in v8, but `CancelTerminateExecution()` still works
+ env->isolate()->CancelTerminateExecution();
+ if (timed_out) {
+ env->ThrowError("Script execution timed out.");
+ } else {
+ env->ThrowError("Script execution interrupted.");
+ }
+ return false;
}
if (result.IsEmpty()) {
@laverdet I don’t think we can just run the microtask queue before entering the VM script – Promise handlers need to run without any other visible JS stack below them.
Promise handlers need to run without any other visible JS stack below them.
I don't think so. We run them from JS now
@vkurchatkin That’s Node code though. (To phrase it better: without any userland stack below them.)
To phrase it better: without any userland stack below them.)
Oh, makes sense. Step 1 is very problematic indeed.
@addaleax
@laverdet I don’t think we can just run the microtask queue before entering the VM script –
I think vm methods would need to be executed asynchronously (after promises - before I/O) in order to support this sort of approach.
It sounds like too big of a compromise.
Should this stay open? Is it just a documentation issue?
Would it be possible to process the tick queue in C++, in the microtask handler? That way we can avoid associating microtask execution to a specific context… right?
… and execute members of the microtask queue in the context that specific task was created in, which the watchdog timer should mark as disabled.
Never mind. I'm not sure what I was thinking. I guess the best we can do is adding a note to the docs :(
We might want to revisit this now - it seems like this might be possible with async_hooks promises support. (Just don't run any microtasks from that context when the next tick queue is flushed).
Maybe I'm misunderstanding though - @addaleax can you weigh in?
@benjamingr I don’t think we can use async_hooks for any cool stuff here – we still can only either run all or none of microtask queue.
On the other hand, maybe somebody of us has the time to get their hands dirty and implement support in V8; what I imagine is that we could add a flag contexts that mark them as abandoned/unable to run microtasks, then when the queue is being drained just ignore tasks whose contexts are abandoned.
What I wrote doesn’t make sense because we don’t associate timeouts with contexts, we associate them with calls to runInContext.
@addaleax you're right, I don't see any way to do this without subclassing Promise and at that point you wouldn't need async_hooks anyway.
I think at this point the best thing to do might be to ask @gsathya whether it would be okay to extend PromiseHooks so that there’s a way to say “please don’t actually run this” in the kBefore hook.
I can't reproduce this in node 6.11.2 (at least on Android). Was it fixed?
Closing this since it's marked as won't fix. Feel free to reopen if needed.
Current behavior isn't reasonable @fhinkel - this needs to be fixed or for the very least documented.
I agree, this should probably be open as a doc issue
Also I think it’s not completely out of the question (at least for me) to add V8 support for this?
@addaleax what do you think about taking my fix from earlier (here) and putting it behind an option. I think that would make everyone happy. The timeout option is close to useless without control over microtasks.
Also while I'm here, if you're on this ticket you may be interested in this library I built: isolated-vm. It exposes v8 isolates directly to nodejs which comes with a lot of benefits like enforceable memory limits, enforceable timeouts, and parallelization (don't block your nodejs loop while running untrusted code).
what do you think about taking my fix from earlier (here) and putting it behind an option
I am still worried that this doesn’t quite fit the model of how promises are supposed to work, and since all promises share a microtask queue, introducing microtask queue runs into code that’s not top-level seems like it might lead to hard-to-debug timing issues without an obvious source…
Also while I'm here, if you're on this ticket you may be interested in this library I built: isolated-vm.
Oh, yes, that is very relevant to my interests – I’m doing work on Workers in a fork of Node, so this definitely contains interesting parts to look at. :) (Also, yes, code isolation is part of the goals there as well.)
@laverdet For example, while taking another look at adding support for this in V8, I learned that you can’t kick off microtask queue runs during another microtask queue run – which makes sense I guess, but it means your approach would sometimes not work without an obvious reason?
Fair points. Another option would be to make a new method which returns a promise instead. It would fire off a uv_async_send and from that callback we could be sure that there are no microtasks in the queue (and if there were microtasks for some reason it would not violate any invariants to run them). You could then run the usual vm stuff and microtasks would automatically be included in the result = script->Run() call.
Or maybe the new function should take a callback instead. I'm not familiar with node's plans going forward regarding callbacks vs promises.
The only gripe is that the vm library is building up a lot of cruft and this would be yet another runIn... function.
I think at this point the best thing to do might be to ask @gsathya whether it would be okay to extend PromiseHooks so that there’s a way to say “please don’t actually run this” in the kBefore hook.
I'm trying to understand why this needs to be a part of PromiseHooks? It sounds like a boolean on the isolate should be enough?
@laverdet isolated-vm looks exactly like what I've been dreaming about for the last few years! I've been dying for a good in-process way to sandbox untrusted javascript with time and memory limits like it allows, and to be able to safely dispose of the entire sandbox when I want. Thanks!
Node's vm module only did a fraction of what I wanted. I mostly opened up this issue up hoping for an update to the docs to warn others who might've thought to try to use the vm module for some degree of sandboxing. Especially now that isolated-vm exists, personally I'm not too interested in having robust timeouts for the vm module.
@gsathya The issue here is only about Promises that have been created in a specific v8::Context, during a specific synchronous time frame and their (asynchronous) descendents; I think that’s pretty hard for the VM to follow on its own?
Sorry for closing this prematurely :disappointed:
sandbox untrusted javascript with time and memory limits
@AgentME Be aware that having memory limits doesn't mean V8 won't abort on out-of-memory conditions. You should still spin up a separate process.
So I'm wondering, what is a good solution if you want to execute code and share things between your code and the provided code? Is there a robust and manageable solution in node? If I understand well vm is not reliable in fine.
@devsnek @addaleax ... just a quick heads up.. I'm working on a couple of known_issue tests for this particular issue and came up with the following example:
'use strict'
require('../common');
const assert = require('assert');
const vm = require('vm');
const NS_PER_MS = 1000000n;
const hrtime = process.hrtime.bigint;
function loop() {
const start = hrtime();
while (1) {
const current = hrtime();
if ((current - start) / NS_PER_MS >= 6n) {
throw new Error(
`escaped timeout at ${(current - start) / NS_PER_MS} milliseconds!`);
}
};
}
assert.throws(() => {
vm.runInNewContext(
'queueMicrotask(loop); loop();',
{
hrtime,
queueMicrotask,
loop
},
{ timeout: 5 }
);
}, {
code: 'ERR_SCRIPT_EXECUTION_TIMEOUT',
message: 'Script execution timed out after 5ms'
});
Obviously, the new queueMicrotask() function exhibits the same behavior reported in this issue but that's not the interesting bit... the interesting bit is that the error thrown from inside the loop() function executed by the queueMicrotask() is completely swallowed.
@jasnell i assume that's just the behaviour of whatever the error event of process is doing right?
@devsnek ... sorry, I don't understand your question. The error does not appear to be emitted by anything.
@jasnell yeah i guess it is weirdly swallowed. what should it be doing that isn't process.emit('error', e)?
edit: so it looks like i made a miscalculation. emitting an error just throws, but its still inside the microtask. this leads to what i would consider a bug in v8, where uncaught exceptions in microtasks don't trigger our fatal exception handler. i've opened a bug report.
edit: so it looks like i made a miscalculation. emitting an error just throws, but its still inside the microtask. this leads to what i would consider a bug in v8, where uncaught exceptions in microtasks don't trigger our fatal exception handler. i've opened a bug report.
@devsnek Can you add a link to the bug report?
edit: so it looks like i made a miscalculation. emitting an error just throws, but its still inside the microtask. this leads to what i would consider a bug in v8, where uncaught exceptions in microtasks don't trigger our fatal exception handler. i've opened a bug report.
@devsnek Can you add a link to the bug report?
Is this the right bug report? https://bugs.chromium.org/p/v8/issues/detail?id=8465
@trott maybe? I also opened https://crbug.com/v8/8326 a bit ago.
the feature was shimmed in https://github.com/nodejs/node/commit/2caf0793c1e60b37851753d2eb219034be57da5f
the feature was shimmed in https://github.com/nodejs/node/commit/2caf0793c1e60b37851753d2eb219034be57da5f
Does that mean it may now be possible to fix things so that https://github.com/nodejs/node/blob/27dc74fdd0950d39d175145400a70174244870d9/test/known_issues/test-vm-timeout-escape-queuemicrotask.js is no longer a known_issue test and passes?
@Trott it will still fail stopping at 5ms, but it should throw at 100ms instead of silently stopping.
The issue itself is a wontfix as noted before and was documented by 5e5a9455f8c6e671be6c6d1e29291fe3cbf61f30.
Feel free to reopen if there is anything else to do here.
Fwiw, I’m working on a fix for this, for the Promise case, so I’ll reopen this.
(The queueMicrotask() and nextTick() examples are not really fixable, but I wouldn’t consider them bugs or actual issues either, just misunderstandings about what those functions do.)