I'm still trying to work out a proper test case, but what happens is basically the following:
I have a whole bunch of plugins that hook into lifecycle hooks.
onRequest hook, everything works as expected and I get the 500 response.onPreHandler hook, the error is swallowed and the request keeps hanging until it times out.This clearly sounds like a domains problem, especially since I also never receive a uncaughtException error.
My investigation seems to point to my authenticator function which makes use of a promise and finally calls the reply.continue function as the point where things first go wrong, although I haven't been able to clearly track it down.
I've tried manually binding some of the callbacks to request._protect.domain but I didn't have much success.
In the end that means that if a developer isn't careful once with manually managing try/catch etc, loads of requests will hang without any indication on that there was an error happening.
Seems related to https://github.com/hapijs/hapi/issues/2771
Yes, this is a real pain to deal with, especially while using promises. And that's because domains are hell :) plus different promise implementations bury errors in all sorts of bizarre ways, sometimes even combating domains. I personally think any progress forward on this front would be great as long as it doesn't terribly complicate hapi core– some small reproduction cases would be useful (even just a gist).
By the way, the request object gives you access to the domain as part of its public api (request.domain)!
Might be worth mentioning that I also don't receive any uncaughtException errors when I set useDomains to false, which confuses me a bit. Whatever I do, errors in handlers and some hooks get swallowed and the request just times out.
I think I found out what the issue is. In a promise chain, everything that happens synchronously is still part of the chain and will have the promise implementation wrap the function with try/catch.
So if you ever synchronously call reply, reply.continue or others that continue the request lifecycle from inside of a promise chain, errors inside of the entire synchronous lifecycle are now handled by this promise's catch handler.
Basically, whenever you need to call any of those from inside of a promise, you will need to process.nextTick the invocation so that error handling can continue as intended.
So the biggest strength of promises turns out to be one of the major weaknesses if you need to interact with non promise-aware code that relies on errors being caught in domains. Since there is no explicit catch, a completely unrelated promise chain will happily catch all of those errors for you.
I feel like many of these problems (I just ran into a similar trap again within a try/catch) could be avoided if reply.continue, reply etc would always defer execution until next tick.
Currently it looks like the extension and handler interfaces are all async, except that they aren't.
In general the rule should be, if something could ever be async, it should always be handled async and never mix sync and async modes.
+1 :-/
Just tracked down our hanging requests and seems to be the same issue. Our authentication validation function uses bookshelf and from tracking it down any handler that throws an exception gets passed back to the cb. If I remove the catch it gets sent to process#unhandledRejection event. Another thing I have notice is that if I remove the promise from the chain the process gets an unhandled exception and quits. If I disable authentication it gets handled as expected and returns a 500 response
new User({ id : 1 })
.fetch({
require : true)
.then((user) => {
cb(null, true, user);
})
.catch(cb);
Is there any workaround even from the callback to determine it wasn't an authentication error and kill the request? Instead of having numerous open hanging requests
What worked for me was doing process.nextTick(() => cb(null, true, user)).
That way I'd at least get unhandled exception errors if something goes wrong.
@TimBeyer thank you worked for me as well
yeah.. the promise integration in hapi still isn't perfect, and domains have always been annoying. personally i've been trying to get in the habit of always having a .catch() on every promise i use outside of a handler (with handlers i just pass the promise to reply and let it deal with the catch)
you might try watching the 'unhandledRejection' event rather than 'unhandledException' also
I've just run into this as well. A part of my authentication flow uses a library which returns promises. In the case where authentication goes well, if there is an unhandled exception inside of any route implementation, the promise library ends up catching it.
This wouldn't be so bad if I could then handle said exception. Something like auth.then().catch(err=>Boom("Unknown problem on server", err). However, because the auth plugin has already called reply.continue once, no other calls to reply get honored because of it's protective once wrapping.
@TimBeyer, thanks that workaround is what I've got in place until this gets fixed.
@nlf, it sounds like you've been able to successfully regain control using .catch, any tips?
I created this repo: https://github.com/wwalser/hapi-promise-plugin-problem that demonstrates the problem. I'll whittle the code down the only important bits and paste here as well.
//Troublesome plugin
const troublesomePlugin = (server, options, next) => {
server.ext('onPreHandler', (request, reply) => {
Q.delay(100)
.then(() => reply.continue())
.catch(err => {
console.log("Yeah, we've got Trouble!", err);
//Because we've already called reply.continue, this doesn't work.
// Hapi code base ./lib/reply.js L74
reply(Boom.badImplementation("Explicit handling doesn't fix the issue.", err));
});
});
next();
};
troublesomePlugin.attributes = {
name: 'trouble'
};
server.register([
troublesomePlugin,
], function() {
server.route({
method: 'GET',
path: '/trouble',
handler: (request, reply) => {
//throw exception which should return internal server error.
const thing = foo.bar.baz.qux;
reply("This route works fine.");
}
});
server.start(()=>{});
});
Given the resolution of #2771, I assume the solution to this problem is to disable domains then wrap all calls to server.register in your own try/catch and handle unexpected exceptions in that.
:point_up: should fix a few of them, lmk if it fixes all. I think it's a developers' mistake but we can't expect people to do a good job at using promises (no offense to you all), so it might as well be part of the framework.
It actually has nothing to do with promises. The same issue can be created just using try/catch. No promise is necessary.
So the most bare bones version of this bug is: "Plugin points that occur on same call stack as route handler can not use try/catch and still gracefully recover from unexpected exceptions."
Code: https://github.com/wwalser/hapi-promise-plugin-problem/blob/withoutPromise/server.js
//Troublesome plugin
const troublesomePlugin = (server, options, next) => {
server.ext('onPreHandler', (request, reply) => {
try {
//Do something that benefits from a try catch. JSON parsing for example.
reply.continue();
} catch (err) {
//Again, manually handling problem doesn't help.
reply(Boom.badImplementation("Explicit handling doesn't fix the issue.", err));
}
});
next();
};
troublesomePlugin.attributes = {
name: 'trouble'
};
server.register([
troublesomePlugin,
], function() {
server.route({
method: 'GET',
path: '/trouble',
handler: (request, reply) => {
//throw exception which should return internal server error.
const thing = foo.bar.baz.qux;
reply("This route works fine.");
}
});
server.start((err)=>{});
});
Which is also fixed by my PR, or at least should be.
Yes. Thank you for the fix. I just thought the try example may be of interest to the thread.
Appreciate all you do :).
The only difference is that while with a regular try/catch you can move the reply.continue() out of the try block, it's impossible to do so with a promise. Once the async chain has started, the continue needs to happen inside of a then callback, which then automatically takes over the error handling for any future errors that are thrown. So really there is no "doing a good job at using promises" here. It's an issue about making a callback based extensions API look async while calling the handlers in a synchronous fashion. But that is a common developers' mistake.
I do consider that it's your responsibility as a developer to know when to break the promise chain by doing what I did in the PR, but nobody does that with promises, it is admittedly ugly to write every time.
I am not planning on dealing with domain and promises issues. If others want to spend the time and come up with PRs to address specific test cases, I'll review.
Most helpful comment
What worked for me was doing
process.nextTick(() => cb(null, true, user)).That way I'd at least get unhandled exception errors if something goes wrong.