Kibana version: 7.8.1
Elasticsearch version: 7.8.1
Server OS version: Ubuntu 16.04.6 LTS
Browser version: Firefox 79.0b9
Browser OS version: macOS Catalina 10.15.6
Original install method (e.g. download page, yum, from source, etc.): apt
Describe the bug: Some responses from /internal/search/es come back with no status code, headers, or body, presumably from cancelled queries.
Steps to reproduce:
Expected behavior:
The network log should contain proper queries to the ES API with corresponding responses. Cancelled queries should return a proper HTTP status code (204 seems appropriate).
Screenshots (if relevant):
Errors in browser console (if relevant):
Failed to load resource: the server responded with a status of 502 (Proxy Error)
- this only occurs when Kibana is reverse proxied.
Any additional context:
This is primarily an issue when reverse proxying Kibana. Apache complains when it receives an upstream response that lacks a valid header or status line. A temporary remediation with Apache is setting ProxyBadHeader Ignore in the server config. Obviously this is not the ideal solution, since cancelled in-flight requests should return a valid HTTP status code even when responding with no content.
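For reference, a minimal Apache vhost snippet with this workaround might look like the following; only the ProxyBadHeader directive comes from this report, and the host, port, and paths are hypothetical:

# Hypothetical reverse-proxy config for Kibana (mod_proxy / mod_proxy_http)
<VirtualHost *:443>
    ServerName kibana.example.com
    ProxyPass        / http://localhost:5601/
    ProxyPassReverse / http://localhost:5601/
    # Workaround: don't turn malformed or empty upstream headers into 502s
    ProxyBadHeader Ignore
</VirtualHost>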
Edit: The ProxyBadHeader fix doesn't fully resolve the problem. I'm not 100% sure why, but I speculate that the upstream server is both sending empty responses and terminating requests early.
Pinging @elastic/kibana-security (Team:Security)
Pinging @elastic/kibana-app-arch (Team:AppArch)
We see a similar problem. I tried the ProxyBadHeader Ignore suggestion, but it didn't help. I also tried adding the parameters retry=1 acquire=3000 timeout=600 keepalive=On to Apache's ProxyPass line.
It happens, for example, every few clicks on the "view single document" button on a message. Sometimes it even says that there is no document with that ID; a refresh then works. Sometimes we also get the error message "proxy error" with the following log output:
_construct@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:466674
Wrapper@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:466064
_createSuper/<@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:464852
HttpFetchError@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:467831
_callee3$@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1292397
l@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:288:969217
s/o._invoke</<@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:288:968971
_/</e[t]@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:288:969574
asyncGeneratorStep@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1285920
_next@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1286249
promise callback*asyncGeneratorStep@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1286038
_next@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1286249
promise callback*asyncGeneratorStep@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1286038
_next@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1286249
_asyncToGenerator/</<@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1286386
_asyncToGenerator/<@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1286143
fetchResponse@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1292730
_callee$@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1288895
l@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:288:969217
s/o._invoke</<@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:288:968971
_/</e[t]@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:288:969574
asyncGeneratorStep@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1285920
_next@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1286249
promise callback*asyncGeneratorStep@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1286038
_next@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1286249
_asyncToGenerator/</<@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1286386
_asyncToGenerator/<@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1286143
_callee2$/</<@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1289593
_callee2$@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1288382
l@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:288:969217
s/o._invoke</<@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:288:968971
_/</e[t]@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:288:969574
asyncGeneratorStep@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1285920
_next@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1286249
_asyncToGenerator/</<@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1286386
_asyncToGenerator/<@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1286143
Fetch/</<@https://test-kibana.REMOVED/32141/bundles/commons.bundle.js:3:1289716
search@https://test-kibana.REMOVED/32141/bundles/plugin/data/data.plugin.js:6:47756
search/</<@https://test-kibana.REMOVED/32141/bundles/plugin/dataEnhanced/dataEnhanced.plugin.js:1:145473
__kbnSharedDeps__</d</t.prototype._tryNext@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:271:391171
__kbnSharedDeps__</d</t.prototype._next@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:271:391073
__kbnSharedDeps__</u</t.prototype.next@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:21:41156
l@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:271:1151083
__kbnSharedDeps__</o</t.prototype._execute@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:271:409061
__kbnSharedDeps__</o</t.prototype.execute@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:271:408877
__kbnSharedDeps__</o</t.prototype.flush@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:271:405111
setInterval handler*__kbnSharedDeps__</o</t.prototype.requestAsyncId@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:271:408580
__kbnSharedDeps__</o</t.prototype.schedule@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:271:408478
__kbnSharedDeps__</r</e.prototype.schedule@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:271:920108
__kbnSharedDeps__</o</t.prototype.schedule@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:271:404983
s/<@https://test-kibana.REMOVED/32141/bundles/kbn-ui-shared-deps/kbn-ui-shared-deps.js:271:1150975
The Apache access and error logs look like:
# access_log
10.115.50.90 - my.user [18/Aug/2020:10:21:31 +0200] "POST /api/index_management/indices/reload HTTP/1.1" 502 341 "https://test-kibana.REMOVED/app/kibana" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:79.0) Gecko/20100101 Firefox/79.0" 689 536
# error_log
[Tue Aug 18 10:21:31.566807 2020] [proxy_http:error] [pid 26954:tid 139893653104384] (104)Connection reset by peer: [client 10.115.50.90:0] AH01102: error reading status line from remote server REPLACED.NAME:5601, referer https://test-kibana.REMOVED/app/kibana
[Tue Aug 18 10:21:31.566936 2020] [proxy:error] [pid 26954:tid 139893653104384] [client 10.115.50.90:0] AH00898: Error reading from remote server returned by /api/index_management/indices/reload, referer https://test-kibana.REMOVED/app/kibana
It also happens when connecting directly to Kibana, without the in-between Apache reverse proxy.
It happens with the latest Firefox 79.0 and Chrome 84.0.4147.125 on Windows 10.
With ELK 6.8 we did not have these problems.
@BBQigniter - I have been seeing the errors even after adding the ProxyBadHeader setting. I think there are two things happening here: some responses are blank with the missing header (ProxyBadHeader should resolve those), and some responses are being terminated prematurely by Kibana (connection reset by peer), which I think accounts for the remaining errors.
The errors are less common after setting ProxyBadHeader, but that could also be a placebo. This is a pretty glaring issue, since errors popping up while using Kibana are not at all ideal.
Yes, those errors are super annoying; thankfully we currently only see them on our test cluster, where I'm testing the upgrade procedure.
Just upgraded the whole cluster to version 7.9 - same errors :(
@joshporter1 In terms of the search feature, unnecessary requests (for example, when running a new search before the previous one has finished) get canceled from the browser using an AbortController.
It is typical for a request canceled in flight to show an empty response body and no status code (status code 0).
Here is a command-line example of fetching and cancelling a request to some random online API in Chrome. You can see that the body is empty and the request is marked as canceled:
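A minimal sketch of the same cancellation pattern, runnable in a browser console (the URL is a hypothetical stand-in for a slow API):

// Sketch: cancel an in-flight fetch with AbortController.
const controller = new AbortController();

fetch('https://example.com/api/slow', { signal: controller.signal })
  .then((res) => console.log('status:', res.status))
  .catch((err) => {
    // An aborted request rejects with an AbortError; in the network
    // log it appears with an empty body and status code 0.
    console.log(err.name); // "AbortError"
  });

// Abort while the request is still in flight.
controller.abort();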
I was surprised, however, to find that Firefox doesn't mark the request as canceled, but it's still the same scenario:
So anyway, the empty responses are part of a normal workflow. I wonder, however, how we can work around this in a reverse-proxy scenario.
@lizozom I kind of figured that's what was happening; the fact that Firefox wasn't showing it as such is a tad confusing. However, Kibana still shows errors in other browsers such as Chrome and Brave, since the reverse proxy server returns 502s on some of the aborted requests.
I'm hoping to find a good solution at the reverse-proxy layer, but I've combed through the docs a few times and haven't found anything useful.
Can you share some of the resources you've been looking into? I'm curious to try and do this setup, to see if I can reproduce the issue.
Sure thing, I've mostly been digging through the mod_proxy docs for the options available for the Apache reverse proxy. The ProxyBadHeader option is the only one I found that had any impact: setting it to Ignore resolved the invalid-header error message, but we are still left with the connection-reset-by-peer message.
Like I wrote in my first comment, I saw "proxy errors" and/or "no document with that id" even while pointing my browser directly at Kibana, without going through the reverse proxy. It happens only occasionally and is hard to reproduce.
The next thing I sometimes experience is that clicking on links etc. sometimes does not work correctly, as in this GIF.
This also happens with and without the reverse proxy. When it happens, I have to refresh the browser tab in Chrome or Firefox. Kibana 7.9.
@BBQigniter can you please explain what the gif should be showing?
@lizozom Sure, sorry if it's not clear.
I now found this wonderful blog post: https://newweaver.com/the-mystery-of-the-502-kibana-errors/
This guy describes EXACTLY the same problem! I patched function createServer in /usr/share/kibana/src/core/server/http/http_tools.js (starting at line 109) with the proposed fix:
function createServer(serverOptions, listenerOptions) {
  const server = new _hapi.Server(serverOptions);
  server.listener.keepAliveTimeout = listenerOptions.keepaliveTimeout;
  /**
   * fix proposed from https://newweaver.com/the-mystery-of-the-502-kibana-errors/
   */
  server.listener.headersTimeout = listenerOptions.keepaliveTimeout + 1000;
  server.listener.setTimeout(listenerOptions.socketTimeout);
  server.listener.on('timeout', socket => {
    socket.destroy();
  });
  server.listener.on('clientError', (err, socket) => {
    if (socket.writable) {
      socket.end(Buffer.from('HTTP/1.1 400 Bad Request\r\n\r\n', 'ascii'));
    } else {
      socket.destroy(err);
    }
  });
  return server;
}
Then I restarted Kibana. The "fix" has been in place for a few minutes now and I haven't seen any proxy errors. Hopefully that's it.
@BBQigniter thanks for that link - please keep me updated on if it does the trick!
@joshporter1 It seems the fix really helps; no "proxy error" issues yet while clicking around. Only issues like the ones described in the GIF and my later comment still happen sometimes.
In the linked blog post, the author also mentions https://github.com/elastic/kibana/issues/44062, so if Elastic updates the underlying Node.js to >v12.17.0, it should be fixed without the need to add the line to the function.
We constantly update Node.js versions, so I think it's safe to assume this will be resolved soon :pray:
@joshporter1 any updates on this? Can this be closed?
@lizozom I applied the workaround "fix" but have not updated or removed the code to verify whether it was solved in a later version. So as far as I can tell, it shouldn't be closed unless it was truly resolved without requiring users to alter the core code.
If the underlying Node.js version has been upgraded, then the issue should probably be re-evaluated.
As far as I can tell, this bug is caused by https://github.com/nodejs/node/issues/27363. Once Kibana is upgraded to Node > 12.9.0, this will be resolved. There are also potential workarounds, discussed below.
Node.js has three timeouts that apply to HTTP requests: the socket timeout (server.setTimeout), server.keepAliveTimeout, and server.headersTimeout.
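As a reference point, here is a minimal sketch of where these three timeouts live on a plain http.Server; the values are illustrative, not Kibana's defaults:

// Sketch: the three http.Server timeouts relevant to this bug.
const http = require('http');

const server = http.createServer((req, res) => {
  res.end('ok');
});

server.setTimeout(120000);         // socket timeout: destroys sockets idle this long
server.keepAliveTimeout = 120000;  // how long an idle keepalive socket stays open
server.headersTimeout = 40000;     // max time to receive a request's complete headers

server.listen(5601);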
There's a bug in Node.js < 12.9.0 where, when multiple HTTP requests occur over the same TCP socket with keepalives in use, the headers timeout is not reset correctly. The headers timeout is supposed to limit the time between the start of each individual request and the point at which that request's headers have been completely received. With the bug, the timer is effectively measured from the first request on the socket, so once it elapses, subsequent keepalive requests on the same socket are reset while their headers are still being sent.
I've been able to replicate this behavior against a test Node.js HTTP server, and against Kibana itself using the default timeouts and the basic test client included below. This effectively means that Kibana is currently pretty much unusable behind a reverse proxy that reuses TCP sockets.
// Test client: holds a keepalive socket open and writes request headers
// slowly, to demonstrate the stale headers timeout on a reused socket.
const http = require('http');

const agent = new http.Agent({
  keepAlive: true,
  keepAliveMsecs: 30000,
});

let n = 0;

const request = () => {
  const req = http.request({
    protocol: 'http:',
    host: 'localhost',
    port: 5601,
    path: '/',
    method: 'GET',
    agent,
  }, function (res) {
    let data = '';
    res.on('data', (chunk) => {
      data += chunk;
    });
    res.on('end', () => {
      console.log(`Done: [${res.statusCode}] - ${data}`);
      // After the first response, send a second request over the same
      // keepalive socket just before the keepalive timeout would expire.
      if (++n == 1) {
        setTimeout(request, 58000);
      }
    });
  });

  req.on('socket', (socket) => {
    // Write the request line and headers slowly, one second apart, so the
    // request spends several seconds in the headers-parsing phase.
    socket.write('GET / HTTP/1.1\r\n');
    setTimeout(() => {
      socket.write(`user-agent: ${Math.random() * 1000}\r\n`);
    }, 1000);
    setTimeout(() => {
      socket.write(`user-agent: ${Math.random() * 1000}\r\n`);
    }, 2000);
    setTimeout(() => {
      socket.write('\r\n');
      req.end();
    }, 3000);
  });

  req.on('error', (error) => {
    console.error('error', error);
  });
};

request();
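When run against an affected server, the first request should complete normally, while the second one (sent 58 seconds later over the reused socket, with its headers deliberately trickled out over about 3 seconds) should fail with a connection reset, because the stale headers timeout fires while the headers are still being written.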
There are a few potential work-arounds. I don't really like any of these, but I'm interested in hearing your opinions @restrry.

1. Allow the end-user to configure the headers timeout. This would allow the end-user to work around this bug when Kibana is behind a reverse proxy by setting it to a value higher than the server.keepaliveTimeout. This is the work-around suggested in this issue and in the blog post. It leads to nondeterministic behavior, where the first HTTP request over a TCP socket has a longer period of time to send its headers than subsequent HTTP requests. It is also rather awkward because, once we upgrade to Node 12.9.0, headers timeouts will be applied consistently and appropriately everywhere, potentially causing the value that users set in kibana.yml to need adjustment.

2. Automatically configure the headers timeout to be larger than the keepalive timeout. This would still have the nondeterministic behavior described above, but we at least don't end up violating end-users' expectations when we do get to upgrade to Node 12.9.0. It will potentially lead to the headers timeout being set to a much larger value than it would be by default, minimizing the protection that it provides against the slowloris attack.

3. Disable the headers timeout. It's possible to disable the headers timeout by setting it to 0. This removes the protection against the slowloris attack entirely. Removing this protection for users who aren't using TCP keepalives and reusing sockets is a potential change in expectations if they were relying on Kibana itself to provide it.
I read through the nodejs thread https://github.com/nodejs/node/issues/27363, and it seems that:

1. Disable the headers timeout is not an option due to security reasons.
2. Allow the end-user to configure the headers timeout requires the end-users to identify the problem and find a recommended solution in the Kibana docs. Furthermore, we are going to update the nodejs version to the fixed v12.19 in Kibana v7.11, but we would have to maintain the headersTimeout setting till v8.0 for BWC reasons.
3. Automatically configure the headers timeout to be larger than the keepalive timeout seems to be the optimal solution: Kibana works out of the box, and we can remove the internal workaround as soon as we update the nodejs version in an upcoming release.

Regarding "This will potentially lead to the headers timeout being set to a much larger value than it would be by default, minimizing the protection that it provides against the slowloris attack": per the official docs, headersTimeout is 40 sec (increased to 60 sec in v12), while the keepAlive timeout for Kibana is 120 sec. The existing difference is not on the order of magnitude that would let us say we disable the protection against the slowloris attack completely.

@kobelb WDYT about this?
Something along the lines of:

const headersTimeout = listenerOptions.keepaliveTimeout + 2 * server.listener.headersTimeout;
server.listener.headersTimeout = headersTimeout;
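The idea is to derive the headers timeout from the keepalive timeout, so it always fires later than the keepalive timeout would, while staying within the same order of magnitude as Node's default and thus preserving meaningful slowloris protection.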
@restrry I think your suggestion is the "least bad" option that we have, unfortunately... I'll take it :)
@lizozom Could you confirm this problem has been fixed by https://github.com/elastic/kibana/pull/61587?
IMO yes, but I'd let @kobelb confirm, as he did most of the research.
Everything is looking good from my perspective. I just tested out master and I'm seeing these timeouts working properly with TCP keepalives.