Kibana version: 7.6.0
Elasticsearch version: 7.6.0
Original install method: ECK _(in dev feature, not yet released)_
Describe the bug:
While configuring the Kibana endpoint in APM Server, I ran into an issue that led me to discover that if the .apm-agent-configuration index can't be created during startup, Kibana does not retry:
{"type":"log","@timestamp":"2020-03-05T09:19:28Z","tags":["debug","plugins-system"],"pid":6,"message":"Setting up plugin \"apm\"..."}
{"type":"log","@timestamp":"2020-03-05T09:19:28Z","tags":["debug","plugins","apm"],"pid":6,"message":"Initializing plugin"}
{"type":"log","@timestamp":"2020-03-05T09:19:28Z","tags":["info","plugins","apm"],"pid":6,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-03-05T09:19:28Z","tags":["debug","config"],"pid":6,"message":"Marking config path as handled: xpack,apm"}
{"type":"log","@timestamp":"2020-03-05T09:19:29Z","tags":["error","elasticsearch","data"],"pid":6,"message":"Request error, retrying\nGET https://elasticsearch-sample-es-http.default.svc:9200/_xpack => connect ECONNREFUSED 10.28.60.163:9200"}
{"type":"log","@timestamp":"2020-03-05T09:19:29Z","tags":["error","elasticsearch","admin"],"pid":6,"message":"Request error, retrying\nGET https://elasticsearch-sample-es-http.default.svc:9200/_nodes?filter_path=nodes.*.version%2Cnodes.*.http.publish_address%2Cnodes.*.ip => connect ECONNREFUSED 10.28.60.163:9200"}
{"type":"log","@timestamp":"2020-03-05T09:19:29Z","tags":["error","elasticsearch","data"],"pid":6,"message":"Request error, retrying\nHEAD https://elasticsearch-sample-es-http.default.svc:9200/.apm-agent-configuration => connect ECONNREFUSED 10.28.60.163:9200"}
{"type":"log","@timestamp":"2020-03-05T09:19:30Z","tags":["warning","elasticsearch","data"],"pid":6,"message":"Unable to revive connection: https://elasticsearch-sample-es-http.default.svc:9200/"}
{"type":"log","@timestamp":"2020-03-05T09:19:30Z","tags":["warning","elasticsearch","data"],"pid":6,"message":"No living connections"}
Could not create APM Agent configuration: No Living connections
{"type":"log","@timestamp":"2020-03-05T09:19:30Z","tags":["debug","plugins-system"],"pid":6,"message":"Setting up plugin \"graph\"..."}
It is a little bit confusing from a user's point of view:
.kibana_task_manager_n, .kibana_n)
2020-03-05T13:18:48.814Z INFO kibana/client.go:117 Kibana url: https://kibana-sample-kb-http.default.svc:5601
2020-03-05T13:18:48.955Z INFO [kibana] kibana/connecting_client.go:80 Successfully obtained connection to Kibana.
...
2020-03-05T13:20:34.107Z ERROR [request] middleware/log_middleware.go:95 service unavailable {"request_id": "d4caaa64-826d-4315-be39-bca039cbb1a7", "method": "GET", "URL": "/config/v1/agents?service.name=___go_build_main_go__GKE___TMP_", "content_length": 0, "remote_address": "10.28.65.1", "user-agent": "elasticapm-go/1.7.0 go/go1.13", "response_code": 503, "error": "{\"statusCode\":500,\"error\":\"Internal Server Error\",\"message\":\"An internal server error occurred.\"}"}
2020-03-05T13:20:35.139Z ERROR [request] middleware/log_middleware.go:95 service unavailable {"request_id": "fd556306-b2cc-4adb-a2b4-c1971305e88e", "method": "GET", "URL": "/config/v1/agents?service.name=elastic-operator", "content_length": 0, "remote_address": "10.28.65.1", "user-agent": "elasticapm-go/1.7.0 go/go1.13", "response_code": 503, "error": "{\"statusCode\":500,\"error\":\"Internal Server Error\",\"message\":\"An internal server error occurred.\"}"}
while the root issue is in Kibana:
{"type":"log","@timestamp":"2020-03-05T13:20:34Z","tags":["error","http"],"pid":6,"message":"{ Error: [index_not_found_exception] no such index [.apm-agent-configuration], with { resource.type=\"index_or_alias\" & resource.id=\".apm-agent-configuration\" & index_uuid=\"_na_\" & index=\".apm-agent-configuration\" }\n at respond (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:349:15)\n at checkRespForFailure (/usr/share/kibana/node_modules/elasticsearch/src/lib/transport.js:306:7)\n at HttpConnector.<anonymous> (/usr/share/kibana/node_modules/elasticsearch/src/lib/connectors/http.js:173:7)\n at IncomingMessage.wrapper (/usr/share/kibana/node_modules/elasticsearch/node_modules/lodash/lodash.js:4929:19)\n at IncomingMessage.emit (events.js:203:15)\n at endReadableNT (_stream_readable.js:1143:12)\n at process._tickCallback (internal/process/next_tick.js:63:19)\n status: 404,\n displayName: 'NotFound',\n message:\n '[index_not_found_exception] no such index [.apm-agent-configuration], with { resource.type=\"index_or_alias\" & resource.id=\".apm-agent-configuration\" & index_uuid=\"_na_\" & index=\".apm-agent-configuration\" }',\n path: '/.apm-agent-configuration/_search',\n query: { ignore_throttled: true },\n body:\n { error:\n { root_cause: [Array],\n type: 'index_not_found_exception',\n reason: 'no such index [.apm-agent-configuration]',\n 'resource.type': 'index_or_alias',\n 'resource.id': '.apm-agent-configuration',\n index_uuid: '_na_',\n index: '.apm-agent-configuration' },\n status: 404 },\n statusCode: 404,\n response:\n '{\"error\":{\"root_cause\":[{\"type\":\"index_not_found_exception\",\"reason\":\"no such index [.apm-agent-configuration]\",\"resource.type\":\"index_or_alias\",\"resource.id\":\".apm-agent-configuration\",\"index_uuid\":\"_na_\",\"index\":\".apm-agent-configuration\"}],\"type\":\"index_not_found_exception\",\"reason\":\"no such index 
[.apm-agent-configuration]\",\"resource.type\":\"index_or_alias\",\"resource.id\":\".apm-agent-configuration\",\"index_uuid\":\"_na_\",\"index\":\".apm-agent-configuration\"},\"status\":404}',\n toString: [Function],\n toJSON: [Function] }"}
In ECK, when a user creates an Elasticsearch cluster with a Kibana instance, the Kibana pod can be ready before the Elasticsearch cluster.
When this happens, a workaround is to restart the Kibana Pod once ES is ready, but I'm not sure we can easily detect this situation in ECK.
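One way the situation might be detected from the outside is to probe for the index once Elasticsearch is reachable, using the same `HEAD /.apm-agent-configuration` request that appears in the Kibana logs above. The following is only a sketch: `checkAgentConfigIndex`, `IndexState`, and the injected `head` function are hypothetical names, and the caller is assumed to supply the HTTP client and credentials.

```typescript
// Hypothetical sketch, not part of ECK or Kibana.
// "unknown" covers connection errors and any status other than 200/404
// (for example 401 while credentials are not yet propagated).
type IndexState = "present" | "missing" | "unknown";

// `head` is an assumed, caller-provided function that issues an HTTP HEAD
// request against Elasticsearch and resolves with the response status code.
async function checkAgentConfigIndex(
  head: (path: string) => Promise<number>
): Promise<IndexState> {
  try {
    const status = await head("/.apm-agent-configuration");
    if (status === 200) return "present"; // index exists
    if (status === 404) return "missing"; // ES answered, index was never created
    return "unknown";
  } catch {
    return "unknown"; // ES not reachable yet
  }
}
```

A result of `"missing"` after Elasticsearch has become ready would correspond to the case where restarting the Kibana Pod is needed.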
Related to the integration of APM Agent Config Management in ECK
Also kind of related to elastic/kibana#57931
Expected behavior:
Retry or postpone the creation of the .apm-agent-configuration index
Any additional context:
Issue detected in the context of https://github.com/elastic/cloud-on-k8s/issues/1264
Pinging @elastic/apm-ui (Team:apm)
Thanks for making us aware of this, @barkbay. I recently shipped a quick fix for this, but it doesn't sound like it's sufficient.
I will look into this and make sure that creating the index will be retried for a sufficient amount of time.
Hi @barkbay, unfortunately, I couldn't reproduce the problem.

In the image above you can see that between 14:47:52 and 14:48:07 Kibana is waiting for an ES instance before continuing, and as soon as one is available, it restarts. Only then is the APM plugin initialized and .apm-agent-configuration created.
These were the steps I followed to reproduce.
1) Stop ES and Kibana.
2) Start Kibana (no living connections message).
3) Start ES.
4) Kibana restarts as soon as an ES instance is found.
If you can give more details on how to reproduce it I'll be happy to help.
Thanks for investigating this one.
Here are the full logs.
In my case Elasticsearch is created at the same time as Kibana; I think that's the reason why we see only one connection failure.
Also, I'm not sure what apm_oss is; in my case the plugin whose setup is failing is apm:
{"type":"log","@timestamp":"2020-03-05T09:19:28Z","tags":["info","plugins","apm"],"pid":6,"message":"Setting up plugin"}
I think I got the same issue while the Elasticsearch credentials were not yet propagated.
Here is what could happen with ECK:
.apm-agent-configuration index has already failed.
{"type":"log","@timestamp":"2020-03-13T07:33:28Z","tags":["info","plugins","apm"],"pid":6,"message":"Setting up plugin"}
Could not create APM Agent configuration: Authentication Exception
{"type":"log","@timestamp":"2020-03-13T07:33:28Z","tags":["info","plugins","graph"],"pid":6,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-03-13T07:33:28Z","tags":["info","plugins","bfetch"],"pid":6,"message":"Setting up plugin"}
{"type":"log","@timestamp":"2020-03-13T07:33:28Z","tags":["info","savedobjects-service"],"pid":6,"message":"Waiting until all Elasticsearch nodes are compatible with Kibana before starting saved objects migrations..."}
{"type":"log","@timestamp":"2020-03-13T07:33:28Z","tags":["error","elasticsearch-service"],"pid":6,"message":"Unable to retrieve version information from Elasticsearch nodes."}
{"type":"log","@timestamp":"2020-03-13T07:33:28Z","tags":["warning","plugins","licensing"],"pid":6,"message":"License information could not be obtained from Elasticsearch due to [security_exception] unable to authenticate user [ns1-kb-2-kibana-user] for REST request [/_xpack], with { header={ WWW-Authenticate={ 0=\"Bearer realm=\\\"security\\\"\" & 1=\"ApiKey\" & 2=\"Basic realm=\\\"security\\\" charset=\\\"UTF-8\\\"\" } } } :: {\"path\":\"/_xpack\",\"statusCode\":401,\"response\":\"{\\\"error\\\":{\\\"root_cause\\\":[{\\\"type\\\":\\\"security_exception\\\",\\\"reason\\\":\\\"unable to authenticate user [ns1-kb-2-kibana-user] for REST request [/_xpack]\\\",\\\"header\\\":{\\\"WWW-Authenticate\\\":[\\\"Bearer realm=\\\\\\\"security\\\\\\\"\\\",\\\"ApiKey\\\",\\\"Basic realm=\\\\\\\"security\\\\\\\" charset=\\\\\\\"UTF-8\\\\\\\"\\\"]}}],\\\"type\\\":\\\"security_exception\\\",\\\"reason\\\":\\\"unable to authenticate user [ns1-kb-2-kibana-user] for REST request [/_xpack]\\\",\\\"header\\\":{\\\"WWW-Authenticate\\\":[\\\"Bearer realm=\\\\\\\"security\\\\\\\"\\\",\\\"ApiKey\\\",\\\"Basic realm=\\\\\\\"security\\\\\\\" charset=\\\\\\\"UTF-8\\\\\\\"\\\"]}},\\\"status\\\":401}\",\"wwwAuthenticateDirective\":\"Bearer realm=\\\"security\\\", ApiKey, Basic realm=\\\"security\\\" charset=\\\"UTF-8\\\"\"} error"}
{"type":"log","@timestamp":"2020-03-13T07:33:58Z","tags":["warning","plugins","licensing"],"pid":6,"message":"License information could not be obtained from Elasticsearch due to [security_exception] unable to authenticate user [ns1-kb-2-kibana-user] for REST request [/_xpack], with { header={ WWW-Authenticate={ 0=\"Bearer realm=\\\"security\\\"\" & 1=\"ApiKey\" & 2=\"Basic realm=\\\"security\\\" charset=\\\"UTF-8\\\"\" } } } :: {\"path\":\"/_xpack\",\"statusCode\":401,\"response\":\"{\\\"error\\\":{\\\"root_cause\\\":[{\\\"type\\\":\\\"security_exception\\\",\\\"reason\\\":\\\"unable to authenticate user [ns1-kb-2-kibana-user] for REST request [/_xpack]\\\",\\\"header\\\":{\\\"WWW-Authenticate\\\":[\\\"Bearer realm=\\\\\\\"security\\\\\\\"\\\",\\\"ApiKey\\\",\\\"Basic realm=\\\\\\\"security\\\\\\\" charset=\\\\\\\"UTF-8\\\\\\\"\\\"]}}],\\\"type\\\":\\\"security_exception\\\",\\\"reason\\\":\\\"unable to authenticate user [ns1-kb-2-kibana-user] for REST request [/_xpack]\\\",\\\"header\\\":{\\\"WWW-Authenticate\\\":[\\\"Bearer realm=\\\\\\\"security\\\\\\\"\\\",\\\"ApiKey\\\",\\\"Basic realm=\\\\\\\"security\\\\\\\" charset=\\\\\\\"UTF-8\\\\\\\"\\\"]}},\\\"status\\\":401}\",\"wwwAuthenticateDirective\":\"Bearer realm=\\\"security\\\", ApiKey, Basic realm=\\\"security\\\" charset=\\\"UTF-8\\\"\"} error"}
....
{"type":"log","@timestamp":"2020-03-13T07:34:23Z","tags":["info","savedobjects-service"],"pid":6,"message":"Starting saved objects migrations"}
{"type":"log","@timestamp":"2020-03-13T07:34:24Z","tags":["info","savedobjects-service"],"pid":6,"message":"Creating index .kibana_1."}
{"type":"log","@timestamp":"2020-03-13T07:34:24Z","tags":["info","savedobjects-service"],"pid":6,"message":"Creating index .kibana_task_manager_1."}
{"type":"log","@timestamp":"2020-03-13T07:34:24Z","tags":["info","savedobjects-service"],"pid":6,"message":"Pointing alias .kibana to .kibana_1."}
{"type":"log","@timestamp":"2020-03-13T07:34:24Z","tags":["info","savedobjects-service"],"pid":6,"message":"Pointing alias .kibana_task_manager to .kibana_task_manager_1."}
{"type":"log","@timestamp":"2020-03-13T07:34:24Z","tags":["info","savedobjects-service"],"pid":6,"message":"Finished in 526ms."}
{"type":"log","@timestamp":"2020-03-13T07:34:24Z","tags":["info","savedobjects-service"],"pid":6,"message":"Finished in 536ms."}
Here are the indices the first time Kibana is started:
green open .security-7 -6KS4XYSSR6qtGrTblEPXg 1 0 36 0 109kb 109kb
green open .kibana_task_manager_1 j9qagD5eS7CI80GIeZjMng 1 0 2 0 33.9kb 33.9kb
green open .kibana_1 7W8o1WUdRh-CmyRUfpYJMg 1 0 3 0 15.3kb 15.3kb
I have to delete and restart the Kibana Pod to get the .apm-agent-configuration index created:
green open .security-7 -6KS4XYSSR6qtGrTblEPXg 1 0 36 0 109kb 109kb
green open .kibana_task_manager_1 j9qagD5eS7CI80GIeZjMng 1 0 2 0 33.9kb 33.9kb
green open .apm-agent-configuration Hg4_L-VOTJWmws5WKucCmw 1 0 0 0 230b 230b
green open .kibana_1 7W8o1WUdRh-CmyRUfpYJMg 1 0 3 1 29.7kb 29.7kb
Maybe this scenario is easier to reproduce.
Thanks!
Thanks a lot for your information @barkbay, I'll try to reproduce it and let you know.
@barkbay have you tried with v7.6.1? As Soren mentioned here, he fixed an issue related to the creation of the .apm-agent-configuration index.
Same issue with 7.6.1:
{"type":"log","@timestamp":"2020-03-26T10:17:09Z","tags":["error","elasticsearch","admin"],"pid":6,"message":"Request error, retrying\nGET https://es-apm-sample-es-http.default.svc:9200/_nodes?filter_path=nodes.*.version%2Cnodes.*.http.publish_address%2Cnodes.*.ip => connect ECONNREFUSED 10.28.20.187:9200"}
{"type":"log","@timestamp":"2020-03-26T10:17:09Z","tags":["error","elasticsearch","data"],"pid":6,"message":"Request error, retrying\nHEAD https://es-apm-sample-es-http.default.svc:9200/.apm-agent-configuration => connect ECONNREFUSED 10.28.20.187:9200"}
{"type":"log","@timestamp":"2020-03-26T10:17:10Z","tags":["warning","elasticsearch","data"],"pid":6,"message":"Unable to revive connection: https://es-apm-sample-es-http.default.svc:9200/"}
{"type":"log","@timestamp":"2020-03-26T10:17:10Z","tags":["warning","elasticsearch","data"],"pid":6,"message":"No living connections"}
Could not create APM Agent configuration: No Living connections
When ES is finally available:
> GET /_cat/indices
green open .security-7 43QQpDFSSVmYoHpZ8zCghQ 1 1 36 0 236.2kb 118.1kb
green open .kibana_task_manager_1 XhhZoeaHQ4aF_LttHN_J1Q 1 1 2 6 67.2kb 33.5kb
green open .kibana_1 5PF4gUbNTDe4KZbltJLUrA 1 1 3 0 30.2kb 15.1kb
@cauemarcondes You might need to run kibana in production mode (I think in dev mode it restarts when connecting to ES which seemingly solves the problem).
You can run kibana in production mode like:
node ./scripts/kibana --no-base-path --elasticsearch.username=kibana_system_user --elasticsearch.password=changeme
and then after a while start elasticsearch via apm-it:
./scripts/compose.py start master --no-kibana --no-apm-server
I think the solution is to use something like p-retry to retry the index creation operation.
This should be done in https://github.com/elastic/kibana/blob/master/x-pack/plugins/apm/server/lib/helpers/create_or_update_index.ts. Either wrap createOrUpdateIndex, createNewIndex/updateExistingIndex or callAsInternalUser in the retry logic.
This should go into 7.7 as a bug fix.
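The retry idea above could be sketched roughly as follows. This is not the actual Kibana change: a small hand-rolled helper stands in for p-retry so the example has no dependencies, and `withRetry`, `createIndexWithRetry`, `RetryOptions`, and the default option values are all hypothetical. The `createIndex` callback is assumed to be whatever function performs the index creation (for example, a wrapper around `createOrUpdateIndex`).

```typescript
// Hypothetical sketch of retrying index creation with exponential backoff.
type Logger = { error: (msg: string) => void };

interface RetryOptions {
  retries: number; // retries allowed after the first attempt
  minDelayMs: number; // delay before the first retry
  factor: number; // multiplier applied to the delay after each failure
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `operation` until it succeeds or the retry budget is exhausted,
// then rethrows the last error.
async function withRetry<T>(
  operation: () => Promise<T>,
  { retries, minDelayMs, factor }: RetryOptions
): Promise<T> {
  let delay = minDelayMs;
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (attempt >= retries) throw err;
      await sleep(delay);
      delay *= factor;
    }
  }
}

// Resolves to true if the index was created, false if every attempt failed.
// Default options are arbitrary values chosen for illustration.
async function createIndexWithRetry(
  createIndex: () => Promise<void>,
  logger: Logger,
  options: RetryOptions = { retries: 5, minDelayMs: 1000, factor: 2 }
): Promise<boolean> {
  try {
    await withRetry(createIndex, options);
    return true;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    logger.error(`Could not create APM Agent configuration: ${message}`);
    return false;
  }
}
```

Because the helper does not block plugin setup on success, a caller would likely start it without awaiting it during startup, so that Kibana keeps booting while Elasticsearch becomes available.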
@barkbay I've just backported the fix to 7.7 if you want to try it out.
Thanks a lot! I will give it a try.
It will try to create the index up to 10 times; if all attempts fail, it gives up and logs an error.
I am getting this issue with version 7.6.2:
{"type":"log","@timestamp":"2020-05-06T04:50:46Z","tags":["info","savedobjects-service"],"pid":6,"message":"Waiting until all Elasticsearch nodes are compatible with Kibana before starting saved objects migrations..."}
Could not create APM Agent configuration: Authentication Exception
{"type":"log","@timestamp":"2020-05-06T04:50:46Z","tags":["error","savedobjects-service"],"pid":6,"message":"Unable to retrieve version information from Elasticsearch nodes."}
{"type":"log","@timestamp":"2020-05-06T04:50:46Z","tags":["warning","plugins","licensing"],"pid":6,"message":"License information could not be obtained from Elasticsearch due to [security_exception] unable to authenticate user [elk-elk-kibana-kibana-user] for REST request [/_xpack], with { header={ WWW-Authenticate={ 0=\"Bearer realm=\\\"security\\\"\" & 1=\"ApiKey\" & 2=\"Basic realm=\\\"security\\\" charset=\\\"UTF-8\\\"\" } } } :: {\"path\":\"/_xpack\",\"statusCode\":401,\"response\":\"{\\\"error\\\":{\\\"root_cause\\\":[{\\\"type\\\":\\\"security_exception\\\",\\\"reason\\\":\\\"unable to authenticate user [elk-elk-kibana-kibana-user] for REST request [/_xpack]\\\",\\\"header\\\":{\\\"WWW-Authenticate\\\":[\\\"Bearer realm=\\\\\\\"security\\\\\\\"\\\",\\\"ApiKey\\\",\\\"Basic realm=\\\\\\\"security\\\\\\\" charset=\\\\\\\"UTF-8\\\\\\\"\\\"]}}],\\\"type\\\":\\\"security_exception\\\",\\\"reason\\\":\\\"unable to authenticate user [elk-elk-kibana-kibana-user] for REST request [/_xpack]\\\",\\\"header\\\":{\\\"WWW-Authenticate\\\":[\\\"Bearer realm=\\\\\\\"security\\\\\\\"\\\",\\\"ApiKey\\\",\\\"Basic realm=\\\\\\\"security\\\\\\\" charset=\\\\\\\"UTF-8\\\\\\\"\\\"]}},\\\"status\\\":401}\",\"wwwAuthenticateDirective\":\"Bearer realm=\\\"security\\\", ApiKey, Basic realm=\\\"security\\\" charset=\\\"UTF-8\\\"\"} error"}
{"type":"log","@timestamp":"2020-05-06T04:51:16Z","tags":["warning","plugins","licensing"],"pid":6,"message":"License information could not be obtained from Elasticsearch due to [security_exception] unable to authenticate user [elk-elk-kibana-kibana-user] for REST request [/_xpack], with { header={ WWW-Authenticate={ 0=\"Bearer realm=\\\"security\\\"\" & 1=\"ApiKey\" & 2=\"Basic realm=\\\"security\\\" charset=\\\"UTF-8\\\"\" } } } :: {\"path\":\"/_xpack\",\"statusCode\":401,\"response\":\"{\\\"error\\\":{\\\"root_cause\\\":[{\\\"type\\\":\\\"security_exception\\\",\\\"reason\\\":\\\"unable to authenticate user [elk-elk-kibana-kibana-user] for REST request [/_xpack]\\\",\\\"header\\\":{\\\"WWW-Authenticate\\\":[\\\"Bearer realm=\\\\\\\"security\\\\\\\"\\\",\\\"ApiKey\\\",\\\"Basic realm=\\\\\\\"security\\\\\\\" charset=\\\\\\\"UTF-8\\\\\\\"\\\"]}}],\\\"type\\\":\\\"security_exception\\\",\\\"reason\\\":\\\"unable to authenticate user [elk-elk-kibana-kibana-user] for REST request [/_xpack]\\\",\\\"header\\\":{\\\"WWW-Authenticate\\\":[\\\"Bearer realm=\\\\\\\"security\\\\\\\"\\\",\\\"ApiKey\\\",\\\"Basic realm=\\\\\\\"security\\\\\\\" charset=\\\\\\\"UTF-8\\\\\\\"\\\"]}},\\\"status\\\":401}\",\"wwwAuthenticateDirective\":\"Bearer realm=\\\"security\\\", ApiKey, Basic realm=\\\"security\\\" charset=\\\"UTF-8\\\"\"} error"}
Hi @alwaysastudent,
Please upgrade to 7.7, where this is fixed.