Cloud-on-k8s: APMServer keystore cannot be initialized in version 7.9.0

Created on 30 Jun 2020 · 8 Comments · Source: elastic/cloud-on-k8s

https://devops-ci.elastic.co/job/cloud-on-k8s-e2e-tests-snapshot-versions/87/testReport/github/com_elastic_cloud-on-k8s_test_e2e_apm/TestUpdateConfiguration_APM_Pod_should_be_recreated/

=== RUN   TestUpdateConfiguration/APM_Pod_should_be_recreated
Retries (5m0s timeout): ....................................................................................................
    TestUpdateConfiguration/APM_Pod_should_be_recreated: utils.go:84: 
            Error Trace:    utils.go:84
            Error:          Received unexpected error:
                            1 APM pod expected, got 2
            Test:           TestUpdateConfiguration/APM_Pod_should_be_recreated
{"log.level":"error","@timestamp":"2020-06-30T00:59:52.914Z","message":"stopping early","service.version":"0.0.0-00000000","service.type":"eck","ecs.version":"1.4.0","error":"test failure","error.stack_trace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128\ngithub.com/elastic/cloud-on-k8s/test/e2e/test.StepList.RunSequential\n\t/go/src/github.com/elastic/cloud-on-k8s/test/e2e/test/step.go:43\ngithub.com/elastic/cloud-on-k8s/test/e2e/apm.TestUpdateConfiguration\n\t/go/src/github.com/elastic/cloud-on-k8s/test/e2e/apm/configuration_test.go:197\ntesting.tRunner\n\t/usr/local/go/src/testing/testing.go:991"}
    --- FAIL: TestUpdateConfiguration/APM_Pod_should_be_recreated (300.00s)
>bug

All 8 comments

I can reproduce on 7.9.0-SNAPSHOT (it works fine on 7.8.0).
The keystore init container fails with:

+ keystore_initialized_flag=/usr/share/apm-server/data/elastic-internal-init-keystore.ok
+ [[ -f /usr/share/apm-server/data/elastic-internal-init-keystore.ok ]]
+ echo 'Initializing keystore.'
+ /usr/share/apm-server/apm-server keystore create --force
Initializing keystore.
error initializing beat: error loading config file: config file ("apm-server.yml") can only be writable by the owner but the permissions are "-rw-rw----" (to fix the permissions use: 'chmod go-w /usr/share/apm-server/apm-server.yml')
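
The error above rejects a config file that anyone other than its owner can write to. A minimal Python sketch of that kind of check follows. It is an approximation inferred from the error message, not the actual Beats implementation, and the function names are hypothetical:

```python
import stat

# Write bits for group and other: the bits that `chmod go-w` clears.
NON_OWNER_WRITE = stat.S_IWGRP | stat.S_IWOTH  # 0o022


def writable_by_non_owner(mode: int) -> bool:
    """Return True if the group or other write bit is set in `mode`."""
    return bool(mode & NON_OWNER_WRITE)


def strip_group_other_write(mode: int) -> int:
    """Return `mode` with group/other write bits cleared (like `chmod go-w`)."""
    return mode & ~NON_OWNER_WRITE


# "-rw-rw----" corresponds to 0o660 (rejected in the trace above);
# "-rw-r-----" corresponds to 0o640.
snapshot_mode = 0o660
fixed_mode = strip_group_other_write(snapshot_mode)
```

For a real file, the mode bits could be obtained with `stat.S_IMODE(os.stat(path).st_mode)` before passing them to these helpers.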

In version 7.8.0, files belong to the apm-server group and are not group-writable:

bash-4.2$ ls -la
total 97408
drwxr-x--- 1 root apm-server     4096 Jun 14 17:15 .
drwxr-xr-x 1 root root           4096 Jun 14 17:15 ..
-rw-r----- 1 root apm-server       41 Jun 14 17:14 .build_hash.txt
-rw-r----- 1 root apm-server    13675 Jun 14 17:14 LICENSE.txt
-rw-r----- 1 root apm-server   157358 Jun 14 17:14 NOTICE.txt
-rw-r----- 1 root apm-server      660 Jun 14 17:14 README.md
-rwxr-x--- 1 root apm-server 99282096 Jun 14 17:14 apm-server
-rw-r----- 1 root apm-server    47503 Jun 14 17:14 apm-server.yml
drwxrwx--- 2 root apm-server     4096 Jun 14 17:15 data
-rw-r----- 1 root apm-server   206047 Jun 14 17:14 fields.yml
drwxr-x--- 1 root apm-server     4096 Jun 14 17:14 ingest
drwxrwx--- 2 root apm-server     4096 Jun 14 17:15 logs

In 7.9.0-SNAPSHOT they belong to the root group and are group-writable:

bash-4.2$ ls -la
total 98776
drwxrwx--- 1 root root      4096 Jun 28 05:56 .
drwxr-xr-x 1 root root      4096 Jun 28 05:56 ..
-rw-rw---- 1 root root        41 Jun 28 05:55 .build_hash.txt
-rw-rw---- 1 root root     13675 Jun 28 05:55 LICENSE.txt
-rw-rw---- 1 root root    159237 Jun 28 05:55 NOTICE.txt
-rw-rw---- 1 root root       669 Jun 28 05:55 README.md
-rwxrwx--- 1 root root 100681264 Jun 28 05:55 apm-server
-rw-rw---- 1 root root     48413 Jun 28 05:55 apm-server.yml
drwxrwx--- 2 root root      4096 Jun 28 05:56 data
-rw-rw---- 1 root root    206201 Jun 28 05:55 fields.yml
drwxrwx--- 1 root root      4096 Jun 28 05:55 ingest
drwxrwx--- 2 root root      4096 Jun 28 05:56 logs
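
To make the difference between the two listings explicit, here is a small Python sketch that parses an `ls -l` line and reports the group and whether the entry is writable by group or other. The helper names are hypothetical, and the parser assumes the plain nine-field layout shown in the listings here:

```python
from typing import NamedTuple


class Entry(NamedTuple):
    perms: str
    owner: str
    group: str
    name: str


def parse_ls_line(line: str) -> Entry:
    """Parse one `ls -l` line: perms, links, owner, group, size, date (3 fields), name."""
    fields = line.strip().split(None, 8)
    if len(fields) != 9 or len(fields[0]) < 10:
        raise ValueError(f"unexpected ls -l line: {line!r}")
    perms, _links, owner, group, _size, _month, _day, _time, name = fields
    return Entry(perms, owner, group, name)


def group_or_other_writable(entry: Entry) -> bool:
    """Index 5 is the group write bit, index 8 the other write bit."""
    return entry.perms[5] == "w" or entry.perms[8] == "w"


# The apm-server.yml lines copied from the two listings in this thread.
OLD = "-rw-r----- 1 root apm-server    47503 Jun 14 17:14 apm-server.yml"
NEW = "-rw-rw---- 1 root root     48413 Jun 28 05:55 apm-server.yml"

for label, line in (("7.8.0", OLD), ("7.9.0-SNAPSHOT", NEW)):
    entry = parse_ls_line(line)
    print(label, entry.name, entry.group, group_or_other_writable(entry))
```

Only the 7.9.0-SNAPSHOT entry has the group write bit set, which matches the permissions string quoted in the keystore error.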

It's odd: this looks like a side effect of https://github.com/elastic/beats/pull/18873, but that PR is not merged yet.

Another identified issue in Beats: https://github.com/elastic/beats/issues/18858 (cc @simitt)

It seems that the 7.x branch of apm-server is using a version of beats that includes https://github.com/elastic/beats/pull/12905, which caused https://github.com/elastic/beats/issues/18858. This change was reverted in https://github.com/elastic/beats/pull/18872 to solve this issue.
Would it be possible to update beats in apm-server to a changeset that includes the revert?

Thanks @jsoriano, I'll merge the update (https://github.com/elastic/apm-server/pull/3924) once CI has passed.

It seems fixed now, the latest nightly build didn't fail 🥳
Thanks @simitt @jsoriano.

There was another update related to this: I just merged the PR pulling in the changes in libbeat. Let me know if issues arise again.
