Cloud-on-k8s: Allow ECK to specify a custom readiness check for Elasticsearch and Kibana

Created on 15 Aug 2019 · 9 Comments · Source: elastic/cloud-on-k8s

It would be helpful to be able to specify a custom readiness check in certain cases, such as CentOS and RHEL 7 (3.10 kernel) with kmem accounting, where the readiness probe's repeated curl calls against an https endpoint cause memory consumption to be improperly accounted against the pod.

For now, the workaround is to disable the curl-based check so that the pod is not killed. Once an alternative is identified, it would be beneficial to be able to specify it.

All 9 comments

You can do this today by specifying your own readiness probe via the pod template:

spec:
  nodes:
  - podTemplate:
      spec:
        containers:
        - name: elasticsearch
          readinessProbe:
            ...

If you enjoy using YAML anchors and have multiple nodes entries (say, one set of data nodes and one set of master-eligible nodes), you can avoid repeating the readinessProbe by defining it once with an anchor and referencing it from the other entries:

spec:
  nodes:
  - count: 6
    podTemplate: &custom
      spec:
        containers:
        - name: elasticsearch
          readinessProbe:
            ...
  - count: 3
    podTemplate: *custom

This tells the YAML parser that the second podTemplate entry should take its value from the first podTemplate (the anchor named custom).

I was considering closing this out, but I wonder if it might be worth calling this out in the OpenShift doc. @barkbay do you have an opinion one way or the other?

@mikeh-elastic if I recall correctly this was brought up because it was OOMing on v3 kernels (so rhel/centos 7) with an https health check, but not an http health check, is that right?

Yes, CentOS/RHEL 7 kernels have a bug; see also https://github.com/elastic/cloud-on-k8s/issues/1076#issuecomment-503894627
As Peter mentioned, you can specify your own readiness check.
Regarding OpenShift, I would not add a note in the doc unless the problem also occurs there. It is not clear to me whether that is the case here; if it is, we need another issue.

certain cases such as Centos and RHEL 7 3.10 kernel with kmem accounting

For posterity, I'm writing this here...

I'm seeing a similar issue on Google Container OS 69, kernel 4.14.127+, GKE 1.12.8-gke.10

Tests performed:

  • Default readinessProbe: pod memory usage grows by 100 MB per minute.
  • /bin/true readinessProbe, pod memory usage is stable.
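
For anyone who wants to repeat the second test, the pod template might look roughly like the sketch below. It assumes the `nodes` / `podTemplate` layout shown earlier in this thread; the `count` value is only a placeholder. Note that a `/bin/true` probe always succeeds, so the pod reports ready regardless of the actual state of Elasticsearch. It is useful for isolating the memory growth, not as a real health check.

```yaml
spec:
  nodes:
  - count: 1                      # placeholder value, adjust to your cluster
    podTemplate:
      spec:
        containers:
        - name: elasticsearch
          readinessProbe:
            exec:
              # Always exits 0, so no curl (and no libnss) is invoked by the probe.
              command: ["/bin/true"]
```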

With the symptoms (memory growth with default readinessProbe), I see the following:

  • Large dentries in /proc/slabinfo (viewable with slabtop if desired)
  • 'Memory usage' drops dramatically if I echo 2 > /proc/sys/vm/drop_caches but then rises again.

There's a kernel bug here that seems unfixed in some set of newer kernels. We're tracking this elsewhere as an internal ticket, but in case anyone ends up here from an internet search... you aren't alone ;)

being accounted for memory consumption repeatedly by curl against an https endpoint

@mikeh-elastic good news! Teaming up with @pmoust, we got a solution that doesn't blow up memory usage (in my testing):

Citation: https://bugzilla.redhat.com/show_bug.cgi?id=1571183

If you set export NSS_SDB_USE_CACHE=no, this problem goes away because it disables the particular behavior within curl (libnss). I believe there's still a bug in Kubernetes (or the Linux Kernel) causing memory accounting to be incorrect, but at least this env var setting will prevent the curl in the readinessProbe from making this problem worse.
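
One way to apply this without touching the probe script might be to set the variable on the container through the pod template, as in the sketch below. This is an assumption-laden example rather than the fix that was eventually shipped: it assumes the `nodes` / `podTemplate` layout shown earlier in this thread, that the readiness probe runs as an exec probe inside the elasticsearch container and so sees the container's environment, and the `count` value is a placeholder.

```yaml
spec:
  nodes:
  - count: 3                      # placeholder value, adjust to your cluster
    podTemplate:
      spec:
        containers:
        - name: elasticsearch
          env:
          - name: NSS_SDB_USE_CACHE
            # Quoted on purpose: many YAML parsers read a bare no as boolean false,
            # and Kubernetes requires env values to be strings.
            value: "no"
```

The quoting is the part most likely to trip people up; an unquoted value is typically rejected when the manifest is applied.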

Relevant kubernetes issue (filesystem cache is counting _against_ pod memory usage): https://github.com/kubernetes/kubernetes/issues/43916

After some discussion with Google, our strongest hypothesis is that there's a kernel bug in how memory is accounted (or a bug in how kubernetes interprets memory accounting). Tracking it here: https://issuetracker.google.com/issues/140577001

https://github.com/elastic/cloud-on-k8s/pull/1716 sets an environment variable, which should help minimize the symptoms of the kernel/Kubernetes memory accounting bug that are caused by the readinessProbe's curl.
