Elasticsearch: Elasticsearch fails to start with selinux enforcing

Created on 9 Jan 2017 · 20 Comments · Source: elastic/elasticsearch

Elasticsearch version:
5.1.1

Plugins installed:
None

JVM version:
openjdk version "1.8.0_111"

OS version:
Fedora 25

Description of the problem including expected versus actual behavior:
When selinux is enforcing, elasticsearch fails to start on Fedora. This is solved by setting it to permissive, but obviously that is not a good solution. The expected behaviour is that it works properly with selinux enabled.

No denials are logged to the audit log.

Steps to reproduce:

  1. Install Elasticsearch from the elastic.co repository
  2. Attempt to start it with selinux enabled
  3. Fail

Provide logs (if relevant):
Journal: Jan 08 22:07:04 lf-logs elasticsearch[11529]: Exception: java.security.AccessControlException thrown from the UncaughtExceptionHandler in thread "Thread-3"

feedback_needed

All 20 comments

Can you provide the full stack trace or the entire log that you see?

I run with no issues on Fedora 25 with SELinux enabled and enforcing. So as @s1monw requested, we will need more information here. Are you using a custom policy or the default policy? What led you to discover that SELinux is the issue? Can you share your mount options for /tmp?

Put as simply as possible: it starts on permissive and not on enforcing.
The OS install is fresh, I can even share the ansible playbook if desired.

Enforcing:

```
[lf@lf-logs ~]$ sudo tail /var/log/elasticsearch/lf-logging.log
[2017-01-09T07:20:15,984][INFO ][o.e.p.PluginsService ] [lf-logs] no plugins loaded
[2017-01-09T07:20:18,783][INFO ][o.e.n.Node ] [lf-logs] initialized
[2017-01-09T07:20:18,784][INFO ][o.e.n.Node ] [lf-logs] starting ...
[2017-01-09T07:20:18,830][INFO ][i.n.u.i.PlatformDependent] Your platform does not provide complete low-level API for accessing direct buffers reliably. Unless explicitly requested, heap buffer will always be preferred to avoid potential system unstability.
[2017-01-09T07:20:18,989][INFO ][o.e.t.TransportService ] [lf-logs] publish_address {10.240.11.36:9300}, bound_addresses {[::]:9300}
[2017-01-09T07:20:18,996][INFO ][o.e.b.BootstrapCheck ] [lf-logs] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-01-09T07:20:18,999][ERROR][o.e.b.Bootstrap ] [lf-logs] node validation exception
bootstrap checks failed
memory locking requested for elasticsearch process but memory is not locked
[2017-01-09T07:20:19,011][INFO ][o.e.n.Node ] [lf-logs] stopping ...
```

Permissive:

```
[2017-01-09T07:23:32,956][INFO ][o.e.n.Node ] [lf-logs] initializing ...
[2017-01-09T07:23:33,098][INFO ][o.e.e.NodeEnvironment ] [lf-logs] using [1] data paths, mounts [[/ (/dev/mapper/fedora-root)]], net usable_space [13.5gb], net total_space [14.9gb], spins? [possibly], types [xfs]
[2017-01-09T07:23:33,098][INFO ][o.e.e.NodeEnvironment ] [lf-logs] heap size [1.9gb], compressed ordinary object pointers [true]
[2017-01-09T07:23:33,100][INFO ][o.e.n.Node ] [lf-logs] node name [lf-logs], node ID [uz3d-3jaSHWLO9JTgmiekg]
[2017-01-09T07:23:33,104][INFO ][o.e.n.Node ] [lf-logs] version[5.1.1], pid[12259], build[5395e21/2016-12-06T12:36:15.409Z], OS[Linux/4.8.6-300.fc25.x86_64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_111/25.111-b16]
```

On 9 Jan 2017 6:17 AM, "Jason Tedor" notifications@github.com wrote:

> I run with no issues on Fedora 25 with SELinux enabled. So as @s1monw requested, we will need more information here. Are you using a custom policy or the default policy? What led you to discover that SELinux is the issue? Can you share your mount options for /tmp?


> Put as simply as possible: it starts on permissive and not on enforcing.

You are misunderstanding my question. I'm asking what provoked you to consider setting SELinux to permissive in the first place.

It appears to be stopping cleanly with SELinux set to enforcing. Can you please enable trace logging (-E logger.level=trace) and share the logs in a gist?

Force of habit: try setting it to permissive as a litmus test for whether
it's actually broken.

On 9 Jan 2017 7:59 AM, "Jason Tedor" notifications@github.com wrote:

> > Put as simply as possible: it starts on permissive and not on enforcing.
>
> You are misunderstanding my question. I'm asking what provoked you to consider setting SELinux to permissive in the first place.
>
> It appears to be stopping cleanly with SELinux set to enforcing. Can you please enable trace logging (`-E logger.level=trace`) and share the logs in a gist?


You are binding to non-localhost and you have bootstrap.memory_lock configured to true but memory locking is failing (presumably because of SELinux policy or poor interaction between SELinux and JNA).

Poor interaction between selinux and jna. I configured it per the docs with
systemd.

On 10 Jan 2017 7:24 AM, "Jason Tedor" notifications@github.com wrote:

> You are binding to non-localhost and you have bootstrap.memory_lock configured to true but memory locking is failing (presumably because of SELinux policy or poor interaction between SELinux and JNA).


I'm not saying that you configured it wrongly. I'm saying that there are issues with SELinux and JNA interacting poorly in some OS (relates #18406).

I played with it some more and it would appear that my ansible playbook running and not changing anything crashes the service, but not with selinux permissive. I want a drink.

That is not a poor interaction that I can say I've seen before. 🥂

Well, regardless, here's your logs: https://gist.github.com/lf-/5a8ca512baba708bba60c151648d5370. Looks like it's a problem with java and not with elasticsearch given just how early in startup it's logged.

Your problem is here:

[2017-01-10T22:25:26,650][WARN ][o.e.b.JNANatives         ] Unable to lock JVM Memory: error=12, reason=Cannot allocate memory
[2017-01-10T22:25:26,651][WARN ][o.e.b.JNANatives         ] This can result in part of the JVM being swapped out.
[2017-01-10T22:25:26,651][WARN ][o.e.b.JNANatives         ] Increase RLIMIT_MEMLOCK, soft limit: 65536, hard limit: unlimited
[2017-01-10T22:25:26,651][WARN ][o.e.b.JNANatives         ] These can be adjusted by modifying /etc/security/limits.conf, for example: 
    # allow user 'elasticsearch' mlockall
    elasticsearch soft memlock unlimited
    elasticsearch hard memlock unlimited

As mentioned, memory locking is failing yet you've configured it to be enabled. When you bind to non-localhost, request memory locking, and memory locking fails, we do not silently fail. Either disable memory locking or fix whatever is preventing the limits from being increased.
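The rule described above can be sketched roughly as follows. This is a simplified Python illustration, not Elasticsearch's actual code: the function names `enforce_bootstrap_checks` and `memory_lock_check` are hypothetical, and the only details taken from this thread are the wording of the log messages and the addresses involved.

```python
import ipaddress


def enforce_bootstrap_checks(addresses):
    """Return True if any bound or publish address is neither loopback nor link-local.

    Hypothetical helper mirroring the log line "bound or publishing to a
    non-loopback or non-link-local address, enforcing bootstrap checks".
    """
    return any(
        not (ip.is_loopback or ip.is_link_local)
        for ip in map(ipaddress.ip_address, addresses)
    )


def memory_lock_check(memory_lock_requested, memory_locked):
    """Return an error message if locking was requested but did not happen, else None."""
    if memory_lock_requested and not memory_locked:
        return "memory locking requested for elasticsearch process but memory is not locked"
    return None


if __name__ == "__main__":
    # Addresses from the enforcing-mode log: publish 10.240.11.36, bound [::].
    if enforce_bootstrap_checks(["10.240.11.36", "::"]):
        print(memory_lock_check(memory_lock_requested=True, memory_locked=False))
```

Under this sketch, a node bound only to `127.0.0.1` would not enforce the check, so a failed lock would only be logged as a warning rather than stopping the node.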

Also, I notice that you seem to not be running with the default jvm.options. This leads to all the access control exceptions that you see. These are unrelated to the issue that you're experiencing, but it's something that you'll want to address.

Literally every single one of the things are enabled down to the SysV init ones that probably don't do anything. As I said, that message at the top doesn't appear with selinux on permissive. It's getting ENOMEM only when selinux is enforcing, which is the entertaining part of this issue.

Also, I can verify the memlock not being misconfigured by looking at /proc/$(pgrep java)/limits which does indeed confirm memlock as unlimited.

Edit:

[lf@lf-logs ~]$ sudo systemctl start elasticsearch && cat /proc/$(pgrep java)/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             2048                 2048                 processes
Max open files            65536                65536                files
Max locked memory         65536                unlimited            bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       19538                19538                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
[lf@lf-logs ~]$ grep elasticsearch /etc/security/limits.conf
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited

> Also, I can verify the memlock not being misconfigured by looking at /proc/$(pgrep java)/limits which does indeed confirm memlock as unlimited.

I didn't say it was misconfigured, I said something is preventing the limit from being increased (which includes but is not exclusive to the possibility of it being misconfigured). Also, your own output shows the soft limit is not increased to unlimited, instead it's 65536.

Max locked memory 65536 unlimited bytes
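Reading the soft and hard columns by eye is easy to get wrong, as the exchange above shows. A minimal Python sketch for checking them programmatically, assuming the column layout of `/proc/<pid>/limits` seen in this thread (fields separated by two or more spaces); `parse_limits` and `memlock_is_unlimited` are hypothetical helper names:

```python
import re


def parse_limits(text):
    """Parse the contents of /proc/<pid>/limits into {name: (soft, hard)}.

    Assumes fields are separated by runs of two or more spaces, since
    limit names such as "Max locked memory" contain single spaces.
    """
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) >= 3:  # the units column is absent for some rows
            limits[fields[0]] = (fields[1], fields[2])
    return limits


def memlock_is_unlimited(text):
    """Return True only if both the soft and hard memlock limits are unlimited."""
    soft, hard = parse_limits(text)["Max locked memory"]
    return soft == "unlimited" and hard == "unlimited"
```

To use it, pass in the text read from `/proc/<pid>/limits` for the running Java process. The soft limit is the one that matters for the warning quoted earlier, so the check requires both columns to be `unlimited`.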

> As I said, that message at the top doesn't appear with selinux on permissive.

Then if you're sure it's configured correctly, and the only difference between success and failure is SELinux, then you probably have your culprit.

Hmmm with permissive:

[lf@lf-logs ~]$ sudo systemctl start elasticsearch && cat /proc/$(pgrep java)/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             2048                 2048                 processes
Max open files            65536                65536                files
Max locked memory         unlimited            unlimited            bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       19538                19538                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us

Soft limit is unlimited but with enforcing it's 65536... WAT.
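A comparison like this can also be automated. A minimal Python sketch, assuming two saved copies of the `/proc/<pid>/limits` output (one per SELinux mode) and the same two-or-more-spaces column layout shown above; `parse_limits` and `diff_limits` are hypothetical helper names:

```python
import re


def parse_limits(text):
    """Parse the contents of /proc/<pid>/limits into {name: (soft, hard)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) >= 3:
            limits[fields[0]] = (fields[1], fields[2])
    return limits


def diff_limits(first, second):
    """Return {name: ((soft, hard) in first, (soft, hard) in second)} for limits that differ."""
    a, b = parse_limits(first), parse_limits(second)
    return {
        name: (a[name], b[name])
        for name in a
        if name in b and a[name] != b[name]
    }
```

Going by the two dumps posted in this thread, the rows that differ between enforcing and permissive are `Max core file size` and `Max locked memory`, in both cases only in the soft column.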

> Soft limit is unlimited but with enforcing it's 65536... WAT.

I mean, that's pretty conclusive. This is not an Elasticsearch issue though. :smile:

It looks like this bug.

It isn't. That is systemd --user. My bug is reported at https://bugzilla.redhat.com/show_bug.cgi?id=1415045. For now, I'll just set it to permissive and be sad.
