Elasticsearch version:
5.1.1
Plugins installed:
None
JVM version:
openjdk version "1.8.0_111"
OS version:
Fedora 25
Description of the problem including expected versus actual behavior:
When SELinux is enforcing, Elasticsearch fails to start on Fedora. This is solved by setting it to permissive, but obviously that is not a good solution. The expected behaviour is that it works properly with SELinux enforcing.
No denials are logged to the audit log.
Steps to reproduce:
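Roughly the following, on a fresh Fedora 25 install with Elasticsearch 5.1.1 from the RPM (the hostname and log file name below are specific to my setup):
```
getenforce                                      # -> Enforcing
sudo systemctl start elasticsearch              # fails its bootstrap checks and stops
sudo journalctl -u elasticsearch --since "10 minutes ago"
sudo tail /var/log/elasticsearch/lf-logging.log

sudo setenforce 0                               # switch to permissive
sudo systemctl restart elasticsearch            # now it starts fine
```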
Provide logs (if relevant):
Journal:
```
Jan 08 22:07:04 lf-logs elasticsearch[11529]: Exception: java.security.AccessControlException thrown from the UncaughtExceptionHandler in thread "Thread-3"
```
Can you provide the full stack trace or the entire log that you see?
I run with no issues on Fedora 25 with SELinux enabled and enforcing. So as @s1monw requested, we will need more information here. Are you using a custom policy or the default policy? What led you to discover that SELinux is the issue? Can you share your mount options for /tmp?
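For reference, something like the following would capture the information being asked for here (assuming the stock SELinux userspace tools are installed; the `ausearch` line needs auditd running):
```
sestatus                          # SELinux mode and loaded policy
semodule -l                       # installed policy modules (custom vs. default)
findmnt /tmp                      # mount options for /tmp
sudo ausearch -m avc -ts recent   # any recent AVC denials
```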
Put as simply as possible: it starts on permissive and not on enforcing.
The OS install is fresh; I can even share the Ansible playbook if desired.
Enforcing:
```
[lf@lf-logs ~]$ sudo tail /var/log/elasticsearch/lf-logging.log
[2017-01-09T07:20:15,984][INFO ][o.e.p.PluginsService ] [lf-logs] no plugins loaded
[2017-01-09T07:20:18,783][INFO ][o.e.n.Node ] [lf-logs] initialized
[2017-01-09T07:20:18,784][INFO ][o.e.n.Node ] [lf-logs] starting ...
[2017-01-09T07:20:18,830][INFO ][i.n.u.i.PlatformDependent] Your platform does not provide complete low-level API for accessing direct buffers reliably. Unless explicitly requested, heap buffer will always be preferred to avoid potential system unstability.
[2017-01-09T07:20:18,989][INFO ][o.e.t.TransportService ] [lf-logs] publish_address {10.240.11.36:9300}, bound_addresses {[::]:9300}
[2017-01-09T07:20:18,996][INFO ][o.e.b.BootstrapCheck ] [lf-logs] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-01-09T07:20:18,999][ERROR][o.e.b.Bootstrap ] [lf-logs] node validation exception
bootstrap checks failed
memory locking requested for elasticsearch process but memory is not locked
[2017-01-09T07:20:19,011][INFO ][o.e.n.Node ] [lf-logs] stopping ...
```

Permissive:
```
[2017-01-09T07:23:32,956][INFO ][o.e.n.Node ] [lf-logs] initializing ...
[2017-01-09T07:23:33,098][INFO ][o.e.e.NodeEnvironment ] [lf-logs] using [1] data paths, mounts [[/ (/dev/mapper/fedora-root)]], net usable_space [13.5gb], net total_space [14.9gb], spins? [possibly], types [xfs]
[2017-01-09T07:23:33,098][INFO ][o.e.e.NodeEnvironment ] [lf-logs] heap size [1.9gb], compressed ordinary object pointers [true]
[2017-01-09T07:23:33,100][INFO ][o.e.n.Node ] [lf-logs] node name [lf-logs], node ID [uz3d-3jaSHWLO9JTgmiekg]
[2017-01-09T07:23:33,104][INFO ][o.e.n.Node ] [lf-logs] version[5.1.1], pid[12259], build[5395e21/2016-12-06T12:36:15.409Z], OS[Linux/4.8.6-300.fc25.x86_64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_111/25.111-b16]
```
Put as simply as possible: it starts on permissive and not on enforcing.
You are misunderstanding my question. I'm asking what provoked you to consider setting SELinux to permissive in the first place.
It appears to be stopping cleanly with SELinux set to enforcing. Can you please enable trace logging (-E logger.level=trace) and share the logs in a gist?
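One way to do that with the RPM layout, as a sketch (the install path below is the default one, and the environment the systemd unit normally sets, such as the config and PID directories, may need to be replicated), is a foreground run with the flag:
```
# Foreground run as the service user with trace logging enabled
sudo -u elasticsearch /usr/share/elasticsearch/bin/elasticsearch -E logger.level=trace
```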
Force of habit: try setting it to permissive as a litmus test for whether it's actually broken.
You are binding to non-localhost and you have bootstrap.memory_lock configured to true but memory locking is failing (presumably because of SELinux policy or poor interaction between SELinux and JNA).
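Spelled out as configuration, the combination being described is roughly the following (the address is taken from the logs above, purely for illustration):
```
# elasticsearch.yml (sketch)
network.host: 10.240.11.36     # non-loopback bind, so bootstrap checks are enforced
bootstrap.memory_lock: true    # memory locking is requested; startup aborts if it fails
```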
Poor interaction between SELinux and JNA. I configured it per the docs with systemd.
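For context, the systemd route in the docs amounts to a unit override that raises the memlock limit, roughly:
```
# sudo systemctl edit elasticsearch  writes a drop-in such as
# /etc/systemd/system/elasticsearch.service.d/override.conf
[Service]
LimitMEMLOCK=infinity
# then: sudo systemctl daemon-reload && sudo systemctl restart elasticsearch
```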
I'm not saying that you configured it wrongly. I'm saying that there are issues with SELinux and JNA interacting poorly on some OSes (relates #18406).
I played with it some more, and it would appear that running my Ansible playbook, even when it changes nothing, crashes the service, but not with SELinux permissive. I want a drink.
That is not a poor interaction that I can say I've seen before. 🥂
Well, regardless, here are the logs: https://gist.github.com/lf-/5a8ca512baba708bba60c151648d5370. It looks like a problem with Java rather than with Elasticsearch, given just how early in startup it's logged.
Your problem is here:
```
[2017-01-10T22:25:26,650][WARN ][o.e.b.JNANatives ] Unable to lock JVM Memory: error=12, reason=Cannot allocate memory
[2017-01-10T22:25:26,651][WARN ][o.e.b.JNANatives ] This can result in part of the JVM being swapped out.
[2017-01-10T22:25:26,651][WARN ][o.e.b.JNANatives ] Increase RLIMIT_MEMLOCK, soft limit: 65536, hard limit: unlimited
[2017-01-10T22:25:26,651][WARN ][o.e.b.JNANatives ] These can be adjusted by modifying /etc/security/limits.conf, for example:
	# allow user 'elasticsearch' mlockall
	elasticsearch soft memlock unlimited
	elasticsearch hard memlock unlimited
```
As mentioned, memory locking is failing yet you've configured it to be enabled. When you bind to non-localhost, request memory locking, and memory locking fails, we do not silently fail. Either disable memory locking or fix whatever is preventing the limits from being increased.
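For completeness, the "disable" route is a one-line change (at the cost of letting the heap be swapped); the "fix the limit" route is the systemd override sketched above:
```
# elasticsearch.yml
bootstrap.memory_lock: false
```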
Also, I notice that you seem to not be running with the default jvm.options. This leads to all the access control exceptions that you see. These are unrelated to the issue that you're experiencing, but it's something that you'll want to address.
Literally every single one of those settings is enabled, down to the SysV init ones that probably don't do anything. As I said, that message at the top doesn't appear with SELinux on permissive. It's getting ENOMEM only when SELinux is enforcing, which is the entertaining part of this issue.
Also, I can verify that memlock is not misconfigured by looking at /proc/$(pgrep java)/limits, which does indeed confirm memlock as unlimited.
Edit:
```
[lf@lf-logs ~]$ sudo systemctl start elasticsearch && cat /proc/$(pgrep java)/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             2048                 2048                 processes
Max open files            65536                65536                files
Max locked memory         65536                unlimited            bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       19538                19538                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
[lf@lf-logs ~]$ grep elasticsearch /etc/security/limits.conf
elasticsearch soft memlock unlimited
elasticsearch hard memlock unlimited
```
Also, I can verify that memlock is not misconfigured by looking at /proc/$(pgrep java)/limits, which does indeed confirm memlock as unlimited.
I didn't say it was misconfigured, I said something is preventing the limit from being increased (which includes but is not exclusive to the possibility of it being misconfigured). Also, your own output shows that the soft limit is not increased to unlimited; instead it's 65536.
```
Max locked memory         65536                unlimited            bytes
```
As I said, that message at the top doesn't appear with selinux on permissive.
Then if you're sure it's configured correctly, and the only difference between success and failure is SELinux, then you probably have your culprit.
Hmmm, with permissive:
```
[lf@lf-logs ~]$ sudo systemctl start elasticsearch && cat /proc/$(pgrep java)/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             2048                 2048                 processes
Max open files            65536                65536                files
Max locked memory         unlimited            unlimited            bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       19538                19538                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
```
Soft limit is unlimited but with enforcing it's 65536... WAT.
Soft limit is unlimited but with enforcing it's 65536... WAT.
I mean, that's pretty conclusive. This is not an Elasticsearch issue though. :smile:
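A quick way to demonstrate the difference, and to look for any related denials, might be something like this (note that under enforcing the node stops shortly after starting, so the limits have to be read before it exits):
```
# Compare the effective memlock limit under the two modes
sudo setenforce 1 && sudo systemctl restart elasticsearch && \
  grep 'Max locked memory' /proc/$(pgrep java)/limits
sudo setenforce 0 && sudo systemctl restart elasticsearch && \
  grep 'Max locked memory' /proc/$(pgrep java)/limits

# Check whether the enforcing run produced any AVC denials at all
sudo ausearch -m avc -ts recent
```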
It looks like this bug.
It isn't. That is systemd --user. My bug is reported at https://bugzilla.redhat.com/show_bug.cgi?id=1415045. For now, I'll just set it to permissive and be sad.