Describe the feature: I can't run Elasticsearch after updating Ubuntu and Java
Elasticsearch version (bin/elasticsearch --version): 2.4.1
Plugins installed: []
JVM version (java -version): OpenJDK 1.8.0_151
OS version (uname -a if on a Unix-like system): Linux 4.13.0-31-generic #34~16.04.1-Ubuntu x86_64 GNU/Linux
Description of the problem including expected versus actual behavior: Normally I can run Elasticsearch with the ./elasticsearch command. But this morning, after updating Ubuntu and Java, I was no longer able to start Elasticsearch.
Please describe the problem you are having, otherwise it's very hard to help.
I have had the same problem since the OS update this morning. Before the update, everything was fine.
Elasticsearch version (bin/elasticsearch --version): 2.4.6
JVM version (java -version): 1.8.0_131
OS version (uname -a if on a Unix-like system): Linux dellXPS 4.13.0-31-generic #34-Ubuntu 17.10
Description of the problem including expected versus actual behavior:
./elasticsearch did nothing. No message, no log entry.
I noticed that the update also installed a new version of Intel's microcode:
intel-microcode 3.20180108.0+really20170707ubuntu17.10.1
Can you folks give us some logging information? Otherwise everything is a shot in the dark.
There is no logging anymore. I start ./elasticsearch in a terminal and after a second it returns without any message. The log files are empty, even with DEBUG.
I also tried without any java params: /usr/lib/jvm/java-8-oracle/bin/java -Dfile.encoding=UTF-8 -Des.path.home=/home/xxx/elasticsearch-2.4.6 -cp /home/xxx/programs/elasticsearch-2.4.6/lib/elasticsearch-2.4.6.jar:/home/xxx/elasticsearch-2.4.6/lib/* org.elasticsearch.bootstrap.Elasticsearch start
It also exits without any message.
Sorry - when I started writing my bug report there was only the first entry and I wasn't sure this was the same bug. But looking at the two more recent entries, I think it is. I do have a related system log error. See https://github.com/elastic/elasticsearch/issues/28354
I think you should check your kernel log messages (e.g., dmesg).
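A minimal sketch of that check, assuming the kernel log contains the "tried to execute NX-protected page" message that matches the fault described later in this thread. The function name `find_nx_faults` is hypothetical, and reading `dmesg` may require root on some systems.

```shell
# Hypothetical helper: filter kernel log text on stdin for lines that
# report an attempt to execute an NX-protected page.
find_nx_faults() {
    grep -i 'tried to execute NX-protected page'
}

# Typical use (prefix dmesg with sudo if the kernel log is restricted):
#   dmesg | find_nx_faults
```

If this prints nothing, the kernel log may have rotated, or the failure may have a different cause than the one discussed here.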
Yep, this is the same issue with the same kernel output.
So, Meltdown has killed my ES!
Sorry, this is not an Elasticsearch issue then. I suggest rolling back your kernel.
I also suspect the microcode update from Intel is involved.
Thanks for your support! But I used the Docker Elasticsearch image instead, hoping the kernel gets fixed soon.
The kernel is that of your underlying host; it does not have anything to do with the image.
For the record, another report of this here: https://discuss.elastic.co/t/kernel-update-on-ubuntu-17-10-causes-elasticsearch-to-panic/116865
Running Linux with the previous kernel, 4.13.0-25-generic, ES works fine again!
Same for us, 4.13.0-25-generic works, upgrading to 4.13.0-31-generic fails.
The crash is caused by a call to a non-existent syscall; that call was removed in ES 5.6.4.
This is the commit that fixes the bug: https://github.com/elastic/elasticsearch/commit/b1d5e85dc4b4e52c6c1bbc62996eaca83ed2e96d
Any chance to have this fix in ES 2?
We won't release another version of 2.x.
I am sorry, but that is not the issue. It might be a workaround for the issue (I have not verified), but it is not the bug. The bug is the kernel here. The error message is that the kernel tried to execute a page marked with NX; the kernel should never ever do that. Invoking a non-existent system call should not result in the kernel trying to execute a page marked NX, ever. Instead, invoking a non-existent syscall should return -ENOSYS (see sys_ni_syscall in the kernel execution of syscalls), except in Docker, where the process gets a SIGSYS if a system call filter catches the system call.
We're looking into this, the linux-azure kernel also has a similar issue. The Meltdown and Spectre changes may have introduced a regression.
Thanks @shirgall.
@chaudum indeed, I gotta say that upgrading to 5.6.4 fixed the problem.
No it didn’t. You have a buggy kernel, how can you trust it to do anything correctly?
If anyone is working with GCP instances, using Ubuntu 16.04 LTS, I downgraded the kernel from:
uname -a
Linux elasticsearch-1-vm 4.13.0-1007-gcp #10-Ubuntu SMP Fri Jan 12 13:56:47 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
To:
uname -a
Linux elasticsearch-1-vm 4.13.0-1006-gcp #9-Ubuntu SMP Mon Jan 8 21:13:15 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
To fix the issue with the GCP instances, I ran:
sudo apt remove 4.13.0-1007-gcp
sudo apt install 4.13.0-1006-gcp
exit
Then, in the Google Cloud console, restart the instance, SSH back in, and run:
sudo service elasticsearch start
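Before rolling back, it can help to confirm that the running kernel is one of the releases reported as failing. The sketch below hard-codes only the two releases named in this thread so far (4.13.0-31-generic and 4.13.0-1007-gcp); the function name `is_reported_bad_kernel` is hypothetical, and the list is not an exhaustive advisory.

```shell
# Returns 0 if the given kernel release string was reported as failing
# in this thread, 1 otherwise. The list covers only commenters' reports.
is_reported_bad_kernel() {
    case "$1" in
        4.13.0-31-generic|4.13.0-1007-gcp) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the kernel this machine is currently running.
if is_reported_bad_kernel "$(uname -r)"; then
    echo "Running kernel $(uname -r) was reported as affected in this thread"
fi
```

Matching on the full release string (version plus flavor) is deliberate: the same version number on a different kernel flavor is not necessarily affected.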
And on the Azure side, the “linux-azure” kernel 4.13.0-1007 was put out today and seems to have fixed this issue. Ubuntu updated all of the other kernels as well.
Thanks, --jrp
From: deppi notifications@github.com
Sent: Thursday, January 25, 2018 11:19
To: elastic/elasticsearch elasticsearch@noreply.github.com
Cc: Josh Poulson jopoulso@microsoft.com; Mention mention@noreply.github.com
Subject: Re: [elastic/elasticsearch] Can't running elasticsearch after update ubuntu and java (#28349)
FYI, the latest Ubuntu 17.10 kernel, version 4.13.0-32, seems to have fixed this.