Elasticsearch: Can't running elasticsearch after update ubuntu and java

Created on 24 Jan 2018 · 25 Comments · Source: elastic/elasticsearch

Describe the feature: I can't run Elasticsearch after updating Ubuntu and Java

Elasticsearch version (bin/elasticsearch --version): 2.4.1

Plugins installed: []

JVM version (java -version): OpenJDK 1.8.0_151

OS version (uname -a if on a Unix-like system): Linux 4.13.0-31-generic #34~16.04.1-Ubuntu x86_64 GNU/Linux

Description of the problem including expected versus actual behavior: Normally I can run Elasticsearch with the ./elasticsearch command. But this morning, after updating Ubuntu and Java, I could no longer start Elasticsearch.

feedback_needed

All 25 comments

Please describe the problem you are having, otherwise it's very hard to help.

I have had the same problem since an OS update this morning. Before the update, everything was fine.

Elasticsearch version (bin/elasticsearch --version): 2.4.6

JVM version (java -version): 1.8.0_131

OS version (uname -a if on a Unix-like system): Linux dellXPS 4.13.0-31-generic #34-Ubuntu 17.10

Description of the problem including expected versus actual behavior:
./elasticsearch did nothing. No message, no log entry.

I noticed that the update also installed a new version of Intel's microcode:
intel-microcode 3.20180108.0+really20170707ubuntu17.10.1

Can you folks give us some logging information? Otherwise everything is a shot in the dark.

There is no logging anymore. I start ./elasticsearch in a terminal, and after a second it returns to the prompt without any message. The log files stay empty, even at DEBUG level.
I also tried starting it without any Java parameters, which likewise exits without a message: /usr/lib/jvm/java-8-oracle/bin/java -Dfile.encoding=UTF-8 -Des.path.home=/home/xxx/elasticsearch-2.4.6 -cp /home/xxx/programs/elasticsearch-2.4.6/lib/elasticsearch-2.4.6.jar:/home/xxx/elasticsearch-2.4.6/lib/* org.elasticsearch.bootstrap.Elasticsearch start

Sorry - when I started writing my bug report there was only the first entry and I wasn't sure this was the same bug. But looking at the two more recent entries, I think it is. I do have a related system log error. See https://github.com/elastic/elasticsearch/issues/28354

I think you should check your kernel log messages (e.g., dmesg).
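As a rough sketch of that check, the helper below filters kernel log text for NX-protection faults. The function name find_nx_faults is hypothetical, and the search pattern is an assumption based on the wording of the x86 kernel's fault message; the exact text in your log may differ.

```shell
# Filter kernel log text (read from stdin) for NX-protection faults.
# Assumption: the kernel reports the fault with the phrase
# "NX-protected page"; adjust the pattern if your log words it differently.
find_nx_faults() {
    grep -i 'NX-protected page'
}
```

Typical use would be `dmesg | find_nx_faults`, or `journalctl -k | find_nx_faults` on systemd hosts. Reading the kernel log may require root on some systems.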

Yep, this is the same issue with the same kernel output.
So, meltdown has killed my ES!

Sorry, this is not an Elasticsearch issue then. I suggest rolling back your kernel.

I also suspect the microcode update from Intel is involved.

Thanks for your support! For now I am using the Docker Elasticsearch image instead, hoping the kernel gets fixed soon.

The kernel is that of your underlying host, it does not have anything to do with the image.

Running Linux with the previous kernel, 4.13.0-25-generic, and ES works fine again!

Same for us, 4.13.0-25-generic works, upgrading to 4.13.0-31-generic fails.

The crash is caused by a call to a nonexistent syscall, which was removed in ES 5.6.4.
This is the commit that fixes the bug: https://github.com/elastic/elasticsearch/commit/b1d5e85dc4b4e52c6c1bbc62996eaca83ed2e96d

Any chance to have this fix in ES 2?

We won't release another version of 2.x.

I am sorry, but that is not the issue. It might be a workaround for the issue (I have not verified), but it is not the bug. The bug here is in the kernel. The error message says that the kernel tried to execute a page marked NX, which the kernel should never do. Invoking a nonexistent system call should never result in the kernel trying to execute a page marked NX. Instead, invoking a nonexistent syscall should return -ENOSYS (see sys_ni_syscall in the kernel's handling of syscalls), except in Docker, where the process gets a SIGSYS if a system call filter catches the system call.

We're looking into this, the linux-azure kernel also has a similar issue. The Meltdown and Spectre changes may have introduced a regression.

Thanks @shirgall.

@chaudum indeed, I gotta say that upgrading to 5.6.4 fixed the problem.

No it didn’t. You have a buggy kernel, how can you trust it to do anything correctly?

If anyone is working with GCP instances, using Ubuntu 16.04 LTS, I downgraded the kernel from:

uname -a
Linux elasticsearch-1-vm 4.13.0-1007-gcp #10-Ubuntu SMP Fri Jan 12 13:56:47 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

To:

uname -a
Linux elasticsearch-1-vm 4.13.0-1006-gcp #9-Ubuntu SMP Mon Jan 8 21:13:15 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

To fix the issue with the GCP instances, I ran:

sudo apt remove 4.13.0-1007-gcp
sudo apt install 4.13.0-1006-gcp
exit

Then, in the Google Cloud console, restart the instance, SSH back in, and run:
sudo service elasticsearch start

And on the Azure side, the “linux-azure” kernel 4.13.0-1007 was put out today and seems to have fixed this issue. Ubuntu updated all of the other kernels as well.

Thanks, --jrp

FYI, the latest Ubuntu 17.10 kernel, version 4.13.0-32, seems to have fixed this.
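To summarize the kernel reports in this thread, here is a minimal sketch that classifies a kernel release string. The function name classify_kernel and the result labels are hypothetical, and the table reflects only what commenters here reported, not an authoritative list of affected kernels.

```shell
# Classify a kernel release string using only the reports in this thread.
# Argument: a release string as printed by `uname -r`
# (defaults to the running kernel when omitted).
classify_kernel() {
    release=${1:-$(uname -r)}
    case "$release" in
        4.13.0-31-generic) echo "affected" ;;          # reported failing
        4.13.0-1007-gcp)   echo "affected" ;;          # reported failing on GCP
        4.13.0-25-generic) echo "reported-working" ;;  # previous kernel, works
        4.13.0-1006-gcp)   echo "reported-working" ;;  # GCP downgrade target
        4.13.0-32-*)       echo "reported-fixed" ;;    # flavour not stated in the report
        *)                 echo "unknown" ;;           # no report in this thread
    esac
}
```

Running `classify_kernel` with no argument checks the current host; an "unknown" result only means nobody in this thread reported on that kernel.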
