Elasticsearch: Elasticsearch fails to start: Failure running machine learning native code

Created on 24 Oct 2019 · 11 Comments · Source: elastic/elasticsearch

Elasticsearch version (bin/elasticsearch --version):

Version: 7.4.1, Build: default/tar/fc0eeb6e2c25915d63d871d344e3d0b45ea0ea1e/2019-10-22T17:16:35.176724Z, JVM: 11.0.4

Plugins installed: []

JVM version (java -version):

openjdk version "11.0.4" 2019-07-16 LTS
OpenJDK Runtime Environment Corretto-11.0.4.11.1 (build 11.0.4+11-LTS)
OpenJDK 64-Bit Server VM Corretto-11.0.4.11.1 (build 11.0.4+11-LTS, mixed mode)

Also happens under the bundled JDK:

openjdk version "13" 2019-09-17
OpenJDK Runtime Environment AdoptOpenJDK (build 13+33)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 13+33, mixed mode, sharing)

OS version (uname -a if on a Unix-like system):

Darwin 18.7.0 Darwin Kernel Version 18.7.0: Thu Jun 20 18:42:21 PDT 2019; root:xnu-4903.270.47~4/RELEASE_X86_64 x86_64

MacOS 10.14.6

Description of the problem including expected versus actual behavior:

When attempting to start elasticsearch (either directly or via brew services start elasticsearch-full), part way into startup, a "Problem Report for controller" window pops up, then Elasticsearch fails with the message "Failure running machine learning native code. This could be due to running on an unsupported OS or distribution, missing OS libraries, or a problem with the temp directory. To bypass this problem by running Elasticsearch without machine learning functionality set [xpack.ml.enabled: false]."

Problem Report window reports

Exception Type:        EXC_BAD_ACCESS (Code Signature Invalid)
Exception Codes:       0x0000000000000032, 0x000000010975e000
Exception Note:        EXC_CORPSE_NOTIFY

Termination Reason:    Namespace CODESIGNING, Code 0x2

Steps to reproduce:

  1. Clean install via Homebrew (brew tap elastic/tap && brew install elastic/tap/elasticsearch-full)
  2. Start Elasticsearch

The .tar.gz downloaded from the elastic download page works fine. If I copy everything into the homebrew install directory and reuse the same config, it works fine. It's just when installed via homebrew.

Workaround:

  1. Can be worked around by setting xpack.ml.enabled: false, though this still produces the Problem Report window.
  2. Replace contents of homebrew libexec (except config and plugins) with files extracted from standard download

Provide logs (if relevant):

elastic.trace.log
problem report.log

:ml >bug

All 11 comments

I have the same issue

Additional Info: It appears that the version of x-pack-ml/platform/darwin-x86_64/controller.app installed by Homebrew has a different checksum from the version manually downloaded.
Tested via

zip controller.app.zip controller.app/*
openssl sha256 controller.app.zip

Homebrew: SHA256(controller.app.zip)= f1162c7fdbfa89603ffcc3b0142e8bc37a2de0ffa2bcddf6a764066bf23a1ef4
Manual: SHA256(controller.app.zip)= 459ea3f9f85140efb73ab1757ec090ef5a414f12727d8d9420dd3a53c68c6657

The tar.gz archives downloaded via manual and Homebrew both have the same checksums, but for some reason, this controller does not.

I could reproduce this as well and have dug a bit more into that issue. If we apply a sha256 to each of the files in the ML controller app, we get the following for the Homebrew installation:

d3099bfe0e35c63d3f4fb27344299fb2301648adde48bfecd3a13aae5930c61e  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/CodeResources
893cf12d6ef1886cc86325d954bea6cf0cefb164fbbdd9ca5ae71515a9ecd759  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/Info.plist
15b45e482b5317935f05152df1a62e0c0314d6d316b231f23a413600de514e3b  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/MacOS/autoconfig
00010a08015bd3d38e3e2008d61c66dfebf3ee1733c866909a82014cfc676aee  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/MacOS/autodetect
ea2f1bf0788cb855bffa0665a3fca5281b41f5867b3effe8d1522323ffb48336  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/MacOS/categorize
fd65b79e85e7c3a0d1da7c5c7da9b329a01728cf2dfe8b68619ace18e8814532  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/MacOS/controller
986822bb7444aa5fd74a2caab0eb234819971c5b699fd22ad9896135e4843643  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/MacOS/data_frame_analyzer
f861e64db133a088d7e68bd881048972d2292b5440f9884ecb87bded5d136b90  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/MacOS/normalize
941acd3c08a518e58d5155d57506349133f765b4288682efcd4c41dadf51d435  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/Resources/ml-en.dict
643765bfc3cbe3c86a6d93948c4a99db5c7c6381c68fecf905018fb6e78e18b7  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/_CodeSignature/CodeResources
e3bc3179e9ddf502c0eb029d29246abf8fe70da3e95722667383296eaf0295f0  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libMlApi.dylib
0fa27fec6f4630a91e11828ae6058c02272beb3ab13c7502c5f918cd1914cbc2  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libMlConfig.dylib
1676b2a2df1cb8688fd70f969e912ec9a000cca4aa71fb1d3b4f240a4f724008  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libMlCore.dylib
838a7587a8e0e26832f39edda6837ae70085aaaa15b1533a531314a2670432b8  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libMlMaths.dylib
3bf9dc09c51807bfdf90526675ba184eae533f677c2326ead1ac551a6d1e3a75  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libMlModel.dylib
1d86819bdc8b7d5cf09aba39f2978046051ffeab5a0d150318b5b90ddb4bf17a  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libboost_date_time-clang-darwin42-mt-1_65_1.dylib
bc727f415f2d595ae57a0180ad6559e5f691f9074b265b790eed73c8c1d416ce  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libboost_filesystem-clang-darwin42-mt-1_65_1.dylib
d561f1e399d7a42f0b1de091aad34ce27a188ac5b0f8e947921095ed336e8365  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libboost_iostreams-clang-darwin42-mt-1_65_1.dylib
091b4f471c4acfa40b98ca56b57bdcfd64b11b5d1b7ae60d55ce2f220a5a8cb4  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libboost_program_options-clang-darwin42-mt-1_65_1.dylib
a748fe17e32a575771696695928b68b4488d88ce6b4fa2608d777fb872e8ab8c  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libboost_regex-clang-darwin42-mt-1_65_1.dylib
54d0556e3b8ae7f49fef575d61962eebf7c972aed0203b4867e7c453c7481e09  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libboost_system-clang-darwin42-mt-1_65_1.dylib
45dfc850d5da8f767dfb95bd5a800e5c1b6751eca8e123b05e71b6f380d6932a  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libboost_thread-clang-darwin42-mt-1_65_1.dylib
c98ce08df94e9b8a5f54db4bf69f2261704f386ec123431183804b7d9ede38c8  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/liblog4cxx.10.dylib

If we do the same for the .tar.gz, we get:

d3099bfe0e35c63d3f4fb27344299fb2301648adde48bfecd3a13aae5930c61e  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/CodeResources
893cf12d6ef1886cc86325d954bea6cf0cefb164fbbdd9ca5ae71515a9ecd759  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/Info.plist
15b45e482b5317935f05152df1a62e0c0314d6d316b231f23a413600de514e3b  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/MacOS/autoconfig
00010a08015bd3d38e3e2008d61c66dfebf3ee1733c866909a82014cfc676aee  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/MacOS/autodetect
ea2f1bf0788cb855bffa0665a3fca5281b41f5867b3effe8d1522323ffb48336  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/MacOS/categorize
fd65b79e85e7c3a0d1da7c5c7da9b329a01728cf2dfe8b68619ace18e8814532  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/MacOS/controller
986822bb7444aa5fd74a2caab0eb234819971c5b699fd22ad9896135e4843643  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/MacOS/data_frame_analyzer
f861e64db133a088d7e68bd881048972d2292b5440f9884ecb87bded5d136b90  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/MacOS/normalize
941acd3c08a518e58d5155d57506349133f765b4288682efcd4c41dadf51d435  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/Resources/ml-en.dict
643765bfc3cbe3c86a6d93948c4a99db5c7c6381c68fecf905018fb6e78e18b7  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/_CodeSignature/CodeResources
5c16bf3ed32d081c016e197570d889772ed45cd0b71d49af1748a267ec970b58  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libMlApi.dylib
8c28525ffa6b7f74ca2bdcfaf417e41198df94dc1782cbda1acb3aeb895070bc  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libMlConfig.dylib
dd576ddb2705ab8d13adfe7df35aa9e5da724cf78077f0a03e3863b9d27752ab  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libMlCore.dylib
63bae92bcd3adb3e26457313a31411cee30dd11cac7580008d7d059fad8a59ac  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libMlMaths.dylib
0096bdb60f1cba47f968ada384597d7ed607e9e2a1ee5e9eeb9f9b28a336eb0b  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libMlModel.dylib
f6ef93c418c9184f9932129d36253162437767f806ce55ddb07bb410639450cd  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libboost_date_time-clang-darwin42-mt-1_65_1.dylib
9590066ef8fdd2bfdff381f7764c0e400f21d3f323f2dcf9e9c48fb6c9e8122a  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libboost_filesystem-clang-darwin42-mt-1_65_1.dylib
ba7d5dfa030912429067401488fb2ba5931d5954275699e3b9a8bcff3ff3639c  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libboost_iostreams-clang-darwin42-mt-1_65_1.dylib
e680570e9d9673c657e594a1c4fa3657ee74702054a60bd7292aca675849ff6f  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libboost_program_options-clang-darwin42-mt-1_65_1.dylib
af377de7c6b32bf8fc2646f90e824f9a1411f4c25f5fba6e3bb382a28be1186e  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libboost_regex-clang-darwin42-mt-1_65_1.dylib
a1f05b181a6e372549546f8c8566252a8899e3e85e520fac9f237839f5ea4a72  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libboost_system-clang-darwin42-mt-1_65_1.dylib
2f86dac4214f8ce934feaceb24b14ded8b465b66dbae2bb9d6c5ef2230ffdf86  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libboost_thread-clang-darwin42-mt-1_65_1.dylib
8547be29c291038bc29571a05507325e5275533a066f2b321e599089c33c85f3  modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/liblog4cxx.10.dylib

Notice that all files in lib have a different checksum. Inspecting the two files with a hex editor (I have used bvi) shows some differences. Whereas in the .dylib from the .tar.gz, we see relative paths like @rpath/libMlApi.dylib, these are resolved to absolute paths in the binary installed via Homebrew: /usr/local/opt/elasticsearch-full/libexec/modules/x-pack-ml/platform/darwin-x86_64/controller.app/Contents/lib/libMlApi.dylib which means that the dylib file has indeed been changed.
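
The per-file comparison above can be automated. The following is a minimal sketch, not the commands used in this thread: it hashes every file under two directory trees and lists the relative paths whose SHA-256 digests differ. The function names `hash_tree` and `differing_files` are made up for illustration.

```python
import hashlib
from pathlib import Path


def hash_tree(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def differing_files(root_a, root_b):
    """Return sorted relative paths that differ or exist on one side only."""
    a, b = hash_tree(root_a), hash_tree(root_b)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

Calling `differing_files(homebrew_dir, tarball_dir)` on the two controller.app directories (both arguments being whatever paths they live at on your machine) should report only the files under Contents/lib if the observation above holds.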

I have also checked 7.4.0, but the problem does not occur there because the native bits of the x-pack-ml module were not yet packaged as a .app.

Pinging @elastic/ml-core (:ml)

This might also be related to #46498.

This _might_ also be related to #46498.

Yes, it most definitely is. As part of the work for that we have to get Apple to notarize the ML C++ executables. One requirement for this is to sign them. (Another is the directory structure change from elastic/ml-cpp#593, but that's not the direct cause of this problem.) The problem is that on installation Homebrew rewrites the rpaths from relative to absolute, invalidating the signatures. When Homebrew messes with the binary contents it should also really remove signatures.

I guess the workaround until then is to manually remove the signatures - see https://reverseengineering.stackexchange.com/questions/13622/remove-code-signature-from-a-mac-binary
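
To illustrate why rewriting the library paths breaks the signature, here is a toy sketch. It is not how Apple code signing works internally: an HMAC-SHA256 over the file bytes stands in for the real signature, and the key and the "file" contents are invented. The point is only that any byte change made after signing causes verification to fail.

```python
import hashlib
import hmac

# Hypothetical key; a stand-in for a real signing identity.
SIGNING_KEY = b"hypothetical-signing-key"


def sign(content: bytes) -> bytes:
    """Toy 'signature': an HMAC-SHA256 over the exact bytes of the file."""
    return hmac.new(SIGNING_KEY, content, hashlib.sha256).digest()


def verify(content: bytes, signature: bytes) -> bool:
    """True only if the content is byte-for-byte what was signed."""
    return hmac.compare_digest(sign(content), signature)


# Simplified stand-in for a dylib that records a relative library path.
shipped = b"HEADER|@rpath/libMlApi.dylib|CODE"
signature = sign(shipped)

# Mimics an installer rewriting the relative path to an absolute one.
installed = shipped.replace(
    b"@rpath/libMlApi.dylib",
    b"/usr/local/opt/elasticsearch-full/libexec/modules/x-pack-ml/"
    b"platform/darwin-x86_64/controller.app/Contents/lib/libMlApi.dylib",
)

print(verify(shipped, signature))    # True
print(verify(installed, signature))  # False
```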

One way we could fix this in the future would be to change where in the release process the Homebrew artifacts are created. I believe at the moment it's:

build unsigned tar.gz artifact -> sign -> notarize -> upload to download site
                                                   \> extract the bits required by Homebrew

But if it were:

build unsigned tar.gz artifact -> sign -> notarize -> upload to download site
                               \> extract the bits required by Homebrew

then this problem wouldn't happen.

Also 7.4.2 is currently planned to ship with a JDK build where the executables are signed - I wonder if that will have the same problem?

/cc @jasontedor and @Conky5 since it relates to Homebrew and notarization.

The workaround that I have been using thus far has been to replace everything in $(brew --prefix elasticsearch-full)/libexec except for the config and plugin symlinks with files pulled from the standard .tar.gz.

I've added this workaround to the original post so it can be more readily seen.
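
For anyone who wants to script this workaround, here is a rough sketch under a few assumptions: `extracted` is the directory you unpacked the standard .tar.gz into, `libexec` is the Homebrew libexec directory, and the function name `replace_libexec` is made up. It defaults to a dry run because it deletes files. Entries that exist only in libexec are left untouched.

```python
import shutil
from pathlib import Path

# Entries in libexec that the workaround leaves alone.
KEEP = {"config", "plugins"}


def replace_libexec(extracted, libexec, dry_run=True):
    """Replace everything in libexec except KEEP with the tarball's copy.

    Returns the names that were (or, in a dry run, would be) replaced.
    """
    extracted, libexec = Path(extracted), Path(libexec)
    replaced = []
    for src in sorted(extracted.iterdir()):
        if src.name in KEEP:
            continue
        dest = libexec / src.name
        replaced.append(src.name)
        if dry_run:
            continue
        # Remove the Homebrew-installed copy so no rewritten binaries remain.
        if dest.is_dir() and not dest.is_symlink():
            shutil.rmtree(dest)
        elif dest.exists() or dest.is_symlink():
            dest.unlink()
        # Copy the pristine entry, preserving symlinks as symlinks.
        if src.is_dir() and not src.is_symlink():
            shutil.copytree(src, dest, symlinks=True)
        else:
            shutil.copy2(src, dest, follow_symlinks=False)
    return replaced
```

Run it once with the default `dry_run=True` to see what would be replaced, then again with `dry_run=False` after stopping Elasticsearch.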

On OSX installed with Homebrew
Add: xpack.ml.enabled: false
To: /usr/local/etc/elasticsearch/elasticsearch.yml
And restart: brew services restart elastic/tap/elasticsearch-full
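
If you would rather script the config change, something like the following could work. It is only a sketch: the helper name `disable_ml` is made up, and it handles only the flat, top-level form of the key (not nested YAML or duplicate entries).

```python
from pathlib import Path

SETTING = "xpack.ml.enabled"


def disable_ml(config_path) -> bool:
    """Ensure `xpack.ml.enabled: false` is set; return True if the file changed."""
    path = Path(config_path)
    lines = path.read_text().splitlines()
    wanted = f"{SETTING}: false"
    new_lines = []
    found = False
    for line in lines:
        # Rewrite an existing top-level, uncommented setting in place.
        if line.startswith(SETTING + ":"):
            new_lines.append(wanted)
            found = True
        else:
            new_lines.append(line)
    if not found:
        new_lines.append(wanted)
    if new_lines == lines:
        return False
    path.write_text("\n".join(new_lines) + "\n")
    return True
```

For the Homebrew install described here, that would be `disable_ml("/usr/local/etc/elasticsearch/elasticsearch.yml")`, followed by the restart command above.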

Yes, please see the workaround section in the original issue. We're aware that there are workarounds; however, this appears to be a legitimate issue that needs to be fixed. Workarounds are not fixes, they are band-aids.

Also worth noting: that workaround does not completely resolve the issue. Elasticsearch still attempts to load the x-pack-ml module, which in turn loads controller.app, whose signature has been invalidated. This causes macOS to display a problem report window indicating that the signature was invalid.

Thanks for the report, and your patience as we worked through this issue. We have opened elastic/homebrew-tap#20 which should address it.

It recovered after I deleted logs/*.
