Thank you @xiaoxiaofengzi for the clarification!
Unfortunately, this is a known problem: MMLSpark uses an old version of LightGBM that is broken in terms of dependencies.
You can track this PR and these issues (1, 2) to learn when the problem is solved on the Spark-package side. I think you can paste your log there to nudge them gently.
On our side, starting from version 2.2.2, we automatically guarantee GLIBC <= 2.14 and GLIBCXX <= 3.4.19:
https://github.com/Microsoft/LightGBM/blob/a694712b7fb86cd532eea2c1781b58d4ba58436a/helpers/check_dynamic_dependencies.py#L18-L23
https://github.com/Microsoft/LightGBM/blob/a694712b7fb86cd532eea2c1781b58d4ba58436a/helpers/check_dynamic_dependencies.py#L25-L31
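For readers who want to run a similar check by hand, here is a minimal Python sketch. It is not the linked helper script. It assumes you already have a text dump of the library's dynamic symbols (for example from `objdump -T lib_lightgbm.so`), and the function names are made up for illustration. The limits are the ones stated above.

```python
import re

# Upper bounds quoted in this comment: GLIBC <= 2.14 and GLIBCXX <= 3.4.19.
LIMITS = {"GLIBC": (2, 14), "GLIBCXX": (3, 4, 19)}

# Matches tags such as GLIBC_2.2.5 or GLIBCXX_3.4.20 (but not GLIBC_PRIVATE).
_VERSION_RE = re.compile(r"\b(GLIBC|GLIBCXX)_(\d+(?:\.\d+)*)\b")


def max_required_versions(symbol_dump: str) -> dict:
    """Return the highest GLIBC / GLIBCXX version tag found in the text."""
    highest = {}
    for name, version in _VERSION_RE.findall(symbol_dump):
        parsed = tuple(int(part) for part in version.split("."))
        if parsed > highest.get(name, ()):
            highest[name] = parsed
    return highest


def violations(symbol_dump: str, limits: dict = LIMITS) -> dict:
    """Return the version tags that exceed the given limits."""
    found = max_required_versions(symbol_dump)
    return {
        name: version
        for name, version in found.items()
        if name in limits and version > limits[name]
    }
```

An empty result from `violations(...)` means the dump references nothing newer than the stated limits. A result containing `GLIBCXX` would match the error reported in this thread.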
Thanks for your patience and sorry for the inconvenience!
_Originally posted by @StrikerRUS in https://github.com/Microsoft/LightGBM/issues/1858#issuecomment-440610638_
LightGBM 2.2.2 still cannot run on an older system (CentOS 7):

Running the jar (lightgbmlib) throws an exception:
Caused by: java.lang.UnsatisfiedLinkError: /tmp/mml-natives1380894375751123942/lib_lightgbm.so: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /tmp/mml-natives1380894375751123942/lib_lightgbm.so)
My environment:
Linux tw-node41 3.10.0-327.el7.x86_64 #1 SMP Thu Nov 19 22:10:57 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
ldd (GNU libc) 2.17
=======================================
I could compile the LightGBM source in this environment successfully,
but I get the same exception when I run the examples from the example folders.

The dependency is determined by your build toolchain. You should use gcc-4.8.

My gcc version is 4.8.
Other people have also encountered this issue: https://github.com/Azure/mmlspark/issues/335#issuecomment-451105058
@puyvqi
Can you try to build with this Docker image: https://github.com/Microsoft/LightGBM/blob/master/.vsts-ci.yml#L7 ?
I know Ubuntu 14.04 and above can run it successfully; I just want to know whether it can run on CentOS 7.
It is very strange. If the build machine and the execution machine are the same, there should be no dependency issues.
Yes, so weird: it compiles successfully but fails at run time on the same machine. The GLIBC issue is gone, but GLIBCXX_3.4.20 is now required at run time. My environment only has GLIBCXX_3.4.19 and below.
@puyvqi The MMLSpark team released version 0.16 a day ago with new LightGBM binaries. You can try it.
Also pinging @imatiach-msft for any thoughts.
@imatiach-msft
Hello, I am sorry to report that I faced the same problem when using the released 0.16 version on CentOS 7.
Could you rebuild version 0.16 on CentOS 7 and give me the Maven coordinates to download it?
<dependency>
    <groupId>Azure</groupId>
    <artifactId>mmlspark</artifactId>
    <version>0.16</version>
</dependency>
19/04/08 18:07:01 WARN scheduler.TaskSetManager: Lost task 2.0 in stage 71.0 (TID 8298, bigdata.node4, executor 3): java.lang.UnsatisfiedLinkError: /data/cdh/yarn/nm/usercache/hadoop/appcache/application_1551665533369_5974/container_1551665533369_5974_01_000004/tmp/mml-natives5619078166263179450/lib_lightgbm.so: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /data/cdh/yarn/nm/usercache/hadoop/appcache/application_1551665533369_5974/container_1551665533369_5974_01_000004/tmp/mml-natives5619078166263179450/lib_lightgbm.so)
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
at java.lang.Runtime.load0(Runtime.java:809)
at java.lang.System.load(System.java:1086)
at com.microsoft.ml.spark.NativeLoader.loadLibraryByName(NativeLoader.java:59)
at com.microsoft.ml.spark.LightGBMUtils$.initializeNativeLibrary(LightGBMUtils.scala:38)
at com.microsoft.ml.spark.TrainUtils$$anonfun$5.apply(TrainUtils.scala:208)
at com.microsoft.ml.spark.TrainUtils$$anonfun$5.apply(TrainUtils.scala:205)
at com.microsoft.ml.spark.StreamUtilities$.using(StreamUtilities.scala:29)
at com.microsoft.ml.spark.TrainUtils$.trainLightGBM(TrainUtils.scala:204)
at com.microsoft.ml.spark.LightGBMClassifier$$anonfun$3.apply(LightGBMClassifier.scala:83)
at com.microsoft.ml.spark.LightGBMClassifier$$anonfun$3.apply(LightGBMClassifier.scala:83)
at org.apache.spark.sql.execution.MapPartitionsExec$$anonfun$5.apply(objects.scala:188)
at org.apache.spark.sql.execution.MapPartitionsExec$$anonfun$5.apply(objects.scala:185)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Just for reference:
It seems that libraries compiled on Ubuntu are very often incompatible with CentOS:
https://github.com/dmlc/xgboost/pull/4302#issuecomment-477403940
Is the https://mvnrepository.com/artifact/com.microsoft.ml.lightgbm/lightgbmlib artifact owned by this project or MMLSpark?
If you download the latest artifact (2.2.300) and do a strings search on it, you can see that it still has a reference to GLIBCXX_3.4.20.
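A rough Python equivalent of that strings search, in case it helps others reproduce it. This is only a sketch: the function names are hypothetical, and it assumes the jar is a regular zip archive whose native libraries have names ending in `.so`.

```python
import io
import re
import zipfile

# Byte pattern for version tags such as GLIBCXX_3.4.20.
_TAG_RE = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)+)")


def glibcxx_tags(data: bytes) -> list:
    """Return the sorted, de-duplicated GLIBCXX version tags found in raw bytes."""
    versions = {
        tuple(int(part) for part in match.group(1).decode("ascii").split("."))
        for match in _TAG_RE.finditer(data)
    }
    # Sort numerically so that 3.4.9 comes before 3.4.19.
    return ["GLIBCXX_" + ".".join(map(str, v)) for v in sorted(versions)]


def glibcxx_tags_in_jar(jar_bytes: bytes) -> dict:
    """Map each .so entry inside a jar (a zip archive) to its GLIBCXX tags."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        return {
            name: glibcxx_tags(jar.read(name))
            for name in jar.namelist()
            if name.endswith(".so")
        }
```

Read the jar with `open(path, "rb").read()` and pass the bytes to `glibcxx_tags_in_jar`. If the last tag listed for a library is newer than what the target system's libstdc++ provides, loading is expected to fail as shown earlier in the thread.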
@pawitp I created those Maven artifacts. It looks like I shouldn't be compiling the code on Ubuntu, which I am doing currently. Maybe I should compile the binaries on CentOS.
@pawitp It's owned by MMLSpark
@imatiach-msft Hi!
Which Ubuntu version do you use?
We deployed a Docker image based on Ubuntu 14 to avoid the GLIBCXX_3.4.20 dependency, which comes from Ubuntu 16.
You can always download the latest master binaries built in this Docker image via the link in the badge at the top of the installation guide page (the link is constructed dynamically by this JS script, so I cannot post a direct link here):

https://lightgbm.readthedocs.io/en/latest/Installation-Guide.html
I think we can compile with the -DUSE_SWIG=ON flag inside this Docker image too and provide the needed artifacts, to free you from manual compiling. But this will require your help with the Java and SWIG installation.
Hi @StrikerRUS
I'm currently running my Ubuntu machine in Hyper-V on my Windows laptop. The version is:
Ubuntu 16.04.4 LTS
"I think we can compile with the -DUSE_SWIG=ON flag inside this Docker image too"
That would be amazing! We would probably also need to set up Java and set the JAVA_HOME environment variable, as well as install SWIG in the Docker image.
I'm not sure how to get started. I guess I just need to modify that docker file and test running it.
I don't see where in the docker file you pull the latest lightgbm code and compile it. How is that done? Ah, I guess it is done here:
https://github.com/Microsoft/LightGBM/blob/master/.ci/test.sh
It looks like I would have to modify that file?
"The version is: Ubuntu 16.04.4 LTS"
Oh, that's the root cause of the glibc issue! At least now we know how the problem can be solved.
That Docker image is pulled automatically by Azure Pipelines via the following lines:
https://github.com/Microsoft/LightGBM/blob/c56412a859d4968f2b720514306be3404552b385/.vsts-ci.yml#L4-L7
https://github.com/Microsoft/LightGBM/blob/c56412a859d4968f2b720514306be3404552b385/.vsts-ci.yml#L16
The bash script you've referenced (along with this one) is used to install additional software and run tests after entering the Docker container.
I think we can work in the following manner. Can you please post a brief guide here on how to set up the needed versions of Java and SWIG inside the Docker image from scratch (you can play with it locally: https://hub.docker.com/r/lightgbm/vsts-agent)? After that I'll modify the image.
@guolinke Would you mind setting up one more Azure Pipelines job for SWIG, at least for Linux and without any tests? Compilation only. I suppose it won't take much time. So many of our users have suffered from GLIBC issues!
I'm sorry, I'm very far from the Java world, but it seems to me that we'll have some trouble installing the latest Java:
https://launchpad.net/~webupd8team/+archive/ubuntu/java
The Oracle JDK License has changed for releases starting April 16, 2019.
The new Oracle Technology Network License Agreement for Oracle Java SE is substantially different from prior Oracle JDK licenses. The new license permits certain uses, such as personal use and development use, at no cost -- but other uses authorized under prior Oracle JDK licenses may no longer be available. Please review the terms carefully before downloading and using this product. An FAQ is available here: https://www.oracle.com/technetwork/java/javase/overview/oracle-jdk-faqs.html
Oracle Java downloads now require logging in to an Oracle account to download Java updates, like the latest Oracle Java 8u211 / Java SE 8u212. Because of this I cannot update the PPA with the latest Java (and the old links were broken by Oracle).
For this reason, THIS PPA IS DISCONTINUED (unless I find some way around this limitation).
https://dev.to/lemuelogbunude/java-is-still-free-286
Is OpenJDK a workaround?
For normal usage, OpenJDK is fully compatible with Oracle JDK, so yes, there should be no issue with it.
EDIT: I recommend using OpenJDK 8 for widest compatibility.
@guolinke Can you please answer the question in the last paragraph of https://github.com/Microsoft/LightGBM/issues/1945#issuecomment-482305298? Your answer may make all further work pointless.
@StrikerRUS Sure, I think it is okay.
@StrikerRUS I've made a rough first attempt at getting this to work here:
https://github.com/Microsoft/LightGBM/pull/2124
basically, we just want to:
1.) Install SWIG
2.) Install Java
3.) Set up JAVA_HOME
4.) Run the build command with -DUSE_SWIG=on
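Steps 3 and 4 could be sketched roughly as below. This is only an illustration, not what the CI scripts do. The function names are hypothetical, the `build` sub-directory and `make -j4` are assumptions about the usual CMake flow, and steps 1 and 2 are left out because the package names depend on the image.

```python
import os
import subprocess


def swig_build_plan(java_home: str, source_dir: str = ".") -> dict:
    """Describe the SWIG build: working directory, environment and commands."""
    env = dict(os.environ, JAVA_HOME=java_home)  # step 3: JAVA_HOME for the build
    return {
        "cwd": os.path.join(source_dir, "build"),  # assumed out-of-source build dir
        "env": env,
        "commands": [
            ["cmake", "-DUSE_SWIG=ON", ".."],  # step 4: configure with SWIG enabled
            ["make", "-j4"],                   # assumed build step
        ],
    }


def run_plan(plan: dict) -> None:
    """Execute the planned commands, stopping at the first failure."""
    os.makedirs(plan["cwd"], exist_ok=True)
    for command in plan["commands"]:
        subprocess.run(command, cwd=plan["cwd"], env=plan["env"], check=True)
```

Calling `run_plan(swig_build_plan("/path/to/jdk", "/path/to/LightGBM"))` would then run the build, with both paths being placeholders. Keeping the plan separate from its execution makes it easy to print and inspect the commands before running anything inside the container.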
How could I verify/debug my changes? Is there an easy way I could run the setup script & docker and look at the outputs?
Also, it would be nice to do something similar for Windows and macOS.
@imatiach-msft Thanks!
You can install Docker on your machine and pull our image from here: https://hub.docker.com/r/lightgbm/vsts-agent. Since we are going to do all the installation inside the Docker image for speed, the scripts will not be used, except for one command to compile LightGBM with the SWIG flag -DUSE_SWIG=on.
I'll try to update the Docker image today based on the commands from your PR and report the results.
"Also, it would be nice to do something similar for Windows and macOS."
Let's start with Linux, OK? 😃 For macOS I suppose brew will help a lot.
@StrikerRUS It looks like the builds for the PR failed for some reason. It seems that Java was not installed properly even with my apt-get install command.
@imatiach-msft Don't worry! I'll create a PR within an hour. Right now it compiles against Clang. Right after this test I'll ping you for a review.
"Since we are going to do all the installation inside the Docker image for speed, the scripts will not be used,"
I'm confused: does this mean we manually set up the Docker image? Shouldn't the commands actually go in the Dockerfile to initialize the image? To me it seems better to add the commands that update Java and build with SWIG to the setup scripts I modified in my PR. Also, it would be nice to eventually have a Travis build with some Java tests for the Java API running in the LightGBM repo (currently all testing of the Java wrappers is done on the MMLSpark side).
@StrikerRUS thanks, the PR looks great. Does this mean build artifacts will be created on each build? I tried to find the build artifacts on the build for that PR (to look at the .so files and jar) but couldn't find them, although it looks like this line should have published something:
cp $BUILD_DIRECTORY/build/lightgbmlib.jar $BUILD_ARTIFACTSTAGINGDIRECTORY/lightgbmlib.jar
@imatiach-msft Thanks for your review! As I said in my comment here https://github.com/Microsoft/LightGBM/pull/2125#issuecomment-485310784, artifacts are not published for PRs. They are published for each commit in any branch, but not for PRs. Since I worked in the docker branch for that PR, you can find the jar artifact in the builds for this branch.
https://github.com/Microsoft/LightGBM/blob/d115769c2a2ddffadc76c7b84739a47937114c77/.vsts-ci.yml#L56-L61


https://dev.azure.com/lightgbm-ci/lightgbm-ci/_build/results?buildId=1856
Hi there, this is really exciting work! I have encountered the same error on my CentOS 7 VM and really appreciate that there is a solution out there now. Since the PR has not been officially merged into master yet, if I want to use this most recent update in my LightGBM model, do I just use the lightgbmlib.jar in the package asset folder you mentioned above? Many thanks!
@XinyunTang Hi!
Sure, you can use the jar file from that PR's artifacts. Here is the direct link: https://dev.azure.com/lightgbm-ci/8461a79b-5dce-4085-ad70-4410b7135276/_apis/build/builds/1856/artifacts?artifactName=PackageAssets&api-version=5.1-preview.5&%24format=zip. After the merge, it will be possible to download the jar file with the latest code directly via the artifacts download badge on this page.
We hope that compiling on Ubuntu 14 will solve the incompatibility problems on CentOS. Your feedback is very welcome as well!
@imatiach-msft Everything has finally been updated. The JAR file is now in the PackageAssets artifact.