Incubator-mxnet: MXNet for Scala 2.12 and 2.13

Created on 2 Jul 2020 · 15 Comments · Source: apache/incubator-mxnet

The Scala language binding is badly needed for high-performance training/inference around Apache Spark. The current Scala language binding is for 2.11 and the MXNet version is 1.5.1.

Please add 2.12 and 2.13 packages for 1.6.0 and up. TensorFlow has eaplatanios/tensorflow_scala, which, while not an official release, has a lot of traction and builds for both 2.12 and 2.13.

Call for Contribution · Feature request · v1.x

All 15 comments

I would like to contribute if anyone can give me a few pointers on what is needed to make this happen.

@cosmincatalin have you tried building from source https://github.com/apache/incubator-mxnet/tree/v1.x/scala-package#build-from-source?

Are you interested in the CPU or GPU packages? There are some licensing issues with the binaries at org.apache.mxnet (GPU packages may be subject to CUDA EULA and thus incompatible with Apache 2 License; CPU packages redistribute libquadmath.so which is GPL and thus incompatible with Apache 2 License). The latter can be easily fixed by not putting libquadmath.so into the jar. So if you are interested in having the CPU packages for 1.6 (and 1.7 release) and Scala 2.12 and 2.13, we can discuss more about how to make it happen. The first step is to verify the build-from-source works locally for 2.12 and 2.13.

For the GPU packages, it depends on NVIDIA. They are discussing internally whether they can make their EULA compatible with Apache License 2.

You can also refer to https://issues.apache.org/jira/browse/INFRA-20442 for more information.

I've tinkered a little with building from source, but wasn't very successful. I guess I need to focus on it more. To answer your second question, yes, I am interested specifically in the CPU packages.

hey, I'm interested in this, count me in if you want help @cosmincatalin

hey @leezu, I got the project to compile using make. But when I run mvn compile on the Scala folder I get this error:

[INFO] Compiling 2 source files to /home/gustavo/git/apache-mxnet-src-1.6.0-incubating/scala-package/init/target/classes at 1597332091507
[INFO] compiler plugin: BasicArtifact(org.scalamacros,paradise_2.11.8,2.1.0,null)
[ERROR] error: scala.reflect.internal.MissingRequirementError: object java.lang.Object in compiler mirror not found.

then I realized that the pom.xml file is configured like this:
<java.version>1.7</java.version>

I have Java 11 installed and that may be the source of the error above, would that be correct?
Also, is there any reason to set the java.version variable to 1.7?

@tavoaqp I think that's right. cc @lanking520 who's helping with a build instruction and planning to share past experiences on upgrade like this.
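A plausible reading of the error above, offered as an assumption rather than a confirmed diagnosis: the log shows the build using a Scala 2.11.8 compiler plugin, and "object java.lang.Object in compiler mirror not found" is a commonly reported symptom of running an older Scala compiler on Java 9 or later. The sketch below only reads the standard java.specification.version system property and flags a JDK newer than 8 before the build is started. The class name JdkCheck and the "newer than 8" threshold are illustrative choices, not part of the MXNet build.

```java
// Sketch: warn when the running JDK is newer than Java 8.
// JdkCheck is a hypothetical helper, not something shipped with MXNet.
final class JdkCheck {

    // Parses a specification version such as "1.7", "1.8", "9" or "11"
    // into its major version number (7, 8, 9, 11).
    static int majorVersion(String spec) {
        String[] parts = spec.trim().split("\\.");
        int first = Integer.parseInt(parts[0]);
        // Up to Java 8 the property reads "1.x"; from Java 9 on it is just "x".
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    // Assumption: the Scala 2.11 toolchain from this thread wants Java 8 or older.
    static boolean likelyTooNew(String spec) {
        return majorVersion(spec) > 8;
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        if (likelyTooNew(spec)) {
            System.out.println("JDK " + spec + " detected; the Scala 2.11 build may need Java 8.");
        } else {
            System.out.println("JDK " + spec + " detected.");
        }
    }
}
```

If the check fires, pointing JAVA_HOME at a Java 8 installation before running mvn is the least invasive thing to try, which matches the Java 8 Docker image mentioned further down the thread.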

I have some instructions in this repository https://github.com/cosmincatalin/mxnet-compiler. I use a custom-made Docker image to compile a Linux-based MXNet library and then I compile the Scala 2.11 binding. The image has Java 8 baked in.
Roughly the same procedure can be used to generate 2.12 bindings, by modifying some Maven configs and libraries. But I don't know how to generate all of the 2.11, 2.12 and 2.13 bindings.

Instead of building from source, you can get the pip wheel for mxnet and put the .so in the lib folder.

I would suggest starting with mvn verify to see if it runs successfully. I tried to upgrade to 2.12 last year but failed due to some dependency mismatch issues. If you can get past them, it could be OK.

Another beast in the code is the code generation system. I am not sure whether 2.12 or 2.13 supports quasiquotes consistently enough for it to work, but it is worth a try. Finally, there is the Spark support: you may need to change some code to get Spark fully supported there.

thanks @lanking520 ! seems like a lot of work :smile: , @cosmincatalin I will try your setup!

@lanking520 Yeah, that works too. I tried with the .so files from the wheel, and that worked. That being said, 2.12 and 2.13 are a must, since a lot of Scala code is now on 2.12 at least. Spark is now on 2.12. It would have been nice if the whole setup was based on sbt rather than Maven.

hey @lanking520 I made some progress: I switched to Scala 2.12.12 and fixed the dependencies. Everything compiles (with some warnings though), but when it comes to compiling the examples project I get this:

[INFO] --- scala-maven-plugin:3.4.4:doc-jar (compile) @ mxnet-examples ---              
/home/gustavo/git/incubator-mxnet/scala-package/examples/src/main/java/org/apache/mxnetexamples/javaapi/benchmark/ObjectDetectionBenchmark.java:35: error: not found: type NDArray$
    private NDArray$ NDArray = NDArray$.MODULE$;                                
            ^                                                                                   
/home/gustavo/git/incubator-mxnet/scala-package/examples/src/main/java/org/apache/mxnetexamples/javaapi/infer/bert/BertQA.java:52: error: not found: type NDArray$
    private static NDArray$ NDArray = NDArray$.MODULE$;
                   ^
/home/gustavo/git/incubator-mxnet/scala-package/examples/src/main/java/org/apache/mxnetexamples/javaapi/infer/predictor/PredictorExample.java:48: error: not found: type NDArray$
    private static NDArray$ NDArray = NDArray$.MODULE$;

Any pointers?
Thanks!

Yeah, I've stumbled on this one as well. Come to think of it, maybe I haven't actually been able to generate bindings for 2.12.

Actually, the NDArray$ field is used for Java to access the NDArray class.
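For readers unfamiliar with the pattern: the Scala compiler turns a singleton `object Foo` into a JVM class named `Foo$` whose single instance lives in the static field `MODULE$`, and Java code reaches the object's methods through that field. The sketch below imitates this by hand in plain Java so it runs without Scala. Greeter$ and GreeterClient are hypothetical stand-ins, not MXNet classes, and the snippet does not reproduce or explain the doc-jar failure itself.

```java
// Hand-written stand-in for what the Scala compiler emits for
// `object Greeter { def greet(name: String) = ... }`. Hypothetical names.
final class Greeter$ {
    // The one instance of the singleton, exposed the way Scala exposes it.
    public static final Greeter$ MODULE$ = new Greeter$();

    private Greeter$() {}

    String greet(String name) {
        return "hello, " + name;
    }
}

final class GreeterClient {
    // Same shape as the failing example lines: a field of type Foo$
    // initialised from Foo$.MODULE$.
    private static final Greeter$ Greeter = Greeter$.MODULE$;

    static String run() {
        return Greeter.greet("mxnet");
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In the real project the `NDArray$` class only exists once the Scala sources have been compiled, so any step that reads the Java examples without the compiled Scala classes on its classpath cannot resolve the type.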

So, is there a way to get past the issue with the NDArray?

I haven't had spare time to look into this! I'm still not sure why this is happening. I believe the POM project is not generating the Java bytecode in the correct path.
