Onnxruntime: Core Dump with Java API for BERT

Created on 10 Sep 2020  路  8Comments  路  Source: microsoft/onnxruntime

Describe the bug
A clear and concise description of what the bug is.
I am trying to follow the sample code of java to deploy the Bert model. However, it raise me a core dump error.

Urgency
If there are particular important use cases blocked by this or strict project-related timelines, please share more information and dates. If there are no hard deadlines, please specify none.

System information
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): MacOS 10.14.6
ONNX Runtime installed from (source or binary):pip install --quiet --upgrade onnxruntime==1.4.0
ONNX Runtime version: pip install --quiet --upgrade onnxruntime-tools==1.4.0
Python version:3.7.7
Visual Studio version (if applicable):
GCC/Compiler version (if compiling from source):
CUDA/cuDNN version:
GPU model and memory:

To Reproduce

  • Describe steps/code to reproduce the behavior.
  • Attach the ONNX model to the issue (where applicable) to expedite investigation.

the code to reproduce the bug and model is attached here

Expected behavior
A clear and concise description of what you expected to happen.
run smoothly and give me the probability output, and faster than tf serving.
Screenshots
If applicable, add screenshots to help explain your problem.

Additional context
Add any other context about the problem here. If the issue is about a particular model, please share the model details as well to facilitate debugging.

Java bug

Most helpful comment

I think there are some build system changes between the different ONNX Runtime versions, but the Python and Java APIs should be similar performance. What bits are you measuring (e.g. does it include the input tensor creation, the output tensor extraction or just the run call)? And are you letting the JVM warm up a bit first? Python doesn't have as much of a warmup effect, but the first call to run on the JVM is likely to be much slower than the hundredth, or thousandth call to run. Also the native interop is different between the official version and the JavaCPP version, so we'll need to know which one you're measuring.

All 8 comments

What version of ONNX Runtime are you using?

There's a reference to 1.2.0 in the JVM crash log, and I get an exception thrown by the code when running against 1.4.0/1.3.1 because it's asking for output 1 when only output 0 exists. Changing the code with the following diff makes it execute fine on my Mac, using ONNX Runtime 1.4.0 CPU and 1.3.1 CPU (I also modified it to accept the model path as an argument, and it's running using single source file execution on Java 14, but those should be irrelevant).

diff:

133,134c133,134
<                     logger.log(Level.INFO, "Output type = " + output.get(0).toString());
<                     logger.log(Level.INFO, "Output value = " + output.get(0).getValue().toString());
---
>                     logger.log(Level.INFO, "Output type = " + output.get(1).toString());
>                     logger.log(Level.INFO, "Output value = " + output.get(1).getValue().toString());

output:

apocock@apocock-mac:~/Downloads/onnx-test$ java -cp ~/.m2/repository/com/microsoft/onnxruntime/onnxruntime/1.4.0/onnxruntime-1.4.0.jar onnx_trail/src/test/java/sample/ScoreMNIST.java model.onnx 
Sep 10, 2020 10:13:57 AM sample.ScoreMNIST main
INFO: Loading model from model.onnx
Sep 10, 2020 10:13:57 AM sample.ScoreMNIST main
INFO: Inputs:
Sep 10, 2020 10:13:57 AM sample.ScoreMNIST main
INFO: NodeInfo(name=segment_ids_1:0,info=TensorInfo(javaType=INT32,onnxType=ONNX_TENSOR_ELEMENT_DATA_TYPE_INT32,shape=[-1, 32]))
Sep 10, 2020 10:13:57 AM sample.ScoreMNIST main
INFO: NodeInfo(name=label_ids_1:0,info=TensorInfo(javaType=INT32,onnxType=ONNX_TENSOR_ELEMENT_DATA_TYPE_INT32,shape=[-1]))
Sep 10, 2020 10:13:57 AM sample.ScoreMNIST main
INFO: NodeInfo(name=input_mask_1:0,info=TensorInfo(javaType=INT32,onnxType=ONNX_TENSOR_ELEMENT_DATA_TYPE_INT32,shape=[-1, 32]))
Sep 10, 2020 10:13:57 AM sample.ScoreMNIST main
INFO: NodeInfo(name=input_ids_1:0,info=TensorInfo(javaType=INT32,onnxType=ONNX_TENSOR_ELEMENT_DATA_TYPE_INT32,shape=[-1, 32]))
Sep 10, 2020 10:13:57 AM sample.ScoreMNIST main
INFO: Outputs:
Sep 10, 2020 10:13:57 AM sample.ScoreMNIST main
INFO: NodeInfo(name=loss/Sigmoid:0,info=TensorInfo(javaType=FLOAT,onnxType=ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT,shape=[-1, 6299]))
Sep 10, 2020 10:13:57 AM sample.ScoreMNIST main
INFO: Output type = OnnxTensor(info=TensorInfo(javaType=FLOAT,onnxType=ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT,shape=[1, 6299]))
Sep 10, 2020 10:13:57 AM sample.ScoreMNIST main
INFO: Output value = [[F@3d285d7e
Sep 10, 2020 10:13:57 AM sample.ScoreMNIST main
INFO: Time taken for inference of ONNX model5
Sep 10, 2020 10:13:57 AM sample.ScoreMNIST main
INFO: DONE

Ahh. The crash dump says you're using the JavaCPP fork. You should take that up with @saudet. The official release works fine.

I can also confirm that it works fine with ONNX Runtime 1.4.0 on Mac with these changes:

--- pom.xml_org 2020-08-17 14:42:09.000000000 +0900
+++ pom.xml 2020-09-11 12:48:26.040687529 +0900
@@ -29,19 +29,19 @@
         <dependency>
             <groupId>org.bytedeco</groupId>
             <artifactId>onnxruntime</artifactId>
-            <version>1.2.0-1.5.3</version>
+            <version>1.4.0-1.5.4</version>
         </dependency>

         <dependency>
             <groupId>org.bytedeco</groupId>
             <artifactId>onnx-platform</artifactId>
-            <version>1.6.0-1.5.3</version>
+            <version>1.7.0-1.5.4</version>
         </dependency>

         <dependency>
             <groupId>org.bytedeco</groupId>
             <artifactId>onnxruntime-platform</artifactId>
-            <version>1.2.0-1.5.3</version>
+            <version>1.4.0-1.5.4</version>
         </dependency>
     </dependencies>

@@ -50,4 +50,4 @@
         <maven.compiler.target>1.8</maven.compiler.target>
     </properties>

Ahh. The crash dump says you're using the JavaCPP fork. You should take that up with @saudet. The official release works fine.

BTW, it's not a fork, it's a distribution. In the Python world, this is equivalent to, for example, Anaconda, and no one refers to Anaconda as a fork of all its packages. It's a distribution. Let's use the correct words.

Thanks @saudet @Craigacp
It works fine after I made the change on dependency.
One questions regarding the Java api vs python api. The python api shows better performance averagely (0.98 ms on python vs 2 on Java) with onnx runntime 1.4. Is that a common case as well?

I think there are some build system changes between the different ONNX Runtime versions, but the Python and Java APIs should be similar performance. What bits are you measuring (e.g. does it include the input tensor creation, the output tensor extraction or just the run call)? And are you letting the JVM warm up a bit first? Python doesn't have as much of a warmup effect, but the first call to run on the JVM is likely to be much slower than the hundredth, or thousandth call to run. Also the native interop is different between the official version and the JavaCPP version, so we'll need to know which one you're measuring.

It was the warmup Issue, Thanks for your detailed explain!

What's wrong with https://github.com/microsoft/onnxruntime/issues/6031 ? Thank you very much! @wppply

Was this page helpful?
0 / 5 - 0 ratings