Onnxruntime: How to choose CPU/GPU as the onnxruntime engine?

Created on 15 Jan 2019 · 9 Comments · Source: microsoft/onnxruntime

Is your feature request related to a problem? Please describe.
I am testing the performance of ONNX Runtime on a machine with both a CPU and a GPU. Since I have installed both MKL-DNN and TensorRT, I cannot tell whether my model runs on the CPU or the GPU. I have installed both the onnxruntime and onnxruntime-gpu packages from PyPI.

System information

ONNX Runtime version (you are using):
onnxruntime 0.1.3 and onnxruntime-gpu 0.1.3

Describe the solution you'd like
I want to choose the engine myself through a Python API.

Python enhancement

LexXia

👍1

Most helpful comment

You can try onnxruntime.get_device() to see which compilation settings were used. It returns the BACKEND_SETTING: https://github.com/Microsoft/onnxruntime/blob/master/onnxruntime/python/onnxruntime_pybind_state.cc#L43. For your issue, I think the best option for now is to use two separate Python environments.

xadupre on 16 Jan 2019

👍2
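
To illustrate the suggestion above, here is a minimal sketch of calling onnxruntime.get_device() from Python. The helper name describe_build is hypothetical, and the import is guarded so the snippet also runs where the package is missing. Note that this reports how the installed package was built, not where each op of a particular model actually runs.

```python
from typing import Optional


def describe_build() -> Optional[str]:
    """Return the device string the installed onnxruntime was built for.

    Returns None when onnxruntime is not installed in this environment.
    `describe_build` is a hypothetical helper name used for illustration.
    """
    try:
        import onnxruntime as ort
    except ImportError:
        return None
    # get_device() reports the build-time backend setting of the package.
    return ort.get_device()


device = describe_build()
if device is None:
    print("onnxruntime is not installed in this environment")
else:
    print("onnxruntime build targets:", device)
```

Running this in each of the two Python environments (one with onnxruntime, one with onnxruntime-gpu) should show which build each environment picks up.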

All 9 comments

For Python, it's not supported yet.

snnn on 15 Jan 2019

Not supported yet.

Is there any way to find out which device is actually used, like the "log_device_placement" config in TensorFlow?

LexXia on 16 Jan 2019

Not supported yet.

Is there any way to find out which device is actually used, like the "log_device_placement" config in TensorFlow?

No. Usually I set a breakpoint to verify it.
It's a good suggestion; we should add a visual tool.

snnn on 16 Jan 2019

xadupre on 16 Jan 2019

👍2

Hi @xadupre,

get_device() only partially solves the problem. For example, even if CUDA is enabled, some ops (like conv) may still run on the CPU.

snnn on 16 Jan 2019

So you would like to know which device is used for every node of the graph?

xadupre on 16 Jan 2019

Can onnxruntime now automatically choose the fastest provider to run each op, for example conv?

fumihwh on 11 Feb 2019

@fumihwh Thank you Wenhao! That's the goal ONNX Runtime wants to achieve. ONNX Runtime has a partitioning API designed to find the best graph partitioning (node assignment) based on each execution provider's capability. However, the current implementation of partitioning is a simple greedy algorithm based on the execution provider preferences that customers set. That means, if the execution provider preferences are "TensorRT", "Cuda", "CPU", then nodes will be assigned to TensorRT, Cuda, and CPU in that order, according to the capability results each provider returns.

linkerzhang on 18 Apr 2019
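
The greedy assignment described above can be sketched in plain Python. This is not ONNX Runtime's actual implementation: the function name greedy_partition, the node names, and the capability sets are all made up for illustration. It only shows the idea of walking the providers in preference order and letting each one claim the still-unassigned nodes it supports.

```python
from typing import Dict, List, Set


def greedy_partition(
    nodes: List[str],
    capabilities: Dict[str, Set[str]],
    preferences: List[str],
) -> Dict[str, str]:
    """Assign each node to the first preferred provider that supports it."""
    assignment: Dict[str, str] = {}
    for provider in preferences:
        supported = capabilities.get(provider, set())
        for node in nodes:
            # A node claimed by an earlier (more preferred) provider stays there.
            if node not in assignment and node in supported:
                assignment[node] = provider
    return assignment


# Hypothetical graph nodes and per-provider capability results.
nodes = ["conv1", "relu1", "nonzero1", "gemm1"]
capabilities = {
    "TensorRT": {"conv1", "relu1"},
    "Cuda": {"conv1", "relu1", "gemm1"},
    "CPU": {"conv1", "relu1", "nonzero1", "gemm1"},
}

placement = greedy_partition(nodes, capabilities, ["TensorRT", "Cuda", "CPU"])
for node in nodes:
    print(node, "->", placement[node])
```

With these made-up capabilities, conv1 and relu1 go to TensorRT, gemm1 falls through to Cuda, and nonzero1 ends up on CPU. Because the choice depends only on preference order and capability, a node can land on a provider that is not the fastest one for it, which is the limitation the comment points out.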

Closing this issue.

Specifying the execution provider via the Python API is tracked here: https://github.com/microsoft/onnxruntime/issues/486
"Smarter" execution provider selection (beyond the greedy algorithm) may be a future feature.
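
For readers arriving later: releases of onnxruntime newer than the 0.1.3 discussed in this thread accept a providers argument on InferenceSession and expose get_available_providers() and InferenceSession.get_providers(). The sketch below assumes such a release. The helper names pick_providers and make_session are hypothetical, and the model path is whatever you pass in.

```python
from typing import List


def pick_providers(preferred: List[str], available: List[str]) -> List[str]:
    """Keep the preferred providers that are available, preserving order."""
    return [name for name in preferred if name in available]


def make_session(model_path: str, preferred: List[str]):
    """Create a session restricted to the preferred providers.

    Assumes a newer onnxruntime release that supports the `providers`
    argument; the import is done lazily so that pick_providers can be
    used without the package installed.
    """
    import onnxruntime as ort

    providers = pick_providers(preferred, ort.get_available_providers())
    session = ort.InferenceSession(model_path, providers=providers)
    # get_providers() lists the providers registered on this session.
    return session, session.get_providers()


# Preference order mirrors the "TensorRT", "Cuda", "CPU" example above.
preferred = [
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]
print(pick_providers(preferred, ["CPUExecutionProvider"]))
```

Passing only "CPUExecutionProvider" as the preference is one way to force CPU execution even when the GPU package is installed.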