Incubator-mxnet: Multi-threaded execution leads to high CPU load

Created on 13 Mar 2019 · 3 Comments · Source: apache/incubator-mxnet

In my code I have five threads, and each thread creates its own Executor as follows:
std::unique_ptr<Executor> exec_;
exec_.reset(network_.SimpleBind(ctx, args_, grad_store, grad_req_type));
When I run the program, the CPU load is very high. Does each thread create a separate Executor engine, and does each Executor engine spawn many threads of its own?
If I want to create just one Executor engine and have all threads share it, how should I write that in my program?

C++ Performance

songziqin

Most helpful comment

@songziqin
This is based on my understanding of the C++ API.
It is possible to create an Executor in one thread and share it across many threads. However, the Executor runs the forward pass on only one input at a time. That is, for a given input, the following three operations must be performed atomically for the inference to be correct:

1. Setting the input for the Executor.
2. Running the forward pass: Executor->Forward().
3. Retrieving the output from the Executor.

With this approach your application can process multiple inputs, but the inference operations are serialized.

With some modifications, the inception_inference.cpp example can be used to process multiple images. A Predictor object can be created by a single thread and shared across multiple threads. The calls to PredictImage() need to be synchronized so that the Executor inside the Predictor object processes one image at a time.
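
The synchronization described above could look roughly like the sketch below. It does not use MXNet: FakeExecutor is a hypothetical stand-in for the real Executor (its "forward pass" just sums the input), and SharedPredictor, Predict() and CountMismatches() are illustrative names that do not come from inception_inference.cpp. The only point is that a std::mutex makes the three steps (set input, forward, read output) a single atomic unit when several threads share one object.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the MXNet Executor: it holds one input and one
// output, so two overlapping callers would overwrite each other's data.
class FakeExecutor {
 public:
  void SetInput(const std::vector<float>& in) { input_ = in; }
  void Forward() {
    float sum = 0.0f;
    for (float v : input_) sum += v;  // placeholder for the real forward pass
    output_ = sum;
  }
  float Output() const { return output_; }

 private:
  std::vector<float> input_;
  float output_ = 0.0f;
};

// Illustrative wrapper in the role of the example's Predictor class.
class SharedPredictor {
 public:
  float Predict(const std::vector<float>& input) {
    // The lock covers all three steps, so calls from different threads
    // are serialized and never see each other's input or output.
    std::lock_guard<std::mutex> lock(mu_);
    exec_.SetInput(input);  // 1. set the input
    exec_.Forward();        // 2. run the forward pass
    return exec_.Output();  // 3. retrieve the output
  }

 private:
  std::mutex mu_;
  FakeExecutor exec_;
};

// Starts num_threads workers that all share one SharedPredictor. Worker i
// sends an input of (i + 1) ones, so the correct answer is i + 1.
// Returns how many predictions came back wrong.
int CountMismatches(int num_threads, int calls_per_thread) {
  assert(num_threads > 0 && calls_per_thread > 0);
  SharedPredictor predictor;
  std::atomic<int> mismatches{0};
  std::vector<std::thread> workers;
  for (int i = 0; i < num_threads; ++i) {
    workers.emplace_back([&predictor, &mismatches, i, calls_per_thread] {
      const std::vector<float> input(static_cast<std::size_t>(i) + 1, 1.0f);
      const float expected = static_cast<float>(i + 1);
      for (int k = 0; k < calls_per_thread; ++k) {
        if (predictor.Predict(input) != expected) ++mismatches;
      }
    });
  }
  for (auto& t : workers) t.join();
  return mismatches.load();
}
```

Calling CountMismatches(5, 1000) mirrors the five-thread setup from the question. Because only one thread is inside Predict() at a time, this design trades throughput for a single shared executor, which is the serialization mentioned above.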

I hope this helps.

@mxnet-label-bot add [Pending Requester Info]

leleamol on 14 Mar 2019

All 3 comments

@mxnet-label-bot add [Performance, c++, question]

Adding labels for better visibility. You might try asking this question on https://discuss.mxnet.io/ instead. Questions there tend to get more attention, since GitHub is primarily used to track issues.