GTX 1080
mobilenet0.25
I ran test.py with a 640*640 image and found that the detection time is 120 ms.
But I found that the forward time is about 0.3 ms, which confuses me. Are there any tricks to speed up detection?
@zyg11 The slowness happens AFTER the forward pass, where we use the network's output to find the exact locations of the faces. The Cython parts of the code speed this step up a bit, but overall I think this is the part that needs the most optimization.
I got it. Thanks!
I can't download the mobilenet0.25 RetinaFace model. Could you please give a download link? Thanks!
You can find it under Third-party Models; it was made by yangfly!
No, the slowness happens exactly at the forward call, because the ndarray ops execute in asynchronous mode. If you add nd.waitall() just after forward, you will see that the inference time is about 3 s, not 3 ms as published.
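To illustrate the point about asynchronous execution, here is a minimal sketch of how an unsynchronized timer can under-report. It does not use MXNet: `fake_forward` and the thread pool are hypothetical stand-ins for an engine that queues work and returns immediately, and `timed` is a helper invented for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed(run, sync=None):
    """Time run(); if sync is given, wait for the result before stopping the clock."""
    start = time.perf_counter()
    result = run()
    if sync is not None:
        sync(result)  # block until the queued work has really finished
    return time.perf_counter() - start, result


# Stand-in for an asynchronous engine: submit() returns at once,
# while the actual work (a 50 ms sleep) runs on a worker thread.
executor = ThreadPoolExecutor(max_workers=1)


def fake_forward():
    return executor.submit(time.sleep, 0.05)


# Without synchronization we only measure how long it takes to enqueue the work.
unsynced_s, pending = timed(fake_forward)
pending.result()  # drain the queue so the next measurement starts clean

# With synchronization we measure the work itself.
synced_s, _ = timed(fake_forward, sync=lambda fut: fut.result())
executor.shutdown()

print(f"unsynced: {unsynced_s * 1000:.2f} ms, synced: {synced_s * 1000:.2f} ms")
```

With MXNet, the synchronization step would be the `nd.waitall()` call mentioned above, for example `sync=lambda _: mx.nd.waitall()`, assuming `mxnet` is imported as `mx`.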
It is the warmup time cost. You should not include the first inference time for the performance test.
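A minimal sketch of a benchmark loop that leaves the warmup call out of the measurement. The `benchmark` helper and the `SlowFirstCall` class are hypothetical and only simulate a function whose first call is expensive; in a real test, `fn` would wrap the detection call and any synchronization it needs.

```python
import statistics
import time


def benchmark(fn, warmup=1, runs=5):
    """Call fn() `warmup` times untimed, then return (mean, samples) over `runs` timed calls."""
    for _ in range(warmup):
        fn()  # pays the one-off setup cost; deliberately not measured
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), samples


class SlowFirstCall:
    """Simulated workload: the first call is slow, later calls are fast."""

    def __init__(self):
        self.calls = 0

    def __call__(self):
        self.calls += 1
        time.sleep(0.1 if self.calls == 1 else 0.001)


fn = SlowFirstCall()
mean_s, samples = benchmark(fn, warmup=1, runs=5)
print(f"mean over {len(samples)} runs: {mean_s * 1000:.2f} ms")
```

Reporting the individual samples alongside the mean makes it easier to spot a run that still carries setup cost.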
Hi @nttstar,
Is there any way to perform the warmup only once when detecting many images? It seems that every time I call the detect function, it warms up again.
Experiencing the same effect as @khanhnt.
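The thread does not confirm a cause for the repeated warmup. One possible explanation, stated here as an assumption, is that the setup cost is paid once per distinct input shape (MXNet's cuDNN autotuning, for instance, runs again for shapes it has not seen), so images of different sizes would each trigger it. Under that assumption, the sketch below builds the detector once, warms it up once, and feeds it a fixed input size. `FakeDetector` is a hypothetical stand-in, not the RetinaFace API, and images are represented only by their `(height, width)` shape.

```python
class FakeDetector:
    """Hypothetical detector whose setup cost is paid once per new input shape."""

    def __init__(self):
        self.warmups = 0
        self._seen_shapes = set()

    def detect(self, shape):
        if shape not in self._seen_shapes:
            # First time this shape is seen: simulate the expensive warmup.
            self._seen_shapes.add(shape)
            self.warmups += 1
        return []  # no real detection in this sketch


FIXED_SHAPE = (640, 640)
incoming = [(480, 640), (720, 1280), (640, 640)]

# Build once, outside the loop, and warm up once on a dummy input.
detector = FakeDetector()
detector.detect(FIXED_SHAPE)

for original_shape in incoming:
    # In real code the image would be resized or padded to FIXED_SHAPE here.
    detector.detect(FIXED_SHAPE)

# For comparison: feeding the original shapes warms up once per shape.
varied = FakeDetector()
for original_shape in incoming:
    varied.detect(original_shape)

print(f"fixed shape warmups: {detector.warmups}, varied shape warmups: {varied.warmups}")
```

If autotuning is indeed the cause, setting the environment variable `MXNET_CUDNN_AUTOTUNE_DEFAULT=0` before starting the process is another thing to try, at the possible cost of slower steady-state inference.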
Hi @khanhnt, I encountered the same problem. Have you solved it?
@zyg11 I faced the same issue. Did you solve it?
Same issue for me on GPU and CPU