This issue tracks the status of operators supported in luci-interpreter. I've listed the operators used in three popular models: MobileNetV2 (M), InceptionV3 (I), and ResNet50 (R) (ResNet is a top priority for post-training quantization, #696).
Most of the operators in the list are already supported in the draft of luci-interpreter (#205), but I found that Mean and Pad were not implemented yet. For those who want to contribute to luci-interpreter, it would be most helpful to support these two first.
| Operators | Status (float) | Status (u8) | Model |
| ------------- | ------------- | ------------- | ------------- |
| Add | O @s-barannikov | O @s-barannikov | R, M |
| AvgPool2D | O @s-barannikov | O @s-barannikov | M, I |
| Concat | O @s-barannikov | O @s-barannikov | I |
| Conv2D | O @s-barannikov | O @s-barannikov | R, M, I |
| DepthwiseConv2D | O @s-barannikov | O @s-barannikov | M |
| FullyConnected | O @s-barannikov | | I |
| MaxPool2D | O @s-barannikov | O @s-barannikov | R, I |
| Mean | @karthik-pen (#1669) | @karthik-pen (#1669) | R |
| Pad | O @s-barannikov (#1509) | O @s-barannikov (#1509) | R |
| Reshape | O @s-barannikov | O @s-barannikov | M |
| Softmax | O @s-barannikov | | M, I |
| ArgMax | @struss (#1691) | @struss (#1691) | R |
Update 2020/05/27: ArgMax operator was added to the table.
@jinevening Can I take up the implementation of Mean?
I will implement Pad.
@karthik-pen, @s-barannikov, could you write your GitHub ID in the appropriate cell of the table and mark it as in progress? :)
Execution time of each model in luci-interpreter
(measured in Ubuntu 18.04 x86 desktop equipped with i7-9700 3.0 GHz)
| Model | Debug mode | Release mode |
| -- | -- | -- |
| InceptionV3 | 114 sec | ~~5.2~~ 0.8 sec |
| MobileNetV2 | 6.9 sec | ~~0.4~~ 0.1 sec |
| ResNet50 | 24 sec | 0.7 sec |

~~(ResNet50 will be updated later)~~ ResNet50 result was added 2020/06/03
luci-interpreter will run ~1,000 data samples to profile the moving average of min/max values for post-training quantization (#696). For InceptionV3, the execution time for 1,000 samples is ~5200 sec (~1 hour 27 min). Even considering the time to record min/max values, I expect that the profiling would finish within several hours.
Update (2020/05/26): Thanks to @s-barannikov's work (#1438), the execution time for InceptionV3 was reduced from 5.2 s to 0.8 s. Now, the profiling may take just ~15 minutes 👍.
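
The estimates above are simple arithmetic: per-sample execution time multiplied by the number of samples. A minimal Python sketch of that calculation follows. The helper names `profiling_time_sec` and `format_duration` are made up for illustration and are not part of luci-interpreter, and the cost of recording min/max values is not modeled.

```python
def profiling_time_sec(per_sample_sec, num_samples=1000):
    """Total inference time for num_samples runs.

    Assumption: every sample takes the same time, and the overhead of
    recording min/max values is ignored.
    """
    return per_sample_sec * num_samples


def format_duration(seconds):
    """Render a duration in seconds as 'H h M min S s'."""
    total = round(seconds)
    hours, rem = divmod(total, 60 * 60)
    minutes, secs = divmod(rem, 60)
    return f"{hours} h {minutes} min {secs} s"


# Release-mode InceptionV3 timings quoted in this thread.
for label, per_sample in [("before #1438", 5.2), ("after #1438", 0.8)]:
    print(label, "->", format_duration(profiling_time_sec(per_sample)))
# before #1438 -> 1 h 26 min 40 s
# after #1438 -> 0 h 13 min 20 s
```

The second figure is the raw inference time only, so the "~15 minutes" estimate leaves some room for recording overhead.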
From what I've seen, most of the time is spent on Conv2D. At least that operation needs to use an optimized kernel (instead of the reference one), but that requires some additional steps (allocating an Im2Col tensor and possibly creating a CpuContext to enable parallelism).
@jinevening BTW, is it the time spent on the "interpret()" call only?

> BTW, is it the time spent on the "interpret()" call only?

@s-barannikov No, I measured the whole time spent running luci_eval_tester. This includes the time to read the input data from the file, run the interpreter, and write the output to the file.
```
$ time build/release/compiler/luci-value-test/tester/luci_eval_tester build/release/compiler/luci-value-test/inception_v3.circle 1 build/release/compiler/luci-value-test/inception_v3.circle.input build/release/compiler/luci-value-test/inception_v3.circle.output
real 0m0.773s
user 0m0.633s
sys 0m0.140s
```
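
Since `time` reports the whole process (reading input, running the interpreter, writing output), separating the interpreter's share needs per-phase timing. The Python sketch below shows one generic way to do that. It is an illustration only: `timed`, `run_once`, and the three callables are hypothetical stand-ins, not luci_eval_tester or luci-interpreter code.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def run_once(read_input, interpret, write_output):
    """Run the three phases in order and return per-phase timings in seconds.

    The callables are placeholders for: reading the input file, running
    the interpreter, and writing the output file.
    """
    timings = {}
    with timed("read_input", timings):
        data = read_input()
    with timed("interpret", timings):
        output = interpret(data)
    with timed("write_output", timings):
        write_output(output)
    timings["total"] = sum(timings.values())
    return timings


if __name__ == "__main__":
    # Dummy phases, just to show the shape of the result.
    written = []
    result = run_once(
        read_input=lambda: [1.0, 2.0, 3.0],
        interpret=lambda xs: [2 * x for x in xs],
        write_output=written.append,
    )
    for phase, seconds in result.items():
        print(f"{phase}: {seconds:.6f} s")
```

With timings split this way, the "interpret" entry would answer the question about the `interpret()` call alone, while "total" approximates what `time` reports, minus process startup.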
> real 0m0.773s

Looks better!
@jinevening did you try running the network in multithreaded mode?
@s-barannikov how can I enable parallel execution?

> @jinevening did you try running the network in multithreaded mode?

No. I'm not sure if the Conv2D kernel runs with multiple threads.

> @s-barannikov how can I enable parallel execution?

There is no way to control it yet. The quantized op is always parallelized, and the float kernel is single-threaded.
@karthik-pen Can you tell me the progress of Mean?

> @karthik-pen Can you tell me the progress of Mean?

I will post the draft in some time. Sorry for the delay. Here it is: #1665
Edit: Created new PR #1669 for this.
All operators in ResNet50, InceptionV3, MobileNetV2 are now supported 👏.
Look at the above table to see the execution time of each model.
Thanks! @s-barannikov @karthik-pen @struss
I'm closing this issue because all of the target operators are supported and no further issues have been raised.