There are some more quantized models available in the Caffe2 (C2) model zoo which we should be able to load in Glow:
I will take a look at these two models.
Several libraries support quantization. The quantized operators we currently support target server-side quantization with DNNLOWP, but these two models use Caffe2's Int8* operator library for mobile quantization. That is, their Int8 op set differs from what we support. For example, both models contain an "Int8Softmax" op, which is not used or supported in server-side quantization. (I haven't yet checked whether ops with the same name in the two libraries have the same functionality.) If I remember correctly, Glow targets the server side. So, do we still need to support the operator library of these 2 models? Thanks! @rdzhabarov
> If I remember it correctly, Glow is targeting server side. So, do we still need to support the operator library of these 2 models?
If it's mobile only I'd skip those. One thing worth doing is to check the list of ops and see which ones would make sense to have in our C2 loaders. Maybe look at some other real workloads with quantized server models.
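A rough sketch of that op check, in case it helps: count the operator types a model uses and flag the ones a loader does not handle. The op list and the "supported" set below are illustrative assumptions, not the real contents of either model or of Glow's loader, and `summarize_ops` is a hypothetical helper. With Caffe2 installed, the op types could presumably be read from a parsed `NetDef` as `[op.type for op in net.op]`.

```python
from collections import Counter


def summarize_ops(op_types, supported):
    """Count each operator type and collect the ones missing from `supported`."""
    counts = Counter(op_types)
    missing = {name: n for name, n in counts.items() if name not in supported}
    return counts, missing


# Assumption: a made-up op sequence standing in for a model's predict net.
model_ops = [
    "Int8Quantize",
    "Int8Conv",
    "Int8Relu",
    "Int8Conv",
    "Int8Softmax",
    "Int8Dequantize",
]

# Assumption: a made-up set of ops the loader already handles.
loader_supported = {"Int8Quantize", "Int8Conv", "Int8Relu", "Int8Dequantize"}

counts, missing = summarize_ops(model_ops, loader_supported)

# Report every op type with its usage count and whether it is covered.
for name, n in sorted(counts.items()):
    status = "supported" if name in loader_supported else "MISSING"
    print(f"{name}: used {n}x, {status}")
```

Running the same check over a few server-side models would show which missing ops come up often enough to be worth implementing.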
@rdzhabarov Thanks!
This page lists the quantization ops used in server models: https://github.com/pytorch/pytorch/tree/master/caffe2/quantization/server#quantization-operators
Then I will close this issue and create another one to track implementing these quantization ops.